02-14-2019 05:11 PM
I am trying to UNWIND a list of CSV files and import them with apoc.periodic.iterate, merging the data as I go. See my code below:
CALL apoc.periodic.iterate("
UNWIND [
'file:///AK14.csv',
'file:///AL14.csv',
'file:///AR14.csv',
'file:///AZ14.csv',
'file:///CA14.csv',
'file:///CO14.csv',
'file:///CT14.csv',
'file:///DC14.csv',
'file:///DE14.csv',
'file:///FL14.csv',
'file:///GA14.csv',
'file:///HI14.csv',
'file:///IA14.csv',
'file:///ID14.csv',
'file:///IL14.csv',
'file:///IN14.csv',
'file:///KS14.csv',
'file:///LA14.csv',
'file:///MA14.csv',
'file:///MD14.csv',
'file:///ME14.csv',
'file:///MI14.csv',
'file:///MN14.csv',
'file:///MO14.csv',
'file:///MS14.csv',
'file:///MT14.csv',
'file:///NC14.csv',
'file:///ND14.csv',
'file:///NE14.csv',
'file:///NH14.csv',
'file:///NJ14.csv',
'file:///NM14.csv',
'file:///NV14.csv',
'file:///NY14.csv',
'file:///OH14.csv',
'file:///OK14.csv',
'file:///OR14.csv',
'file:///PA14.csv',
'file:///PR14.csv',
'file:///RI14.csv',
'file:///SC14.csv',
'file:///SD14.csv',
'file:///TN14.csv',
'file:///TX14.csv',
'file:///UT14.csv',
'file:///VA14.csv',
'file:///VT14.csv',
'file:///WA14.csv',
'file:///WI14.csv',
'file:///WV14.csv',
'file:///WY14.csv'
] AS file
LOAD CSV WITH HEADERS FROM file AS row RETURN row",
"
MERGE (bridge:Bridge {id: row.STRUCTURE_NUMBER_008})
MERGE (place:Place {id: row.PLACE_CODE_004})
MERGE (county:County {id: row.COUNTY_CODE_003})
MERGE (state:State {id: row.STATE_CODE_001})
MERGE (owner:Owner {id: row.OWNER_022})
MERGE (maintResp:MaintenanceResp {id: row.MAINTENANCE_021})
MERGE (bridge)-[:OF_PLACE]->(place)
MERGE (place)-[:OF_COUNTY]->(county)
MERGE (county)-[:OF_STATE]->(state)
MERGE (bridge)-[:OWNED_BY]->(owner)
MERGE (bridge)-[:MAINTAINED_BY]->(maintResp)
ON CREATE SET bridge.name = row.STRUCTURE_NUMBER_008,
bridge.latitude = row.LAT_016,
bridge.longitude = row.LONG_017,
bridge.yearbuilt = row.YEAR_BUILT_027,
place.name = row.PLACE_CODE_004,
county.name = row.COUNTY_CODE_003,
state.name = row.STATE_CODE_001,
owner.name = row.OWNER_022,
maintResp.name = row.MAINTENANCE_021
",
{batchSize:1000,iterateList:true})
YIELD batches, total
RETURN batches, total
This is a process I will have to repeat for many more files like this. I tried letting this run last night and it crashed. I have adjusted dbms.memory.heap.initial_size=1G and dbms.memory.heap.max_size=2G.
My thought is that I should pull UNWIND outside of the apoc.periodic.iterate, but I have gotten stuck there so far.
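For readers wondering what pulling the UNWIND outside might look like, here is a minimal sketch, not a tested fix. It assumes the installed APOC version supports the params entry in the apoc.periodic.iterate config map, and it shortens both the file list and the inner statement for illustration. It also attaches ON CREATE SET to each node MERGE, because in Cypher an ON CREATE SET clause applies only to the MERGE directly before it. Nothing in this thread shows that this restructuring by itself avoids the crash.

```cypher
// Sketch only: one apoc.periodic.iterate call per file, driven by an outer UNWIND.
// The file list and the inner statement are shortened for illustration.
UNWIND ['file:///AK14.csv', 'file:///AL14.csv', 'file:///AR14.csv'] AS file
CALL apoc.periodic.iterate(
  // $file is supplied through the params entry of the config map below (assumed supported)
  'LOAD CSV WITH HEADERS FROM $file AS row RETURN row',
  'MERGE (bridge:Bridge {id: row.STRUCTURE_NUMBER_008})
     ON CREATE SET bridge.name = row.STRUCTURE_NUMBER_008
   MERGE (state:State {id: row.STATE_CODE_001})
     ON CREATE SET state.name = row.STATE_CODE_001',
  {batchSize: 1000, iterateList: true, params: {file: file}})
YIELD batches, total
RETURN file, batches, total
```

Returning file alongside batches and total gives one result row per CSV, which makes it easier to see which file a failure belongs to.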
02-14-2019 06:41 PM
Ok. I forgot to create indices on my nodes (this is the largest data set I have been working with):
CREATE INDEX ON :Bridge(id);
CREATE INDEX ON :Place(id);
CREATE INDEX ON :County(id);
CREATE INDEX ON :State(id);
CREATE INDEX ON :Owner(id);
CREATE INDEX ON :MaintenanceResp(id);
I was then able to run my import without issue.
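Some background on why this helps: without an index on the merged property, every MERGE on a label and id has to scan all existing nodes with that label, so the cost per row grows as the import proceeds. The CREATE INDEX ON :Label(prop) form above is the Neo4j 3.x syntax; it was deprecated in 4.0 and removed in 5.0. A sketch of the newer form follows, with index names that are arbitrary choices and a placeholder id value in the check at the end.

```cypher
// Neo4j 4.x and later: same indexes as above; the index names are arbitrary.
CREATE INDEX bridge_id FOR (b:Bridge) ON (b.id);
CREATE INDEX place_id FOR (p:Place) ON (p.id);
CREATE INDEX county_id FOR (c:County) ON (c.id);
CREATE INDEX state_id FOR (s:State) ON (s.id);
CREATE INDEX owner_id FOR (o:Owner) ON (o.id);
CREATE INDEX maintenance_resp_id FOR (m:MaintenanceResp) ON (m.id);

// Optional check ('0001' is a placeholder id): the profiled plan should show
// a NodeIndexSeek operator rather than NodeByLabelScan once the index is online.
PROFILE MATCH (b:Bridge {id: '0001'}) RETURN b;
```

Run the index statements before starting the import so that the index is already populated when the first MERGE executes.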