06-01-2020 01:03 AM
Hey there guys. I'm trying to import data from a CSV file with about 300K rows. The query I am using is as follows:
CALL apoc.periodic.iterate(
  'CALL apoc.load.csv("file:///newfile.csv") YIELD map AS row RETURN row',
  'MERGE (s:Sender {from_send: row.From})
   MERGE (r:Receiver {to_send: row.To})
   MERGE (s)-[e:EMAILED {date: row.Date, time: row.Time, subject: row.Subject, message_id: row.MessageID}]->(r)',
  {batchSize: 20000, iterateList: true, parallel: true}
)
But this query seems to be taking forever. It looks like it's stuck in an infinite loop: it kept running for 15 minutes without finishing.
Note that there are blank values in the "Subject" column of the CSV as well. Can anyone please tell me what the problem is? Thanks
06-01-2020 01:36 AM
Try with parallel:false and ensure you have an index or unique constraint on Sender and Receiver.
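As a rough sketch of that advice, the constraints could be created before running the import. This assumes the property names from the question (from_send and to_send) and the constraint syntax of the Neo4j 3.x/4.x releases current when this thread was written; newer releases use a different form, noted in the comment.

```cypher
// Run once before the import. A unique constraint also creates a backing index,
// so each MERGE can look up the node by property instead of scanning the label.
// Assumption: from_send and to_send are the lookup keys, as in the question.
CREATE CONSTRAINT ON (s:Sender) ASSERT s.from_send IS UNIQUE;
CREATE CONSTRAINT ON (r:Receiver) ASSERT r.to_send IS UNIQUE;

// On Neo4j 5 and later the equivalent form is:
// CREATE CONSTRAINT FOR (s:Sender) REQUIRE s.from_send IS UNIQUE;
// CREATE CONSTRAINT FOR (r:Receiver) REQUIRE r.to_send IS UNIQUE;
```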
06-01-2020 04:25 AM
Should there be unique constraints on Sender/Receiver, given that a single sender can send emails to multiple recipients? Please correct me if I'm wrong or have misunderstood your point.
Thank you
06-01-2020 05:43 AM
Any node being used in a MERGE should be backed by an index or unique constraint. Otherwise the MERGE will get massively slower as the data grows.
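To make the point concrete: a unique constraint applies to the node's property value, not to its relationships, so one Sender node can still have EMAILED relationships to any number of Receiver nodes. A hedged sketch of the import with parallel:false might look like the following. Merging the relationship on message_id alone and setting the other properties afterwards is an assumption on my part (it presumes message_id identifies an email), as is the coalesce guard for blank subjects; neither comes from the replies above.

```cypher
// Sketch only: same file and labels as the question, but run sequentially
// so that batches do not compete for locks on the same Sender/Receiver nodes.
CALL apoc.periodic.iterate(
  'CALL apoc.load.csv("file:///newfile.csv") YIELD map AS row RETURN row',
  'MERGE (s:Sender {from_send: row.From})
   MERGE (r:Receiver {to_send: row.To})
   // Assumption: message_id is enough to identify one email between s and r.
   MERGE (s)-[e:EMAILED {message_id: row.MessageID}]->(r)
   // Assumption: a missing subject is stored as an empty string.
   SET e.date = row.Date, e.time = row.Time, e.subject = coalesce(row.Subject, "")',
  {batchSize: 20000, iterateList: true, parallel: false}
)
```

Each Sender and Receiver MERGE here is a lookup on a constrained property, which is the case the reply above describes as staying fast as the data grows.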