Neo4j

ahmedfazal405 · 06-01-2020

Hey there guys. I'm trying to import data from a CSV file with about 300K rows. The query I am using is as follows:

CALL apoc.periodic.iterate(
'CALL apoc.load.csv("file:///newfile.csv") YIELD map AS row RETURN row',
'MERGE (s:Sender {from_send: row.From})
MERGE (r:Receiver {to_send: row.To})
MERGE (s)-[e:EMAILED {date: row.Date, time: row.Time, subject: row.Subject, message_id: row.MessageID}]->(r)',
{batchSize: 20000, iterateList: true, parallel: true}
)

But this query seems to take forever. It looks like it is running in an infinite loop: it kept running for 15 minutes.

Note that there are also blank values in the "Subject" column of the CSV. Can anyone please tell me what the problem is? Thanks.

stefan_armbrust · 06-01-2020

Try with parallel:false and ensure you have an index or unique constraint on the properties you MERGE on, i.e. Sender.from_send and Receiver.to_send.

ahmedfazal405 · 06-01-2020

Should there be unique constraints on Sender/Receiver, given that a single sender can send emails to multiple recipients? Please correct me if I'm wrong or have misunderstood your point.
Thank you.

stefan_armbrust · 06-01-2020

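Any node being used in a MERGE should be backed by an index or unique constraint on the property it is matched on. Without one, each MERGE has to scan every node with that label to check whether a match already exists, so the MERGE gets massively slower as the data grows.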
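
As a rough sketch of what that could look like for the query above: a unique constraint on Sender.from_send only guarantees one Sender node per address, and that node can still have any number of EMAILED relationships to different Receiver nodes. The constraint names below are made up for illustration, and the syntax assumes Neo4j 4.4 or later. Merging the relationship on message_id alone and setting the other properties afterwards is also an assumption (it presumes MessageID identifies an email uniquely); it keeps the possibly blank Subject out of the MERGE pattern.

```cypher
// Assumes Neo4j 4.4+. On older versions the equivalent is:
//   CREATE CONSTRAINT ON (s:Sender) ASSERT s.from_send IS UNIQUE
// Constraint names are hypothetical.
CREATE CONSTRAINT sender_from_send IF NOT EXISTS
FOR (s:Sender) REQUIRE s.from_send IS UNIQUE;

CREATE CONSTRAINT receiver_to_send IF NOT EXISTS
FOR (r:Receiver) REQUIRE r.to_send IS UNIQUE;

// Same import as in the question, but with parallel:false so that
// concurrent batches do not compete for locks on the same nodes.
CALL apoc.periodic.iterate(
  'CALL apoc.load.csv("file:///newfile.csv") YIELD map AS row RETURN row',
  'MERGE (s:Sender {from_send: row.From})
   MERGE (r:Receiver {to_send: row.To})
   // Assumption: MessageID uniquely identifies an email
   MERGE (s)-[e:EMAILED {message_id: row.MessageID}]->(r)
   SET e.date = row.Date, e.time = row.Time, e.subject = row.Subject',
  {batchSize: 20000, parallel: false}
);
```

Create the constraints first and let them come online before starting the import, otherwise the MERGE lookups will not benefit from them.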

Bulk Import Taking too long