06-18-2020 01:19 AM
Hey guys. I am using the following code to load a CSV file and parse the date and time columns.
CALL apoc.periodic.iterate(
'CALL apoc.load.csv("file:///newfile.csv") yield map as row'
,
'MERGE (s:Sender{from_send:row.From})
MERGE (r:Receiver{to_send:row.To})
MERGE (s)-[e:EMAILED {
date_d:datetime({epochMillis:apoc.date.parse(row.Date,'ms','dd/MM/yyyy')}),
time_d:time(datetime({epochMillis:apoc.date.parse(row.Time,'ms','hh:mm:ss')})),
subject:row.Subject, message_id : row.MessageID}]->(r)'
,
{batchSize:10000, iterateList:True, parallel:false}
)
but I'm getting a syntax error and I can't figure out what I'm doing wrong. Can anyone please help out? Thanks
NOTE: the "Date" column is in the format "08/06/2020",
while the "Time" column is in the format "23:59:52".
06-19-2020 12:13 PM
Initial thought when looking at the script you are running: you need to use different quotes for the strings inside the script than the quotes around the entire script. I would suggest putting double quotes around each Cypher statement and single quotes around the strings inside them (I like using double quotes around my whole script; you could do it the other way if you prefer). This changes the script to look like this...
CALL apoc.periodic.iterate(
"CALL apoc.load.csv('file:///newfile.csv') yield map as row",
"MERGE (s:Sender{from_send:row.From})
MERGE (r:Receiver{to_send:row.To})
MERGE (s)-[e:EMAILED {
date_d:datetime({epochMillis:apoc.date.parse(row.Date,'ms','dd/MM/yyyy')}),
time_d:time(datetime({epochMillis:apoc.date.parse(row.Time,'ms','HH:mm:ss')})),
subject:row.Subject, message_id : row.MessageID}]->(r)",
{batchSize:10000, iterateList:true, parallel:false}
)
Let me know if this does not fix the issue.
06-23-2020 03:36 AM
Thank you for the answer. The query now runs and throws no errors, but the execution is taking too long: it still hadn't completed after 4+ hours. The previous file (300k rows) completed in about an hour; this file, however, contains 700k rows. Can you also advise on what else I should do? I have already increased the heap size and reduced the batch size further.
06-23-2020 08:48 AM
Hello @ahmedfazal405
I think your query is slow because of the date and time conversion. Did you try loading without the parsing?
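For comparison, here is a minimal sketch of the same load that just stores the raw strings (assuming the same file and column headers as your corrected query), so you can time it against the parsing version:
CALL apoc.periodic.iterate(
  "CALL apoc.load.csv('file:///newfile.csv') YIELD map AS row",
  "MERGE (s:Sender {from_send: row.From})
   MERGE (r:Receiver {to_send: row.To})
   // Store Date and Time as plain strings for now; no apoc.date.parse calls.
   MERGE (s)-[e:EMAILED {date_d: row.Date, time_d: row.Time,
                         subject: row.Subject, message_id: row.MessageID}]->(r)",
  {batchSize: 10000, iterateList: true, parallel: false}
)
If this version finishes quickly, the parsing is the bottleneck.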
Regards,
Cobra
06-23-2020 10:53 AM
@Cobra I was wondering about the parsing of the date and time. Do you have another way to get this data from the CSV into the database as a proper datetime without the parsing?
06-23-2020 11:54 AM
Hey @Cobra
I did load the previous CSV file (300k rows) without parsing the date and time, but I need queries that can handle the date and time columns in their appropriate temporal types, not as strings. Is there another way around this if parsing is taking too much time? Can these columns be parsed later on?
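For reference, the columns can be converted after the load. A minimal sketch, assuming the raw strings were stored on the EMAILED relationships as date_d ("dd/MM/yyyy") and time_d ("HH:mm:ss"), uses a second apoc.periodic.iterate pass to convert them in place:
CALL apoc.periodic.iterate(
  "MATCH (:Sender)-[e:EMAILED]->(:Receiver)
   // Only pick up relationships whose date property is still a string.
   WHERE apoc.meta.cypher.type(e.date_d) = 'STRING'
   RETURN e",
  "SET e.date_d = date(datetime({epochMillis: apoc.date.parse(e.date_d, 'ms', 'dd/MM/yyyy')})),
       // '23:59:52' is already a valid ISO local time, so no apoc parsing is needed here.
       e.time_d = localtime(e.time_d)",
  {batchSize: 10000, parallel: false}
)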
06-23-2020 12:22 PM
If possible, please send me the actual value of the date as it appears in your .csv file.
06-23-2020 01:11 PM
To be honest, I avoid formatting my data in Cypher; I always format the data in Python and load it afterwards.
You can do this in a few seconds with a few lines of Python.
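A minimal sketch of that preprocessing with pandas (assuming the file and column names from earlier in the thread):
import pandas as pd

df = pd.read_csv('newfile.csv')
# Rewrite the dd/MM/yyyy dates as ISO yyyy-MM-dd so Cypher's date() can read them directly.
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y').dt.strftime('%Y-%m-%d')
# "23:59:52" is already a valid ISO local time, so the Time column can stay as-is.
df.to_csv('newfile_clean.csv', index=False)
After that, the load can use date(row.Date) and localtime(row.Time) with no apoc.date.parse calls.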