Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

How to avoid error while loading a large CSV

Neo4j Server Community Edition 4.4.6, running on Ubuntu 20.04

Well, I've just solved a big problem: loading a very large number of transactions using LOAD CSV.

I dumped a table from Postgres, a table that works fine there, but when I go to load it into Neo4j it returns some errors. Often the cause is only an unpaired ' or " or a stray \n.

As there are millions of rows, it's really difficult to find where the errors are hidden, especially because Neo4j reports the offending position and not the offending row.
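One way to hunt for the unpaired-quote problem described above, before loading anything, is to scan the dump for lines with an odd number of double quotes. A minimal sketch (the sample file and its contents are made up for illustration): with `"` as the awk field separator, a line with an odd quote count yields an even field count.

```shell
# Demo data standing in for the Postgres dump (hypothetical contents).
printf '%s\n' 'id,note' '1,"fine value"' '2,"broken value' > sample.csv

# Print the line number and content of every row whose double-quote
# count is odd, i.e. a likely unpaired " that will break LOAD CSV.
awk -F'"' 'NF % 2 == 0 {print NR ": " $0}' sample.csv
```

A similar pass with `-F"'"` would flag unpaired single quotes.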

I created a shell script to split the input file into chunks of 10,000 rows and then execute a CALL {} IN TRANSACTIONS OF 5000 ROWS for each chunk; this way, if I hit an error, I lose only 10,000 rows and not all the millions.
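The chunking approach above can be sketched roughly as follows. File names, the node label, and the cypher-shell credentials are all assumptions, and the demo input is generated in place so the script stands alone; the real load command is left as a comment to be filled in.

```shell
#!/bin/sh
# Sketch: split a large CSV into 10,000-row chunks, keeping the header
# on every chunk, so a bad row invalidates only its own chunk.
set -eu

INPUT=transactions.csv   # assumed name of the Postgres dump
CHUNK_ROWS=10000
CHUNK_DIR=chunks

# Demo input: 1 header line + 25,000 data rows (placeholder values).
{ echo "id,amount"; seq 25000 | awk '{print $1",1.00"}'; } > "$INPUT"

mkdir -p "$CHUNK_DIR"
head -n 1 "$INPUT" > "$CHUNK_DIR/header.csv"
tail -n +2 "$INPUT" | split -l "$CHUNK_ROWS" - "$CHUNK_DIR/part_"

for f in "$CHUNK_DIR"/part_*; do
  # Prepend the header so every chunk loads with the same Cypher.
  cat "$CHUNK_DIR/header.csv" "$f" > "$f.csv"
  rm "$f"
  # Per chunk, you would then run something like (placeholders):
  #   cypher-shell -u neo4j -p <password> \
  #     "LOAD CSV WITH HEADERS FROM 'file:///$f.csv' AS row
  #      CALL { WITH row CREATE (:Tx {id: row.id}) }
  #      IN TRANSACTIONS OF 5000 ROWS" || echo "$f" >> failed_chunks.log
done
```

Logging failed chunk names instead of aborting gives a crude version of the error.log behaviour wished for below: only the chunks listed in failed_chunks.log need manual inspection.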

I think it would be nice if LOAD CSV were able to reject the offending lines into an error.log without stopping its run.

Anyone have an idea?

Thank you

2 REPLIES

Please try the apoc.load.csv procedure available for Neo4j. To use it, you first need to install the APOC plugin.

CALL apoc.load.csv("MY_CSV_URL", {failOnError:false})
YIELD list, map
WITH list, map
CALL apoc.do.when(list = [], "RETURN 'nothingToList' AS list, 'nothingToMap' AS map", "RETURN list, map", {list: list, map: map})
YIELD value
RETURN value["list"], value["map"]

For more info, refer to the APOC documentation for apoc.load.csv.