01-15-2019 04:26 AM
I am trying to import 26 million rows spread across around 300 CSV files, using bash to run the bulk import.
I hit a limit on the number of CSVs I can reference before I'm told that the command is too long.
Some of those CSVs also have null values in columns. With the normal LOAD CSV I know how to deal with those, but with the bulk import (neo4j-admin.bat) I do not.
Any help would be appreciated.
01-15-2019 06:07 AM
You can use regular expressions for the file names, e.g. --nodes:Person file-[0-9]+.csv.gz
Note that in a regexp you need to use .* instead of * to match "any characters".
Null values are skipped during import if the --ignore-empty-strings setting is set to true:
--ignore-empty-strings <true/false>
Whether or not empty string fields, i.e. "" from input source are ignored, i.e.
treated as null. Default value: false
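To make the point about `.*` versus `*` concrete, here is a minimal sketch for previewing which file names a regular expression would pick up before starting a long import. It uses `grep -E` as a stand-in matcher, which is an assumption: the import tool's own regex engine may differ in details. The function name `preview_matches` and the default pattern are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical default pattern, modelled on the example in the post above.
default_pattern='file-[0-9]+\.csv\.gz'

# Print the file names in a directory that the regular expression matches in full.
# grep -E is only a stand-in for the import tool's matcher (assumption).
preview_matches() {
  local dir="$1" pattern="$2"
  # -x requires the whole name to match; "|| true" avoids aborting when nothing matches.
  ls -1 "$dir" | grep -E -x -- "$pattern" || true
}

preview_matches "${1:-.}" "${2:-$default_pattern}"
```

Run against a directory of exports, a pattern such as `file-*\.csv\.gz` would print nothing for names like `file-1.csv.gz`, because `-*` means "zero or more hyphens" in a regexp rather than "anything".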
01-15-2019 06:08 AM
You are a kind man for answering one of my questions again, Michael 🙂 Thanks, I'll give it a shot right now. (That saves me from having to adjust the data export out of Oracle, which I thought I'd have to labour through.)
01-15-2019 06:10 AM
It's best to try it out with a small subset first and only run the big one after all the kinks have been sorted out.
Saves a lot of waiting time 🙂
Good luck
01-15-2019 06:12 AM
What if I have different files per node?
i.e. --nodes:c_contracts:c_payments ......
and each node is named after a CSV file?
01-15-2019 06:13 AM
Then you use multiple --nodes parameters.
See the documentation:
https://neo4j.com/docs/operations-manual/current/tutorial/import-tool/
01-15-2019 06:14 AM
Hmm, the problem is that I reach the character limit in bash when I have 111 tables to import from Oracle (using the multiple lines).
01-15-2019 06:17 AM
I don't know of such a limit in bash. Did you use the regexps for the files?
You can also put all the command line options into a file:
--f <file name>
File containing all arguments, used as an alternative to supplying all arguments
on the command line directly. Each argument can be on a separate line or multiple
arguments per line separated by space. Arguments containing spaces needs to be
quoted. Supplying other arguments in addition to this file argument is not
supported.
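For the case discussed here (one CSV per table, each label named after its file), the arguments file could be generated rather than written by hand. The following is a minimal sketch, not a tested import setup: the directory `./export`, the file name `import-args.txt`, and the function name `write_args_file` are assumptions, and the `--ignore-empty-strings true` line is taken from the earlier reply, which applies to neo4j-import.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write one argument per line into an arguments file for use with --f.
# $1: directory holding the CSV exports (hypothetical default: ./export)
# $2: arguments file to create (hypothetical default: ./import-args.txt)
write_args_file() {
  local csv_dir="$1" args_file="$2" csv label

  # Start with the option quoted earlier in this thread.
  printf '%s\n' '--ignore-empty-strings true' > "$args_file"

  # One --nodes line per CSV; the label is the file name without ".csv" (assumption).
  for csv in "$csv_dir"/*.csv; do
    [ -e "$csv" ] || continue   # skip the unexpanded glob when no CSV exists
    label="$(basename "$csv" .csv)"
    # Paths are quoted because arguments containing spaces must be quoted.
    printf -- '--nodes:%s "%s"\n' "$label" "$csv" >> "$args_file"
  done
}

write_args_file "${1:-./export}" "${2:-./import-args.txt}"
```

Because the help text says other arguments cannot be combined with the file argument, any further options the import needs would also have to be added to this file before running the tool with `--f import-args.txt`.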
01-15-2019 06:34 AM
Using the option --ignore-empty-strings=true, I get the message: unrecognized option: 'ignore-empty-strings'.
It is also not mentioned in the documentation (which I've looked at again, since I had not seen it before writing the original post).
01-15-2019 08:20 AM
Oh sorry, I'm always using neo4j-import, not neo4j-admin import.
01-15-2019 09:46 AM
Ah I see.
That's a deprecated feature and therefore no longer documented by Neo4j (though it appears in some blogs).
It works on a test example for me, so yes, that fixes my null values in the bulk import (until neo4j-import is removed), and passing the arguments in a file fixes my other issue.
Awesome stuff.