

Optimizing a one-time bulk data load on a single-node or multi-node instance

We are trying to do a one-time bulk data load as part of our go-live.

We are using Databricks to transform data from source DB to Neo4j format with a batch size of 20k.

Initially we tried a cluster with 3 core nodes and 3 read replicas.

The load failed after a certain point, so we moved to a single core node instance, but it still took ~9 hours to load about 240 million relationships. Each JVM has 31 GB and runs on a 16-core box.
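To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it uses the approximate figures above and assumes the single-node run used the same 20k batch size, which may not be exactly true. The function and variable names are made up for illustration.

```python
# Approximate figures quoted in the post.
TOTAL_RELATIONSHIPS = 240_000_000
ELAPSED_HOURS = 9          # "~9 hours", so treat results as estimates
BATCH_SIZE = 20_000        # assumption: same batch size on the single-node run


def load_stats(total: int, hours: float, batch_size: int) -> dict:
    """Derive throughput and per-batch timing from the overall load figures."""
    seconds = hours * 3600
    batches = -(-total // batch_size)  # ceiling division
    return {
        "rels_per_second": total / seconds,
        "batches": batches,
        "seconds_per_batch": seconds / batches,
    }


stats = load_stats(TOTAL_RELATIONSHIPS, ELAPSED_HOURS, BATCH_SIZE)
print(f"{stats['rels_per_second']:.0f} relationships/s")
print(f"{stats['batches']} batches")
print(f"{stats['seconds_per_batch']:.1f} s per batch")
```

Under those assumptions this works out to roughly 7,400 relationships per second, or about 2.7 seconds for each of 12,000 batches, if the batches were written one after another.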

Question: What are some recommendations to speed up the data load? Thanks in advance.
