09-13-2019 02:54 AM
I filled a database with nodes (CREATE) from a Python script using neo4j.GraphDatabase.
Every 1000 entries I committed the transaction, closed the session, and began a new one.
This worked fine for a continuous set of ~30,000 nodes.
Then I tried it on a much larger dataset.
After ~1.5 million nodes the Python script stopped making progress. The task manager showed Java running at 100% CPU.
As far as I understand, this should not happen(?)
My code looks roughly like this:
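from neo4j import GraphDatabase

driver = GraphDatabase.driver(uri, auth=(user, password))
for batch in range(3000):  # repeated 3000 times
    session = driver.session()  # a new session for every batch
    transaction = session.begin_transaction()
    for i in range(1000):  # 1000 CREATE statements per transaction
        transaction.run("CREATE ( ... )")
    transaction.commit()
    session.close()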
Is this a memory issue within Neo4j?
09-13-2019 03:45 AM
What is your heap configuration?
You should not create a new session 3000 times.
Just create a single session and use 3000 transactions.
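The single-session pattern could look roughly like the sketch below. It assumes `driver` is the object returned by `GraphDatabase.driver(...)` and that `batches` is an iterable of lists of property dicts; the function name `insert_in_transactions`, the label `Something`, and the parameter name `props` are placeholders for illustration, not part of the original script.

```python
def insert_in_transactions(driver, batches):
    """Run each batch in its own transaction, all on one shared session.

    `driver` is assumed to be a neo4j driver object; `batches` is an
    iterable of lists of property dicts (hypothetical input shape).
    Returns the number of committed transactions.
    """
    committed = 0
    session = driver.session()  # opened once, not once per batch
    try:
        for batch in batches:
            tx = session.begin_transaction()
            for props in batch:
                # Placeholder label and parameter name; adapt to your model.
                tx.run("CREATE (n:Something) SET n += $props", props=props)
            tx.commit()
            committed += 1
    finally:
        session.close()  # closed once, after the last transaction
    return committed
```

The only structural change from the script in the question is that `driver.session()` and `session.close()` move outside the loop, so 3000 transactions share one session.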
09-13-2019 04:00 AM
Batching also helps, if you are able. Doing an UNWIND of a batch of data (such as 10k or so at a time) and processing the entire batch per transaction rather than a single create per transaction will be more efficient.
09-14-2019 02:43 AM
Agreed with Dave on the config; please share what you have configured for your server.
It would be good if you used parameters, e.g. a list of dicts for your data,
and then use
UNWIND $params AS row
CREATE (n:Something) SET n += row
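From Python, the UNWIND approach with a list of dicts might be driven as in the following sketch. It assumes `session` comes from `driver.session()` of the official neo4j driver; the helper names `chunked` and `bulk_create`, the label `Something`, the parameter name `rows`, and the default batch size of 10,000 are illustrative choices rather than anything prescribed by the driver.

```python
from itertools import islice

# Placeholder label and parameter name; adapt to your own data model.
UNWIND_CREATE = "UNWIND $rows AS row CREATE (n:Something) SET n += row"


def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_create(session, rows, batch_size=10_000):
    """Create one node per dict in `rows`, one transaction per batch.

    Each batch is sent as a single parameterized UNWIND statement
    instead of one CREATE statement per node.
    Returns the number of rows sent.
    """
    total = 0
    for chunk in chunked(rows, batch_size):
        tx = session.begin_transaction()
        tx.run(UNWIND_CREATE, rows=chunk)
        tx.commit()
        total += len(chunk)
    return total
```

A call could then look like `bulk_create(session, [{"name": "a"}, {"name": "b"}])`, with the session opened once and reused for all batches.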
09-17-2019 04:29 AM
Thank you for your quick feedback.
I did not set any parameters. I used a fresh installation, created a database and started inserting.
I will try using only a single session and see if things go better.
I will also try using UNWIND. Thank you for the hint and the link!
09-29-2019 01:43 AM
(1) Creating only one session with several transactions let me insert all data into the database.
(2) UNWIND: This sped up the insertion process by a factor of 5. Very nice hint.