
Correct way to ingest millions of results?

Rogie

I'm having trouble debugging what's going on in my workflow:

from neo4j import GraphDatabase

uri = "bolt://localhost:7687"

def get_results():
    q = " ... my query ..."
    driver = GraphDatabase.driver(uri, auth=("neo4j", "pass"))
    with driver.session() as session:
        with session.begin_transaction() as tx:
            result = tx.run(q)  # records stream lazily as the loop consumes them
            for record in result:
                process_res(record)
            tx.success = True  # mark the transaction for commit (1.x driver API)

The for loop seems to hang at random after processing a few hundred thousand results. My process_res() function is simple enough that I don't think it's the cause.

Is this the correct way to ingest millions of results, or is there a better way?


You should take care regarding transaction sizes. Typically 10k-100k atomic operations (like creating a node or setting a property) make a good transaction size. If you're way above that, you might exhaust transaction state memory.

Either use client-side transaction batching, or take a look at apoc.periodic.iterate, which does the batching on the Neo4j server itself.
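For example, the apoc.periodic.iterate route can be driven from Python roughly like this. This is a sketch, assuming the APOC plugin is installed on the server; the MATCH/SET statements, the Record label, and the batch size are illustrative placeholders for whatever your processing actually does:

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "pass"))

# The outer statement streams the candidates; the inner statement runs
# against each one, committed in server-side batches of 10k.
batch_query = """
CALL apoc.periodic.iterate(
  'MATCH (n:Record) RETURN n',
  'SET n.processed = true',
  {batchSize: 10000, parallel: false})
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
"""

with driver.session() as session:
    row = session.run(batch_query).single()
    print(row["batches"], row["total"], row["errorMessages"])

This keeps the read-process-write loop entirely inside the server, so the client never holds millions of records or one giant transaction.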

Hello @Rogie

I wrote a little example that loads data into your database in batches; you can adapt it to your use case.
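A minimal sketch of that kind of batched loader, assuming the rows arrive as a list of dicts and that an UNWIND-based CREATE fits your data; the Record label, property handling, and batch size are illustrative assumptions:

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "pass"))

def create_batch(tx, batch):
    # UNWIND turns one parameterized statement into one node per row
    tx.run("UNWIND $rows AS row CREATE (n:Record) SET n = row", rows=batch)

def load_rows(rows, batch_size=10000):
    with driver.session() as session:
        for i in range(0, len(rows), batch_size):
            # one managed transaction per batch keeps transaction
            # state memory bounded
            session.write_transaction(create_batch, rows[i:i + batch_size])

load_rows([{"name": "a"}, {"name": "b"}])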

Regards,
Cobra