01-06-2021 04:07 AM
I'm getting the following out of memory error when running neo4j queries in Python. I'm using neo4j 4.1.0 desktop.
neo4j.exceptions.ClientError: {code: Neo.ClientError.Procedure.ProcedureCallFailed} {message: Failed to invoke procedure `gds.alpha.shortestPath.deltaStepping.stream`: Caused by: java.lang.OutOfMemoryError: Java heap space}
I've followed the instructions in the Operations Manual ("The neo4j.conf file") to change the available memory, and assigned 12GB to the relevant parameters in the conf file:
dbms.memory.heap.initial_size=12g
dbms.memory.heap.max_size=12g
dbms.memory.pagecache.size=12g
My database has 63,000 nodes and 57,000 relationships
My python code looks like this and is called in a loop, with the id value changing each time:
# person_id is sent as a query parameter instead of being formatted into the string.
# The `parameters` map forwards it into the nodeQuery/relationshipQuery projection.
query = """
MATCH (start:Person {id: $person_id})
CALL gds.alpha.shortestPath.deltaStepping.stream({
  nodeQuery: 'MATCH (n:Person) RETURN id(n) AS id',
  relationshipQuery: 'MATCH (p1:Person {id: $person_id})-[p1Knows:KNOWS]->(p1s)-[r:IS_MEMBER_OF*..10]-(p2s)<-[p2Knows:KNOWS]-(p2:Person) WHERE p1.id <> p2.id AND p1Knows.self_rating <> 0 AND p1Knows.self_rating < p2Knows.self_rating WITH p1, p2, reduce(cost = 0, x IN r | cost + coalesce(x.distance, 0)) AS cost RETURN id(p1) AS source, id(p2) AS target, cost AS weight',
  parameters: {person_id: $person_id},
  startNode: start,
  relationshipWeightProperty: 'weight',
  delta: 3.0,
  writeProperty: 'sssp'
})
YIELD nodeId, distance
WHERE gds.util.isFinite(distance)
WITH nodeId, gds.util.asNode(nodeId) AS n, distance
RETURN n.name AS Name, distance AS Cost
ORDER BY Cost
"""
# The with-block closes the session even if the query raises.
with neo4j_driver.session() as neo4j_session:
    results_data = neo4j_session.run(query, person_id=person_id).data()
The error doesn't occur on the same id each time, so I'm wondering if I'm not using the python driver correctly and not clearing something up?
If not, do I really need 12GB of memory to query the graph?
What's the best way to go about diagnosing the issue?
01-06-2021 03:48 PM
First, try your query directly in the Neo4j Browser to find out whether the problem comes from your code.
Elimination is always the way to proceed with a technical problem.
Second, your quantity of data is tiny from the computer's point of view, but I'm worried about the way your query is built. There are a lot of MATCH clauses here, and each of them could be computed as a Cartesian product.
That means your data might be small, but the result of your query could be enormous.
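To give a rough feel for how fast a variable-length pattern like [r:IS_MEMBER_OF*..10] can grow, here is a minimal Python sketch. It assumes, purely for illustration, that every node has the same number of neighbours (branching is a made-up value, not something measured on your graph), and it ignores the fact that Cypher will not reuse a relationship within one path, so treat it as an upper bound only.

```python
def max_paths(branching: int, max_hops: int) -> int:
    """Upper bound on the number of paths of length 1..max_hops from one
    start node, assuming every node has exactly `branching` neighbours."""
    return sum(branching ** hops for hops in range(1, max_hops + 1))


# Hypothetical branching factors; the real value depends on your data.
for branching in (2, 3, 5):
    print(branching, max_paths(branching, 10))
```

Even with only a handful of neighbours per node, ten hops can mean millions of candidate paths per start node, and each one is held in memory while the cost is reduced.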
I don't know the gds.alpha.shortestPath.deltaStepping.stream procedure well, but I would suggest putting EXPLAIN at the beginning of your query in Neo4j Desktop to see its plan in the plan tab. Afterwards you can do the same with PROFILE to see how it behaves when it actually executes.
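If you would rather stay in Python, something like the following sketch might work. It assumes the 4.x Python driver, where session.run(...).consume() returns a summary object with plan and profile attributes; plan_for is just a hypothetical helper name, and session is whatever your driver's session() gives you.

```python
def plan_for(session, query, mode="EXPLAIN", **params):
    """Run `query` prefixed with EXPLAIN or PROFILE and return the plan.

    `session` is assumed to behave like a neo4j driver session:
    run() returns a result whose consume() yields a summary.
    """
    if mode not in ("EXPLAIN", "PROFILE"):
        raise ValueError("mode must be 'EXPLAIN' or 'PROFILE'")
    summary = session.run(f"{mode} {query}", **params).consume()
    # EXPLAIN populates summary.plan; PROFILE populates summary.profile.
    return summary.plan if mode == "EXPLAIN" else summary.profile
```

Keep in mind that PROFILE really executes the query, so it can run into the same heap error, and that the procedure call shows up as a single step in the plan. It is probably worth profiling the relationshipQuery on its own as well.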