01-27-2020 10:11 AM
Hi,
I am trying to add new edges between nodes which have paths of length 2. This is what I did:
Match path=((a:person)-[*2]-(b:person))
With a, b, Count(path) as weight
Merge (a)-[e:co_authors]->(b)
Set e.weight=weight
My database has 100001 person nodes, and I found that there are 37817286 such paths of length 2 between person nodes.
I get an out-of-memory error:
Neo.TransientError.General.OutOfMemoryError: There is not enough memory to perform the current task. Please try increasing 'dbms.memory.heap.max_size' in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation increase the heap by using '-Xmx' command line flag, and then restart the database.
How do I fix the memory heap size?
Thanks,
Lavanya
01-27-2020 10:20 AM
You'll likely want to batch your writes via APOC Procedures.
You may also want to add a predicate to prevent mirrored results (which would result in two relationships being created per pairing).
For example, maybe something like this:
CALL apoc.periodic.iterate("MATCH (a:person) RETURN a",
"MATCH path = (a)-[*2]-(b:person)
WHERE id(a) < id(b)
WITH a, b, count(path) as weight
MERGE (a)-[e:co_authors]->(b)
SET e.weight=weight", {batchSize:5000}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
This will process persons in batches of 5000 at a time, though you may need to adjust batchSize depending on the average number of co-author relationships you expect per person.
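To see why the id(a) < id(b) predicate matters, here is a small Python model of the two-hop match. It is only a sketch: the toy graph, the node names, and the integer ids are made up for illustration, and it says nothing about how Neo4j actually executes the query.

```python
from collections import Counter

# Hypothetical toy graph: persons p1..p3 linked through papers x and y.
edges = [("p1", "x"), ("p2", "x"), ("p1", "y"), ("p2", "y"), ("p3", "y")]
persons = {"p1": 1, "p2": 2, "p3": 3}  # name -> stand-in for id(node)

# Build an undirected adjacency map, since the pattern ignores direction.
neighbours = {}
for u, v in edges:
    neighbours.setdefault(u, set()).add(v)
    neighbours.setdefault(v, set()).add(u)


def two_hop_counts(dedupe):
    """Count two-hop paths a-m-b between distinct persons, keyed by (a, b)."""
    counts = Counter()
    for a in persons:
        for m in neighbours[a]:
            for b in neighbours[m]:
                if b == a or b not in persons:
                    continue
                # Model of WHERE id(a) < id(b): keep one direction per pair.
                if dedupe and not persons[a] < persons[b]:
                    continue
                counts[(a, b)] += 1
    return counts


mirrored = two_hop_counts(dedupe=False)
deduped = two_hop_counts(dedupe=True)
print(len(mirrored), len(deduped))
```

Without the predicate every pair shows up once in each direction, so the model ends up with twice as many (a, b) keys, which corresponds to two relationships per pairing.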
01-28-2020 12:39 PM
Thanks for the response. I checked the code on my big graph and realised that (since there are no multiple edges)
CALL apoc.periodic.iterate("MATCH (a:person) RETURN a",
"MATCH path = (a)-[*2]-(b:person)
WHERE id(a) < id(b)
WITH a, b, length(path) as pathlength
MERGE (a)-[e:co_authors]->(b)
SET e.weight=pathlength", {batchSize:5000}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
adds one edge between persons for each path of length 2 between them. This is what I wanted, although I posed the question differently:
add one edge between persons if there is a path of length 2 between them.
01-28-2020 12:44 PM
I think there's something wrong with that query. You have WITH a, b, length(path) as pathlength, but because you're using *2 for your var-length pattern, the length will always be 2.
Note that in the previous version of the query you were using count(path), which is the number of paths found between the two nodes. It is also an aggregation function, meaning the non-aggregation variables become distinct, which would fix your cardinality problem (when using count(path), you will only ever get 1 row for each pair of a and b nodes).
If length(path) is really how you want to calculate the weight, then you will need a different way to ensure a and b are distinct:
WITH DISTINCT a, b, length(path) as pathlength
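The difference in row counts between the three WITH variants can be sketched in Python. This is just a model of the row semantics with made-up rows (the names p1, p2, p3, x, y are hypothetical), not anything Neo4j-specific.

```python
from collections import Counter

# Hypothetical matched rows: (a, b, path), each path given as its node sequence.
rows = [
    ("p1", "p2", ("p1", "x", "p2")),
    ("p1", "p2", ("p1", "y", "p2")),
    ("p1", "p3", ("p1", "y", "p3")),
]

# WITH a, b, count(path) AS weight: a and b act as the grouping key.
by_count = Counter((a, b) for a, b, _ in rows)

# WITH a, b, length(path) AS pathlength: no aggregation, one row per path.
# Path length is the number of relationships, i.e. nodes minus one.
per_path = [(a, b, len(path) - 1) for a, b, path in rows]

# WITH DISTINCT a, b, length(path) AS pathlength: identical rows collapse.
distinct = sorted(set(per_path))

print(dict(by_count))
print(per_path)
print(distinct)
```

So count(path) and DISTINCT both leave one row per pair, while the plain length(path) version keeps one row per path, each with a length of 2.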
01-28-2020 12:49 PM
I am trying to replace each path of length 2 with an edge, so I will have multiple rows with the same a and b. That's why I am skipping the word "DISTINCT".
01-28-2020 12:51 PM
Ah, I misread your last update then, my mistake.
01-28-2020 01:07 PM
I think there is still an issue with
MATCH path = (a:person)-[*2]-(b:person)
WHERE id(a) < id(b)
WITH a, b, path, length(path) as pathlength
MERGE (a)-[e:co_authors]->(b)
SET e.weight=pathlength
since it is still not creating a unique edge for each path of length 2.
01-28-2020 01:11 PM
Ah, you need to use CREATE instead of MERGE for this, otherwise it will find and use the existing relationship and not create a new one.
Looks like you found that a second before me, so you're all set!
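A rough way to picture the difference is the Python sketch below. The helpers merge_rel and create_rel are hypothetical stand-ins for the two clauses, and the rows are made up; the model only captures the "reuse if it already exists" behaviour, nothing else about MERGE.

```python
def merge_rel(rels, a, b, rel_type):
    """Model of MERGE (a)-[:TYPE]->(b): reuse a matching relationship if one exists."""
    for rel in rels:
        if (rel["start"], rel["end"], rel["type"]) == (a, b, rel_type):
            return rel
    rel = {"start": a, "end": b, "type": rel_type}
    rels.append(rel)
    return rel


def create_rel(rels, a, b, rel_type):
    """Model of CREATE (a)-[:TYPE]->(b): always add a new relationship."""
    rel = {"start": a, "end": b, "type": rel_type}
    rels.append(rel)
    return rel


# Hypothetical input rows, one per path: (a, b, pathlength).
rows = [("p1", "p2", 2), ("p1", "p2", 2), ("p1", "p3", 2)]

merged, created = [], []
for a, b, pathlength in rows:
    merge_rel(merged, a, b, "co_authors")["weight"] = pathlength
    create_rel(created, a, b, "co_authors")["weight"] = pathlength

print(len(merged), len(created))
```

In this model the MERGE version ends up with one relationship per (a, b) pair no matter how many rows there are, while the CREATE version ends up with one relationship per row.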
01-28-2020 01:10 PM
Got it now at last with:
MATCH path = (a:person)-[*2]-(b:person)
WHERE id(a) < id(b)
WITH a, b, path, length(path) as pathlength
CREATE (a)-[e:co_authors]->(b)
SET e.weight=pathlength
There's a world of difference between "merge" and "create".