03-08-2022 04:18 AM
Hi,
I have a database with 20 million nodes and 10 million relationships. I want to merge nodes that share the same value of the code property.
My Cypher query looks like this:
CALL apoc.periodic.iterate("
MATCH (n:Person) with distinct n.code as props return props
","
UNWIND props as prop
CALL{
WITH prop
MATCH (n:Person {code:prop})
with COLLECT(n) AS ns, count(n) as cn where cn > 1
CALL apoc.refactor.mergeNodes(ns, {properties:{OtherCodes:'combine', `.*`: 'overwrite'}})
YIELD node RETURN node as s
}WITH s
RETURN s;
", {batchSize:10000, parallel:true, iterateList:true});
But it does nothing: no errors are reported, and no nodes are processed.
I am using Neo4j 4.3.6.
03-10-2022 12:33 PM
Remove the UNWIND from your second statement.
I presume you have an index on :Person(code)?
You don't need the subquery.
How many people with the same code do you have: 10, 100, 10,000?
CALL apoc.periodic.iterate("
MATCH (n:Person) return distinct n.code as prop
","
MATCH (n:Person {code:prop})
with prop, COLLECT(n) AS ns, count(n) as cn where cn > 1
CALL apoc.refactor.mergeNodes(ns, {properties:{OtherCodes:'combine', `.*`: 'overwrite'}})
YIELD node
RETURN count(*)
", {batchSize:100, parallel:true, iterateList:true});
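To answer the question about group sizes, a query along these lines could list the largest duplicate groups before any merge is attempted. This is a minimal sketch, assuming the :Person label and code property from the question; the LIMIT of 10 is an arbitrary choice for illustration.

```cypher
// Count how many :Person nodes share each code value
MATCH (n:Person)
WITH n.code AS code, count(*) AS cn
// Keep only codes that occur more than once
WHERE cn > 1
RETURN code, cn
// Show the largest groups first
ORDER BY cn DESC
LIMIT 10;
```

The cn column shows how many nodes a single apoc.refactor.mergeNodes call would receive for each code.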
03-13-2022 05:31 AM
Thanks for your answer.
Yes, I have an index on :Person(code).
The number of people with the same code is about 100,000.
03-13-2022 05:48 AM
I have a similar problem with relationships.
Because of the error "All Relationships must have the same start and end nodes.", I wrote a function in my plugin that groups relationships by start and end node, and it works fine.
My Cypher query looks like this:
CALL apoc.periodic.iterate("
MATCH (s:Person)-[r:Work]-(t:Office) WITH COLLECT(r) as lrs RETURN lrs
","
with customPlugin.relations.groupByStartAndEnd(lrs) as grs
UNWIND grs as gr
CALL apoc.refactor.mergeRelationships(gr) YIELD rel RETURN rel
", {batchSize:500, parallel:true, iterateList:true});
But when I use apoc.refactor.mergeRelationships, after a while I see the following error in the logs:
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "neo4j.Scheduler-1"
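One thing that stands out in the query above is that COLLECT(r) without a grouping key returns a single row holding every relationship, so batchSize has no effect and the whole list is held in memory at once, which may contribute to the OutOfMemoryError. A possible alternative, shown here only as a sketch, is to group by start and end node in Cypher itself so that each row carries one small list. It assumes the Work relationships point from Person to Office (the original pattern is undirected), and it sets parallel:false as a cautious choice to avoid possible lock contention between batches; neither assumption comes from the thread.

```cypher
CALL apoc.periodic.iterate("
// Assumption: Work relationships run from Person to Office
MATCH (s:Person)-[r:Work]->(t:Office)
// Group relationships by their start and end node
WITH s, t, collect(r) AS rels
// Only pairs with duplicate relationships need merging
WHERE size(rels) > 1
RETURN rels
","
// Each row now holds relationships that share start and end nodes
CALL apoc.refactor.mergeRelationships(rels) YIELD rel
RETURN count(*)
", {batchSize:500, parallel:false});
```

The grouping is still an aggregation over all matched relationships, so memory use should be checked on a subset of the data first.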