05-20-2021 01:28 PM
I have been importing some large corporate data that contains a lot of duplicates. I can identify the duplicates and want to merge them with the following query:
:auto USING PERIODIC COMMIT 2000
LOAD CSV WITH HEADERS FROM 'file:///BAHAMAS_MERGE_1.csv' AS row
MATCH (a1:PanamaPapers {panamaID: row.panamaID}), (a2:BahamasLeaks {bahamasID: row.bahamasID})
WITH head(collect([a1,a2])) as nodes
CALL apoc.refactor.mergeNodes(nodes,{
properties:"combine",
mergeRels:true
})
YIELD node
RETURN count(*)
I receive the error: Cannot delete node<20202439>, because it still has relationships. To delete this node, you must first delete its relationships.
Is there any way to restructure this so I do not run into this issue? After a merge, it seems like the merged node now has both the bahamasID property and the BahamasLeaks label, so it attempts to merge onto itself again, creating a loop.
How do I avoid this?