Import CSV to Merge Nodes

mkretsch

I have been importing some large corporate data that has a lot of duplicates. I can identify the duplicates and want to merge them with the following query:

:auto USING PERIODIC COMMIT 2000
LOAD CSV WITH HEADERS FROM 'file:///BAHAMAS_MERGE_1.csv' AS row
MATCH (a1:PanamaPapers {panamaID: row.panamaID}), (a2:BahamasLeaks {bahamasID: row.bahamasID})
WITH head(collect([a1,a2])) as nodes
CALL apoc.refactor.mergeNodes(nodes, {
  properties: "combine",
  mergeRels: true
})
YIELD node
RETURN count(*)

I receive the error: Cannot delete node<20202439>, because it still has relationships. To delete this node, you must first delete its relationships.
Is there any way to restructure this query so I do not run into this issue? It seems that once a pair is merged, the surviving node carries both the bahamasID property and the BahamasLeaks label, so the query attempts to merge it onto itself again, creating a cycle.

How do I avoid this?
