03-26-2021 05:00 PM
MATCH (n:Product)
MATCH (m:Product)
WHERE n.name = m.name AND NOT n.id = m.id
DETACH DELETE n
I have nodes with the same name that were created with different custom ids, which shouldn't happen. I want to find those nodes and delete one node of each pair. The two nodes in a pair should be identical, so I only want to keep one of them. My command above deletes both, because every pair matches twice: each node appears once as n and once as m. Is there a way to modify the query so that it deletes only one?
03-26-2021 07:47 PM
This is the data.
CREATE (:Product {id:0, name: 'A'}),
(:Product {id:1, name: 'A'}),
(:Product {id:2, name: 'B'}),
(:Product {id:3, name: 'B'}),
(:Product {id:4, name: 'C'});
This Cypher deletes every node that shares its name with another node, except the first one collected for each name.
The code may not be elegant, but it works correctly.
MATCH (n:Product)
MATCH (m:Product)
WHERE n.name = m.name
AND NOT n.id = m.id
WITH n.name AS name, collect(n.id)[0] AS firstNodeId
MATCH (n:Product)
WHERE n.name = name
AND n.id <> firstNodeId
DETACH DELETE n;
03-26-2021 07:53 PM
NOT TESTED
First, we retrieve all the distinct names in the database:
MATCH (n)
WITH DISTINCT n.name AS name
Second, we match the group of nodes for each name that has duplicates, and we delete every node in the group except the first:
MATCH (n {name:name}) WHERE count(n) > 1
WITH n SKIP 1
DETACH DELETE n
Both statements must be run together as a single query when you paste them into Neo4j Desktop. An APOC procedure for this purpose very likely exists; these are generally shorter and more efficient, but less human-friendly to write and read.
03-27-2021 10:48 AM
@tard.gabriel I tried it, but it doesn't seem to work:
MATCH (n)
WITH DISTINCT n.name AS Name
MATCH (n {name:Name}) WHERE count(n) > 1
WITH n SKIP 1
DETACH DELETE n
Invalid use of aggregating function count(...) in this context (line 3, column 29 (offset: 67))
"MATCH (n {name:Name}) WHERE count(n) > 1"
^
03-27-2021 11:34 AM
Just to chime in.
Could the original Cypher query be tweaked a bit to compare the ids, so that only one of the two combinations can be true?
MATCH (n:Product)
MATCH (m:Product)
WHERE n.name = m.name AND n.id > m.id
DETACH DELETE n
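A cautious way to check this before deleting anything is a dry run that only returns the nodes the comparison would select. This is a sketch that assumes the sample data posted earlier in the thread (ids 0 to 4) and that id is unique per node; the column names idToDelete and name are just illustrative aliases.

```cypher
// Dry run: list the nodes that the n.id > m.id comparison would delete.
// For each name, every node except the one with the smallest id matches.
MATCH (n:Product)
MATCH (m:Product)
WHERE n.name = m.name AND n.id > m.id
RETURN DISTINCT n.id AS idToDelete, n.name AS name
ORDER BY name, idToDelete;
```

With that sample data this should list ids 1 and 3, leaving 0, 2 and 4 in place. If a name has three or more copies, the same node can match several times, hence the DISTINCT.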
Andy
03-27-2021 12:35 PM
TESTED
It's the shortest, sweetest and prettiest solution I could come up with.
I think the DISTINCT operator is optional in this case, since collect() already groups the rows by name, but I'm not sure.
MATCH (n)
WITH DISTINCT n.name AS name, collect(n) AS nodes
FOREACH (n IN tail(nodes) | DETACH DELETE n )
Keep in mind that the best solution is always to avoid creating duplicates in the first place, by defining constraints before importing or creating any data.
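As a sketch of what such a constraint could look like, assuming that name is meant to be unique for Product nodes (the thread does not say whether name or id is the intended key) and using the hypothetical constraint name product_name_unique. The syntax below is the one used by recent Neo4j versions; older 4.x releases used CREATE CONSTRAINT ... ON (p:Product) ASSERT p.name IS UNIQUE instead. The constraint can only be created once the existing duplicates are gone.

```cypher
// Assumption: Product.name should be unique across all Product nodes.
CREATE CONSTRAINT product_name_unique IF NOT EXISTS
FOR (p:Product) REQUIRE p.name IS UNIQUE;

// MERGE reuses an existing node with this name instead of creating a second one.
// The id value 5 is only an illustrative placeholder.
MERGE (p:Product {name: 'A'})
ON CREATE SET p.id = 5;
```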
By the way, a message for the APOC developers: it would be great to have a function that removes duplicates based on a node or relationship property value, and not only on the whole thing.
*tail returns every element of a list except the first one
*collect builds a list from all the matching nodes, in this case one list per name
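A slightly more defensive variant of the same idea, offered as an untested sketch: it restricts the match to Product nodes, skips nodes without a name (a bare MATCH (n) would group those together under null), and sorts by id first so that the lowest id is the one kept. It assumes every Product has an id, and that collect() keeps the order produced by the preceding ORDER BY, which is the usual behaviour but worth verifying on your version. The variable name dup is only illustrative.

```cypher
// Keep the Product with the lowest id for each name and delete the rest.
MATCH (n:Product)
WHERE n.name IS NOT NULL
WITH n ORDER BY n.id
// One row per name, with that name's nodes collected into a list.
WITH n.name AS name, collect(n) AS nodes
// Only names that actually have duplicates.
WHERE size(nodes) > 1
// tail(nodes) is every node except the first, i.e. the duplicates.
FOREACH (dup IN tail(nodes) | DETACH DELETE dup);
```

Swapping the FOREACH line for RETURN name, [x IN tail(nodes) | x.id] AS idsToDelete gives a preview without deleting anything.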
If you enjoyed this solution, please check the solution box; this would help me provide more solutions in the future.