Cypher query to delete all but one of the same type edges between nodes based on property value

dg_22
Node Clone

I have pairs of nodes that sometimes have multiple edges of the same type whose property values differ. In such cases I would like to keep one edge, for instance the one with the largest value of a specific property of interest, and delete the rest. What would be the best way to do this?

It's similar to the post below, but in my case I want to keep the edge that has the max or min value of a specific property among the edges between two nodes. This needs to be applied throughout the graph, for each pair of nodes with multiple edges of the same type.

Delete all of the same relationship type between two node but still keep one of them

I tried something like the query below, but it doesn't quite work. The result value will usually be unique for each set of nodes.

MATCH (n:Node_A)-[e:edge_B]->(m:Node_C)
WITH COLLECT(e) AS edges, MAX(e.property_a) AS result
UNWIND edges as edge
WITH edge
WHERE edge.property_a <> result
DELETE edge

1 ACCEPTED SOLUTION

Your solution was missing one thing. You want to group on the 'n' and 'm' nodes, so that you collect only the edges between each pair of these nodes and calculate the max for that pair. In your implementation, you are collecting all edges 'e', regardless of nodes 'n' and 'm', and calculating the max of property_a over all edges.

I also included a condition for the case where edge.property_a is null, in case this is possible. With it, edges with a null value are always removed. Without it, these edges would not pass the predicate 'null <> result' and so would never be deleted, regardless of whether there is an edge with a non-null value. Remove the condition if it is not applicable or wanted.

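MATCH (n:Node_A)-[e:edge_B]->(m:Node_C)
// Grouping on n and m makes COLLECT and MAX work per pair of nodes, not over the whole graph
WITH n, m, COLLECT(e) AS edges, MAX(e.property_a) AS result
UNWIND edges AS edge
WITH edge, result
// Delete every edge that differs from its pair's max, plus any edge whose property is null
WHERE edge.property_a <> result OR edge.property_a IS NULL
DELETE edge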
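
One case the query above does not cover is a tie: if two edges between the same pair share the maximum value, neither satisfies 'edge.property_a <> result', so both are kept. A minimal sketch of one way to keep exactly one edge per pair is to sort the edges and delete all but the first. It reuses the labels and property name from the question and assumes that collect() receives rows in the order set by the preceding ORDER BY, which is a commonly used pattern but worth verifying on your Neo4j version.

```cypher
// Sketch: keep exactly one edge_B per (n, m) pair, the one with the largest property_a.
MATCH (n:Node_A)-[e:edge_B]->(m:Node_C)
// Sort non-null values first (false sorts before true), then largest value first.
WITH n, m, e
ORDER BY e.property_a IS NULL, e.property_a DESC
// Collect per pair; this assumes collect() sees the rows in the sorted order.
WITH n, m, collect(e) AS edges
// tail() drops the first (kept) edge; every remaining edge is deleted.
UNWIND tail(edges) AS edge
DELETE edge
```

To keep the minimum instead, change DESC to ASC. Unlike the query above, this variant keeps one edge even when every edge between a pair has a null property_a, so add a filter if those should be removed entirely.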

2 REPLIES

dg_22
Node Clone

Great, thank you very much. I appreciate it!