Get all relationships within a list of nodes

emmamontarsolo
Hi! I'm currently working on a project using Python and Neo4j 3.5. I'm dealing with a large volume of data, and my goal is to find links between "important" nodes. I wrote the following query to find the shortest path between every pair of "important" nodes (considering only paths of length 1 or 2):

 

 

MATCH path = shortestPath( (n1)-[*..2]-(n2) )
WHERE n1:IMPORTANT and n2:IMPORTANT and id(n1)>id(n2)
RETURN path

 

 

To get the additional links between intermediate nodes, this result is completed with a second query:

 

 

MATCH ()-[r]-()
RETURN r

 

 

The result of the second query was then filtered in Python to keep only the relationships between the nodes obtained through the first query.
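A minimal sketch of that Python-side filter, assuming the two query results have already been reduced to plain node names and (start, end) pairs. The variable and function names are hypothetical, and the data is copied by hand from the sample graph given further down in this post rather than read from the driver:

```python
# Hypothetical stand-ins for the two query results (names from the sample graph).
path_nodes = {"Emma", "David", "Peter", "Paul", "Mary", "Jane"}  # nodes from query 1
all_relationships = [  # (start, end) pairs from query 2
    ("Emma", "David"), ("David", "Paul"), ("David", "Jane"),
    ("Paul", "Peter"), ("Paul", "Jane"), ("Mary", "Jane"), ("John", "Emma"),
]


def relationships_within(nodes, relationships):
    """Keep only the relationships whose two endpoints are both in `nodes`."""
    return [(a, b) for a, b in relationships if a in nodes and b in nodes]


kept = relationships_within(path_nodes, all_relationships)
print(kept)
```

Here only the John-Emma relationship is dropped, because John is not on any shortest path; the Paul-Jane relationship is kept.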

I'm trying to improve the code so that the result can be obtained with a single query. I wrote the following query:

 

 

MATCH path = shortestPath( (s1)-[*..2]-(s2) )
WHERE s1:IMPORTANT and s2:IMPORTANT and id(s1)>id(s2)
WITH nodes(path) as nodeslist
MATCH p = (m)-[r]-(n)
WHERE m in nodeslist AND n in nodeslist AND id(m)>id(n)
RETURN p

 

 

However, this doesn't seem to work: the links between intermediate (non-important) nodes are not returned.

Here is a small setup to reproduce the problem:

 

 

MERGE (n1:IMPORTANT {name:'Emma'})
MERGE (n2:IMPORTANT {name:'David'})
MERGE (n3:IMPORTANT {name:'Peter'})
MERGE (n4:NEUTRAL {name:'Paul'})
MERGE (n5:IMPORTANT {name:'Mary'})
MERGE (n6:NEUTRAL {name:'Jane'})
MERGE (n7:NEUTRAL {name:'John'})
MERGE (n1) - [r1:KNOWS] - (n2)
MERGE (n2) - [r2:KNOWS] - (n4)
MERGE (n2) - [r3:KNOWS] - (n6)
MERGE (n4) - [r4:KNOWS] - (n3)
MERGE (n4) - [r5:KNOWS] - (n6)
MERGE (n5) - [r6:KNOWS] - (n6)
MERGE (n7) - [r7:KNOWS] - (n1)

 

 

The full graph (edges are undirected):

[Image: complete graph]

Expected result:

[Image: expected result]

Python code:

 

 

from neo4j import GraphDatabase

# uri, username and password are assumed to be defined elsewhere
driver = GraphDatabase.driver(uri, auth=(username, password))
session = driver.session()
query = """MATCH path = shortestPath( (s1)-[*..2]-(s2) )
           WHERE s1:IMPORTANT and s2:IMPORTANT and id(s1)>id(s2)
           WITH nodes(path) as nodeslist
           MATCH p = (m)-[r]-(n)
           WHERE m in nodeslist AND n in nodeslist AND id(m)>id(n)
           RETURN p"""
# Collect every node and relationship of the returned paths into one graph object
graph = session.run(query).graph()
print([n["name"] for n in graph.nodes])
print(["-".join([n["name"] for n in r.nodes]) for r in graph.relationships])

 

Result: 6 nodes and 5 relationships (instead of 6 relationships). The edge between Paul and Jane is missing.

['David', 'Emma', 'Paul', 'Peter', 'Jane', 'Mary']
['Emma-David', 'Paul-Peter', 'David-Paul', 'David-Jane', 'Mary-Jane']

 

 

Is my query wrong? The Neo4j version is 3.5, and the neo4j Python library is 4.4.4.

1 ACCEPTED SOLUTION

I solved my issue with this query:

MATCH path = shortestPath( (s1)-[*..2]-(s2) )
WHERE s1:IMPORTANT and s2:IMPORTANT and id(s1)>id(s2)
WITH nodes(path) as nodeslist
UNWIND nodeslist as n
WITH COLLECT(DISTINCT n) as flatnodeslist
MATCH p = (m)-[r]-(n)
WHERE id(m)>id(n) AND n in flatnodeslist AND m in flatnodeslist
RETURN p

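The first MATCH returns one row per shortest path, so nodes(path) gives a separate list for each path rather than one combined list. The second MATCH was then evaluated once per row, and m and n were only ever compared within a single path. No single path contains both Jane and Paul, so their relationship was never matched. To get one combined list of nodes, I had to flatten the per-path lists with UNWIND and COLLECT(DISTINCT ...).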
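To make the difference concrete, here is a small pure-Python sketch that imitates the two queries on the sample graph. It does not talk to Neo4j; the edge set and the three path rows are copied by hand from the example, and the names EDGES, PATH_ROWS and edges_within are made up for this illustration:

```python
from itertools import combinations

# Undirected KNOWS edges of the sample graph.
EDGES = {
    frozenset(pair) for pair in [
        ("Emma", "David"), ("David", "Paul"), ("David", "Jane"),
        ("Paul", "Peter"), ("Paul", "Jane"), ("Mary", "Jane"), ("John", "Emma"),
    ]
}

# One node list per shortest path, i.e. one row of nodes(path) each.
PATH_ROWS = [
    ["Emma", "David"],
    ["David", "Paul", "Peter"],
    ["David", "Jane", "Mary"],
]


def edges_within(nodes):
    """Return the edges whose two endpoints are both in `nodes`."""
    return {frozenset(p) for p in combinations(set(nodes), 2)} & EDGES


# Original query: the second MATCH is evaluated separately for each row.
per_row = set().union(*(edges_within(row) for row in PATH_ROWS))

# Fixed query: UNWIND + COLLECT(DISTINCT ...) merges all rows into one list first.
flat_nodes = {name for row in PATH_ROWS for name in row}
flattened = edges_within(flat_nodes)

# The only edge the per-row version misses.
missing = flattened - per_row
print(sorted(tuple(sorted(e)) for e in missing))
```

The per-row version finds 5 edges and the flattened version finds 6; the difference is exactly the Jane-Paul edge, whose endpoints sit on two different paths.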


5 REPLIES

You can get the relationships along a path with relationships(path). The relationships returned carry the ids of their start and end nodes, their properties, and their type.

Thank you for your answer, but I don't see how it solves my issue.

The following query gives exactly the same result: the relationship between Jane and Paul is still missing.

MATCH path = shortestPath( (s1)-[*..2]-(s2) )
WHERE s1:IMPORTANT and s2:IMPORTANT and id(s1)>id(s2)
WITH nodes(path) as nodeslist
MATCH p = (m)-[r]-(n)
WHERE m in nodeslist AND n in nodeslist AND id(m)>id(n)
RETURN relationships(p), nodes(p)

 


I believe your solution has an error in it. If you execute the following query, you get the path results shown below, which list the nodes along each path found.

MATCH path = shortestPath( (s1)-[*..2]-(s2) )
WHERE s1:IMPORTANT and s2:IMPORTANT and id(s1)>id(s2)
return nodes(path)

[Screenshot: nodes(path) for each of the three shortest paths]

As shown, there are five relationships within these three paths. The result of your query shows six relationships, as shown below:

[Screenshot: the six relationships returned by your query]

The extra relationship is the last one, between 'Jane' and 'Paul'. It must be coming from line six in your query. That MATCH looks for relationships between any two nodes that are members of the paths, so it also picks up relationships that are not themselves part of any path.

You can get the results you want from the following query. Note that the 'distinct' is not needed for the test data, but may be needed for a more general data set.

match path = shortestPath( (s1)-[*..2]-(s2) )
where s1:IMPORTANT and s2:IMPORTANT and id(s1)>id(s2)
unwind relationships(path) as rel
with distinct rel
return startNode(rel), endNode(rel), labels(startNode(rel)), labels(endNode(rel))

[Screenshot: result of the query above]
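As a rough illustration of what unwinding relationships(path) and keeping the distinct ones amounts to, here is a pure-Python sketch on the test data. It is not driver code; PATHS and path_relationships are hypothetical names, and the three paths are copied by hand from the result above:

```python
# The three shortest paths found on the test data, as ordered node lists.
PATHS = [
    ["Emma", "David"],
    ["David", "Paul", "Peter"],
    ["David", "Jane", "Mary"],
]


def path_relationships(nodes):
    """Return the relationships along one path as undirected node pairs."""
    return [frozenset(pair) for pair in zip(nodes, nodes[1:])]


# Equivalent of: UNWIND relationships(path) AS rel WITH DISTINCT rel
distinct_rels = {rel for path in PATHS for rel in path_relationships(path)}
print(len(distinct_rels))
```

This yields only the five relationships that lie on a path, so a relationship such as Paul-Jane, which links two path nodes without being on any path, is not included.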

But as I mentioned in my first post, the point here was to get the shortest paths between important persons AND the additional relationships between the nodes returned in those shortest paths.

So in this case I want the paths Emma->David; David->Paul->Peter; David->Jane->Mary; and since Paul and Jane are both among the returned nodes and are related, I want to show the link between Paul and Jane.

As I stated in my first post, my goal was to add the relationship between Jane and Paul. The first query was sufficient to get every node in the shortest paths, but my goal was to add any other existing relationships between the returned nodes.