cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Finding a MRCA when there are intermediate MRCAs

genealogy
Graph Buddy

I have a set of nodes and edges created thus ...

create (n1:Item{rn:1})
create (n2:Item{rn:2})
create (n3:Item{rn:3})
create (n4:Item{rn:4})
create (n5:Item{rn:5})
create (n6:Item{rn:6})
create (n7:Item{rn:7})
create (n8:Item{rn:8})
create (n9:Item{rn:9})

merge (n1)-[:parent]->(n3)
merge (n2)-[:parent]->(n3)
merge (n3)-[:parent]->(n7)
merge (n4)-[:parent]->(n6)
merge (n5)-[:parent]->(n6)
merge (n6)-[:parent]->(n8)
merge (n7)-[:parent]->(n9)
merge (n8)-[:parent]->(n9)

The graph looks like this:

I then run this query to get all the MRCAs for the node pair combinations:

match p=(n1:Item)-[:parent*0..5]->(MRCA:Item)<-[:parent*0..5]-(n2:Item) where n1.rn<n2.rn and n2.rn<>MRCA.rn
with n1,n2,MRCA,reduce(s='',x in nodes(p)|s + case when x.rn=MRCA.rn then '*' + x.rn + '*' else x.rn end  + ':') as path
return n1.rn,n2.rn,MRCA.rn,path order by n1.rn

But my desire is to get the ancestor shared by all the Item nodes (e.g., 9). Is there a query that will do this?

Dave

4 REPLIES 4

paulare
Graph Buddy

Maybe this helps - i filtered out intermediate MRCAs that have only 2 nodes.

MATCH p=(n1:Item)-[:parent*0..5]->(MRCA:Item)<-[:parent*0..5]-(n2:Item) 
WHERE n1.rn<n2.rn and n2.rn<>MRCA.rn
WITH apoc.coll.union(collect(distinct n1.rn), collect(distinct n2.rn)) as NodesShareMRCA ,MRCA
WITH NodesShareMRCA, size(NodesShareMRCA) as SizeNodesShareMRCA, MRCA
WHERE SizeNodesShareMRCA > 2
RETURN NodesShareMRCA, MRCA.rn
ORDER by SizeNodesShareMRCA DESC

╒═════════════════╤═════════╕
│"NodesShareMRCA" │"MRCA.rn"│
╞═════════════════╪═════════╡
│[1,2,3,4,5,6,7,8]│9        │
└─────────────────┴─────────┘

genealogy
Graph Buddy

Excellent. Solves the problem!

genealogy
Graph Buddy

While the initial solution solved the initial problem, the problem was too simple. In the real world there are often more than two child nodes and the number is not uniform. Thus, the filter proposed only works in the specific example.

We can restate the problem. Create the nodes and edges:

create (n1:Item{rn:1})
create (n2:Item{rn:2})
create (n3:Item{rn:3})
create (n4:Item{rn:4})
create (n5:Item{rn:5})
create (n6:Item{rn:6})
create (n7:Item{rn:7})
create (n8:Item{rn:8})
create (n9:Item{rn:9})
create (n10:Item{rn:10})
create (n11:Item{rn:11})
create (n12:Item{rn:12})
create (n13:Item{rn:13})
create (n14:Item{rn:14})

merge (n1)-[:parent]->(n3)
merge (n2)-[:parent]->(n3)
merge (n3)-[:parent]->(n7)
merge (n4)-[:parent]->(n6)
merge (n5)-[:parent]->(n6)
merge (n6)-[:parent]->(n8)
merge (n7)-[:parent]->(n9)
merge (n8)-[:parent]->(n9)
merge (n10)-[:parent]->(n3)
merge (n11)-[:parent]->(n3)
merge (n12)-[:parent]->(n6)
merge (n13)-[:parent]->(n8)
merge (n9)-[:parent]->(n14)

this produces this graph:

Is the q query that produces the MRCA (rn=9) for the terminal child nodes 1,2,4,5,10,11,12, and 13?

genealogy
Graph Buddy

This query produces the desired result:

//starting list of terminal nodes
match (c:Item) where c.rn in [1,2,4,5,10,11,12]
with c order by c.rn
with collect(distinct c.rn) as cc
//get path to ancestors of these nodes
match p=(c2:Item)-[:parent*0..9]->(MRCA:Item) where c2.rn in cc
//get path lengths to ancestors and order descendants
with MRCA,cc,c2,length(p) as PL order by c2.rn
//collect ordered descendants and order by path lenght 
with MRCA,cc,PL,collect(distinct c2.rn) as cc2 order by PL 
//filter for descendant list that is the same as the starting list and collect MRCAs
with cc,cc2,collect(MRCA.rn) as MRCAs  where cc2=cc
//return the MRCA with the shortest path length
return head(MRCAs) as mrca

However, this may require additional list filtering if the initial list does not include all the descendants of the final mrca.

Nodes 2022
Nodes
NODES 2022, Neo4j Online Education Summit

All the sessions of the conference are now available online