cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Get all Parent Child pairs from a Tree starting at a certain Node all the way down

Hello, I'm new to cypher and currently stuck on a problem that it seems to me ought to be quite straight forward, I suspect it might be a conceptual issue but I'm not sure, any help greatly appreciated.

We've built a graph of nodes. Some of these are "Leaf" nodes and their "tree" begins with the "Root" node. Apart from the Root, each Leaf node (a , b, c, d, etc) in the tree has one Parent Leaf, and 0..n children.

The tree is of unknown depth.

(parent:Leaf) –[PARENT_OF]->(child:Leaf)

There are other types of node, and other types of Link, but for this puzzle I need to just work with the Leafs and PARENT_OF

I want to return a table of pairs: ParentID, ParentName, ChildID, ChildName

This code does it quite nicely:

MATCH (parent:Leaf) -[PARENT_OF]->(child:Leaf)
RETURN ID(parent) AS ParentID, parent.Name, ID(child) AS ChildID, child.Name
ORDER BY parent.Name ASC, child.Name ASC

Result:

ParentIDparent.NameChildIDchild.Name
219Root194a
219Root195b
194a193c
194a192d
195b261j

But things get really complicated when I try and get just a section of the tree, ie: I want to pick a Leaf and get that Leaf and all the Pairs beneath it in the tree at any depth.

I've tried this:

MATCH (parent:Leaf) -[PARENT_OF]->(child:Leaf)
WHERE parent.Name = "Root"
RETURN ID(parent) AS ParentID, parent.Name, ID(child) AS ChildID, child.Name
ORDER BY parent.Name ASC, child.Name ASC

(here we've begun at the Root, which ought to return the whole tree, but in practice I'd want to specify a different starting node each time)

And it just gives the children of the Root.

ParentIDparent.NameChildIDchild.Name
219Root194a
219Root195b

I've tried this:

MATCH (startleaf:Leaf) -[sp:PARENT_OF]->(parent:Leaf) -[pc:PARENT_OF]->(child:Leaf)
WHERE startleaf.Name = "Root"
RETURN ID(parent) AS ParentID, parent.Name, ID(child) AS ChildID, child.Name
ORDER BY parent.Name ASC, child.Name ASC

But we no longer get the Start Leaf, and we only get 2 levels down.

ParentIDparent.NameChildIDchild.Name
194a193c
194a192d
195b261j
195b262k

Now, obviously we can't go on adding (x)-[PARENT_OF]>(y) to cope with any depth, and because we're looking at an unknown depth, it seems to me that the wildcard might help:

MATCH (startleaf:Leaf) -[sp:PARENT_OF *0..1]->(parent:Leaf) -[pc:PARENT_OF *]->(child:Leaf)
WHERE startleaf.Name = "Root"
RETURN ID(parent) AS ParentID, parent.Name, ID(child) AS ChildID, child.Name
ORDER BY parent.Name ASC, child.Name ASC

But no:

Firstly this produces far too many rows, it's a row from each leaf to EVERY leaf below it, not just its immediate children.

Secondly, cypher complains at me and says that including the link type with a wild card, eg: [pc:PARENT_OF *] or [PARENT_OF *] is frowned upon and will suddenly be snatched away in a future version, so I don't want to use that.

(In real life there are other nodes types hanging around , and other relation types too so I do need to make sure I'm just dealing the Leafs connected by PARENT_OF)

My research is currently taking me in the direction that I need to:

a) Allow for the return of many rows with variable path length and link types (which seem to come back as lists of relationships?)

b) break each row apart, somehow check each row to see that it's only 1 link long and the link is PARENT_OF

c) only return the rows that meet the criteria

If that is the case, then any pointers on how to do that would be good.

This doesn't work, but is it something along these lines:

 

MATCH path = (startleaf:Leaf {Name:'Root'})-[*0..1]-> (parent:Leaf)-[*]->(child:Leaf)
WITH
[node in nodes(path) WHERE "Leaf" IN labels(node) | node] as leafs,
length(path) as PathLength,
[relationship in relationships(path) ] as PathRels,
[relationship in relationships(path) | type(relationship)] as RelType
WHERE RelType <> "" AND PathLength =1
RETURN
[leaf IN leafs | ID(leaf)] AS ParentID, [leaf IN leafs | leaf.Name] AS ParentName, 
PathLength,  PathRels, RelType

But it seems to me that really this problem shouldn't be this complicated, I suspect I'm missing something obvious.

What is the best way to:

Specify a starting node in a tree...and then get that node's children...and their children...all the way down ...expressed as a nice table of Parent-Child pairs:

 

ParentIDparent.NameChildIDchild.Name
219Root194a
219Root195b
194a193c
194a192d
195b261j

Thank you for your help.

 

 

1 ACCEPTED SOLUTION

Hi @Charlie_B ,

Next query should do the work. Install APOC if you haven't, and check the relationshipFilter behavior. Lemme know how it goes.

MATCH(l:Leaf)
WHERE l.property = $propertyVal //Inster here your initial node logic
with l
call apoc.path.subgraphAll(l, {
    relationshipFilter : "PARENT_OF>"
}) yield relationships
UNWIND  relationships as r
with startNode(r) as s, endNode(r) as e
return ID(s) AS ParentID, s.Name, ID(cehild) AS ChildID, e.Name

Bennu

Oh, y’all wanted a twist, ey?

View solution in original post

2 REPLIES 2

Frankly, I believe cypher is not designed for your problem. As you observed, it is great for finding paths. It is not a scripting language and it doesn’t support recursive algorithms. 

you could write a custom procedure to do exactly what you want. It is straight forward if you know Java. 

the other possibility is looking at the apoc library. The following procedure may be applicable. 

https://neo4j.com/labs/apoc/4.1/overview/apoc.path/apoc.path.subgraphAll/

This is something else you could look at:

https://neo4j.com/docs/graph-data-science/current/algorithms/pregel-api/

Hi @Charlie_B ,

Next query should do the work. Install APOC if you haven't, and check the relationshipFilter behavior. Lemme know how it goes.

MATCH(l:Leaf)
WHERE l.property = $propertyVal //Inster here your initial node logic
with l
call apoc.path.subgraphAll(l, {
    relationshipFilter : "PARENT_OF>"
}) yield relationships
UNWIND  relationships as r
with startNode(r) as s, endNode(r) as e
return ID(s) AS ParentID, s.Name, ID(cehild) AS ChildID, e.Name

Bennu

Oh, y’all wanted a twist, ey?