Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
12-29-2022 07:22 PM - edited 12-29-2022 07:28 PM
Forgive me if this is not the correct place to ask a question about cypher queries. I'm extremely new to neo4j and am curious if anyone has solved this problem before.
I've got a graph of parent/child relationships which form a tree of Person nodes. Each Person node has a property Name. Given a known list of Names, I need to test for the existence of a path where each string in the List represents a node in the path. Here's a conceptual hierarchy:
[Arthur]→[Bob]→[Charlie]→[Dave]
↳[Betty]→[Cheryl]
And the query cases I'm trying to solve for:
With ["Arthur", "Bob", "Charlie", "Dave"] AS Parentage
Match (p:Person WHERE p.Name="Arthur")
// ... somehow, return only the "Dave" node
With ["Arthur", "Betty"] AS Parentage
Match (p:Person WHERE p.Name="Arthur")
// ... somehow, return only the "Betty" node
With ["Arthur", "Mark"] AS Parentage
Match (p:Person WHERE p.Name="Arthur")
// ... somehow, don't return anything (because "Arthur" has no CHILD "Mark")
With ["Arthur", "Cheryl"] AS Parentage
Match (p:Person WHERE p.Name="Arthur")
// ... somehow, don't return anything (Note that "Cheryl" is not a direct CHILD of "Arthur")
Are the above scenarios possible to do as a Cypher query? I know that if I do something like this:
Match (p:Person WHERE p.Name="Arthur")-[r:CHILD*]->(b:Person)
RETURN r, a, b
It would provide me with all of the nodes and relationships, which I could then use application code to filter. But that seems like it could get expensive as nodes grow.
Solved! Go to Solution.
12-30-2022 01:14 PM
I think I understand. You want to get the leaf node of a path exactly described by a list of names. The path has to be in the same order as the names. I didn't know if it could or couldn't be longer than the list, so I provided both solutions.
leaf does not have to be a terminal node, i.e. the path found can have child nodes beyond the leaf.
With ["Arthur", "Bob", "Charlie", "Dave", "Eric", "Charlie"] AS Parentage
match (p:Person {Name: head(Parentage)})
match path = (p)-[:CHILD*]->(leaf:Person{Name: Parentage[-1]})
where length(path)+1 = size(Parentage)
and all(i in range(0, size(Parentage)-1) where nodes(path)[i].Name = Parentage[i])
return leaf
leaf node is required to be a terminal node:
With ["Arthur", "Bob", "Charlie", "Dave", "Eric", "Charlie"] AS Parentage
match (p:Person {Name: head(Parentage)})
match path = (p)-[:CHILD*]->(leaf:Person{Name: Parentage[-1]})
where length(path)+1 = size(Parentage)
and not exists((leaf)-[:CHILD]->(:Person))
and all(i in range(0, size(Parentage)-1) where nodes(path)[i].Name = Parentage[i])
return leaf
12-29-2022 07:26 PM
So I tried to get cute with apoc and the json functions. I added a "NamedChild" property to the relationship properties and tried to add a predicate to the relationship like so:
With ["Arthur", "Bob", "Charlie", "Dave"] AS Parentage
MATCH (p:Person WHERE p.Name = Parentage[0])-[r:CHILD* WHERE apoc.json.path(apoc.convert.toJson(r), "$..properties.NamedChild", []) = RequestPath[1..]]->(b:Person)
RETURN a, b, r
But it returned an error:
Relationship pattern predicates are not supported for variable-length relationships.
I do think the relationship properties is the angle to approach for this type of path finding... But I'm still pretty fresh to Cypher so I'm missing the mark a little bit.
12-29-2022 09:53 PM
I created this:
Query 1:
With ["Bob", "Charlie", "Dave"] as s1
Match (p:Person WHERE p.name="Arthur")-[r:CHILD*]->(b:Person)
where b.name in s1
return p, r, b
Query 2:
With ["Bobby", "Jane", "Bob"] as s1
Match (p:Person WHERE p.name="Arthur")-[r:CHILD*]->(b:Person)
where b.name in s1
return p, r, b
12-30-2022 09:04 AM
Thanks for the quick reply @ameyasoft , but it doesn't necessarily solve the problem. The where clause you wrote assumes uniqueness of Name properties which is not guaranteed (I should have been more specific). There could exist two persons with the same name:
↱[Alexandria]→[Beatrice]→[Caroline]→[Dave]
[Arthur]→[Bob]→[Charlie]→[Dave]
↳[Betty]→[Cheryl]
Or even a grandchild/descendant might be named for a grandparent:
↱[Alexandria]→[Beatrice]→[Caroline]→[Dave]
[Arthur]→[Bob]→[Charlie]→[Dave]→[Eric]→[Charlie]→[Arthur]
↳[Betty]→[Cheryl]→[Arthur]
In these cases, we'd have something like:
With ["Arthur", "Bob", "Charlie", "Dave", "Eric", "Charlie"] AS Parentage
Match (p:Person WHERE p.Name="Arthur")
// ... somehow, return only the "Charlie" node leaf, not the "Charlie" in the middle of the path
With ["Arthur", "Betty", "Cheryl", "Arthur"] AS Parentage
Match (p:Person WHERE p.Name="Arthur") // ... This initial match is overly simplified, please assume more than just name match
// ... somehow, return only the "Arthur" node leaf that is a child of "Cheryl", not the "Arthur" root nor the "Arthur" that is a child of the second "Charlie"
I'm starting to think there might not be a "pure cypher" way of doing this, it may just be that I have to return all of the matching nodes and then in code visit all of them to figure out which one matches the path I'm looking for.
12-30-2022 12:16 PM
You can ensure a node is a leaf node by requiring the node not to have an outgoing relations as a non-leaf node would. Try the following:
with "Arthur" as rootNode, "Charlie" as leafNode
match (p:Person {Name: rootNode})
match path=(p)-[:CHILD*]->(b:Person{Name: leafNode})
where not exists((b)-[:CHILD]->(b:Person))
return path
12-30-2022 12:34 PM
It's not really an "ensure I get the leaf" problem. It's not even a path finding problem. I'm provided with a path. It's more of a path validation problem... "Does this given path exist, and if so return the leaf of the provided path".
The Parentage List is the important part, it describes a list of Name properties for each Node in the known path:
With ["Arthur", "Bob", "Charlie", "Dave", "Eric", "Charlie"] AS Parentage
Starting from "Arthur" as the root node, check all of Arthur's children and if he has a child "Bob" continue.
Then with "Bob", check all of Bob's children and look for a child "Charlie". If Bob has a child "Charlie", continue.
Then with "Charlie", check all of Charlie's children and look for a child "Dave", if "Dave" exists continue...
Parentage[0] = Root Node
Parentage[1] = 1st level descendant from Root Node
Parentage[2] = 2nd level descendant from Root Node, direct descendant of Parentage[1]
Parentage[n] = nth level descendant from Root node, direct descendant of Parentage[n-1]
The indices of the Parentage list represent node relationships, the values of the List represent the property selectors for those relationships.
12-30-2022 01:14 PM
I think I understand. You want to get the leaf node of a path exactly described by a list of names. The path has to be in the same order as the names. I didn't know if it could or couldn't be longer than the list, so I provided both solutions.
leaf does not have to be a terminal node, i.e. the path found can have child nodes beyond the leaf.
With ["Arthur", "Bob", "Charlie", "Dave", "Eric", "Charlie"] AS Parentage
match (p:Person {Name: head(Parentage)})
match path = (p)-[:CHILD*]->(leaf:Person{Name: Parentage[-1]})
where length(path)+1 = size(Parentage)
and all(i in range(0, size(Parentage)-1) where nodes(path)[i].Name = Parentage[i])
return leaf
leaf node is required to be a terminal node:
With ["Arthur", "Bob", "Charlie", "Dave", "Eric", "Charlie"] AS Parentage
match (p:Person {Name: head(Parentage)})
match path = (p)-[:CHILD*]->(leaf:Person{Name: Parentage[-1]})
where length(path)+1 = size(Parentage)
and not exists((leaf)-[:CHILD]->(:Person))
and all(i in range(0, size(Parentage)-1) where nodes(path)[i].Name = Parentage[i])
return leaf
12-30-2022 01:48 PM
Bingo! The first cypher query was the solution I was after. Thanks @glilienfield
All the sessions of the conference are now available online