cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Trouble excluding subtrees and branches

Hi!, at work we are trying to implement Neo4J for the first time so we are beginers with graph databases.

We are having a hard time understanding how to exclude subtrees from an initial tree. Below is an image of an example of our graph, it is a simple parent-child tree with a "level" atribute on each node.

Screen Shot 2022-08-30 at 18.22.19.png

The problem comes when we want to apply our first business rule: "when you find a child node where the level is greater or equals than his father, then the child must be excluded together with all his descendants. All the relations and nodes of the subtree must not be part of the final result".

We tried with:

match (p:Person {id:"A" })<-[r:CHILD_OF*]-(c:Person)
where c.level < p.level
return c,p;

But unfortunately (as expected) we are getting undesired nodes and relations because we can not traverse and exclude the complete subtrees with its branches:

Screen Shot 2022-08-30 at 18.06.53.png

What we are trying to achieve is this result:

Screen Shot 2022-08-30 at 18.07.52.png

Any help would be greatly appreciated, thank you.

 

1 ACCEPTED SOLUTION

Well, yesterday after consulting a Graph Database expert, the query to answer this requirement was:

MATCH (p:Person {name:"A" })<-[ :CHILD_OF*]-(c:Person)
WHERE c.level >= p.level
WITH p,  collect(c) as allChildren 
MATCH (p)<-[ :CHILD_OF*]-(child:Person) 
WHERE child.level < p.level AND ALL(c in allChildren WHERE NOT EXISTS( (c)<-[:CHILD_OF*]-(child)))
RETURN p, child

 

View solution in original post

5 REPLIES 5

Cypher is very good at find paths based on patterns. It is not very good at traversal algorithms. This would be a trivial to implement in a custom procedure. If you don't know, custom procedures (and functions) are written using the Neo4j Java API. They are compiled into a jar and deployed on the neo4j server in the server's plugin folder. If you have access to your server and this interested you and you would like help getting started, I could provided a working solution that you could use as a starting point. 

https://neo4j.com/developer/java-procedures/

https://neo4j.com/docs/java-reference/current/#procedures

https://neo4j.com/docs/java-reference/current/extending-neo4j/procedures/

You can try something in cypher; although, the efficiency may be an issue because the algorithm would traverse some nodes multiple times.  Anyway, the following query should return a list of nodes that makes up the subgraph that meets your criteria.

match path=(p:Person {id:"A"})<-[r:CHILD_OF*]-(c:Person)
with nodes(path) as nodes
with nodes, range(0, size(nodes)-2) as indexes
where all(i in indexes where nodes[i].level > nodes[i+1].level)
unwind nodes as node
return distinct node as prunedSubgraphNodes

Sorry, I didn't have test data to test it. 

Hi glilienfield! and thank you for your answer!.

I tested your query and I'm getting a truncated result, the subtree hanging from the "F" node is missing:

Screen Shot 2022-08-31 at 12.32.37.png

We are getting a visit from a neo4j worker in two weeks, but I think that It will be a commercial presentation (advantage and disadvantage of the product, case studies, sucess stories, etc.) not someone with technical knowledge. 

We know how to generate and export the data from our relational database, model it and load it to Neo, but we don't know how to answer business needs and questions with it.

According to the requirements, the descendant nodes of 'F' do not meet the requirements of node 'F's level being strictly greater than the level of the child node. Node 'F' has a level of 1, while nodes 'I', 'J', and 'K' have level values 1, 2, 1, respectively. 

Did I use the correct condition? 

Writing the queries to answer business questions just takes a little effort learning cypher. It makes a lot of sense once you grasp the concepts. I found it more intuitive than SQL. If you data naturally can be modeled as a graph, then a property graph database will be a much better solution than a relational database. 

Don't overlook the power of writing custom procedures. It will allow you to efficiently implement traversal algorithms. 

ameyasoft
Graph Maven

Node 'B' and 'J' have Level 2 and node 'I' is Level 1. 'I', 'J' and 'K' are children of 'F'. Here 'I' and 'K'  have Level 1. It would be nice if you can explain the logic behind Level numbers. Thanks

Well, yesterday after consulting a Graph Database expert, the query to answer this requirement was:

MATCH (p:Person {name:"A" })<-[ :CHILD_OF*]-(c:Person)
WHERE c.level >= p.level
WITH p,  collect(c) as allChildren 
MATCH (p)<-[ :CHILD_OF*]-(child:Person) 
WHERE child.level < p.level AND ALL(c in allChildren WHERE NOT EXISTS( (c)<-[:CHILD_OF*]-(child)))
RETURN p, child

 

Nodes 2022
Nodes
NODES 2022, Neo4j Online Education Summit

All the sessions of the conference are now available online