cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Multiple hops query

Hi everyone,

I have the following situation (as shown in the image below), where I have to return always the latest tagged Person node on the path while having to traverse an undefined number of hops in the HAS_PARENT relationship type.

UseCaseUseCase

Anyone any idea on a performant query that would satisfy all the cases? Thanks!

1 ACCEPTED SOLUTION

Here is a solution:

MATCH (family:Family)<-[:BELONGS_TO]-(person) 
CALL apoc.path.expandConfig(person, {
    relationshipFilter: "HAS_PARENT", 
    labelFilter: "+Person", 
    filterStartNode: true, 
    uniqueness: "RELATIONSHIP_PATH", 
    bfs: false
}) 
YIELD path 
WHERE last(nodes(path)).tagged = true 
WITH family, nodes(path) AS nodes 
WITH family, apoc.coll.indexOf(nodes, last(nodes)) AS index, last(nodes) AS node 
WITH family, apoc.coll.sortMaps(collect({index: index, node: node}), "index")[0] AS result 
RETURN family.id AS family, result.node.id AS node

Regards,
Cobra

View solution in original post

18 REPLIES 18

Cobra
Ninja
Ninja

Hello @busymo16 🙂

This query should do what you asked (you may have to change some config parameters or the WHERE clause)

 

MATCH (family:Family) 
CALL apoc.path.expandConfig(family, {
    relationshipFilter: "<BELONGS_TO|HAS_PARENT>", 
    labelFilter: "+Person", 
    filterStartNode: false, 
    uniqueness: "RELATIONSHIP_PATH", 
    bfs: false
}) 
YIELD path 
WHERE last(nodes(path))).tagged = true 
WITH family, path, length(path) AS hops 
ORDER BY hops DESC LIMIT 1 
RETURN family, path, hops

 

Regards,
Cobra

Hi @Cobra 

Thanks for your answer. Actually, there might be another thing important to mention. It exists also the possibility that

(family: Family)-[:HAS_PARENT]->(parentFamily: Family)

so in this case I would also get the results for parentFamily, right? Also, I am not interested in the path of it but in the node itself, as I use it later on in a subsequent query. Is there a way to set the limit: * (just as an unknown depth)? 

Regards,

Busymo

Hi @busymo16,

It will not get info from the parentFamily due to Family not being whitefiltered.

Please consider that this approach will first open the entire tree and just after retrieving all of the, it will prune non valid paths. 

You have two options, you can use the property tagged as a Label, so then you can add it to the labelFilter as a Terminator. Or, if you don't want or can't modify your model, you may need to check the traversal api framework. 

Bennu

 

Oh, y’all wanted a twist, ey?

Hi @bennu_neo,

Thanks for the explanation. The point is that the model cannot change at the moment speaking but also what I think happens, in this case, is that even the shorter paths are considered valid paths. The only issue is that I am interested in the last-tagged person and not the entire path above him. So basically, I only am interested in the last Person node that is tagged, so considering the depth-first search, it would be:

go until the last Person node in the path,
check if it is tagged
         if yes, return it and break
         if not, continue up the tree until (line 3) is the case.
         if (line 3) is not possible, go to the next path
  1.  

Hi @busymo16 !

I got your point. This fairly more complex than expected. I'll think about it.

B

Oh, y’all wanted a twist, ey?

Hi @Cobra,

So basically your suggested query returns the family and also as a path is returning the family as well. I extended it to this one:

 

MATCH (family:Family)<-[:BELONGS_TO]-(person:Person)
CALL apoc.path.expandConfig(person, {
    relationshipFilter: "<HAS_PARENT>", 
    labelFilter: "+Person", 
    filterStartNode: true, 
    uniqueness: "RELATIONSHIP_PATH", 
    bfs: false
}) 
YIELD path
WHERE last(nodes(path)).tagged = true
RETURN last(nodes(path))

 

which works fine for CASE III, but not for CASE I and II. The reason for it is that I get also the upper tagged Person nodes, where I am interested only in the last tagged one. Would filtering on uniqueness: "RELATIONSHIP_PATH|LABEL" satisfy CASE I and II as well? Thanks!

uniqueness setting only accept one value so it's not possible. Can you provide some Cypher queries to create your example please?

Hi @Cobra 
Sure.  The following query creates some fake data with the model.

 

MERGE (p:Person{id:5})
MERGE (p1:Person{id: 6})
MERGE (p2:Person{id: 7})
MERGE (p3:Person{id: 8})
MERGE (p4:Person{id: 9})
MERGE (p5:Person{id: 10})
MERGE (p6:Person{id: 11})
MERGE (p7:Person{id: 12})
MERGE (p8:Person{id: 13})
SET p.tagged = true,
p1.tagged = false,
p2.tagged = true,
p3.tagged = false,
p4.tagged = false,
p5.tagged = true,
p6.tagged = false,
p7.tagged = true,
p8.tagged = false
MERGE (p)-[:BELONGS_TO]->(f:Family{id:1})
MERGE (p1)-[:HAS_PARENT]->(p)
MERGE (p2)-[:HAS_PARENT]->(p1)
MERGE (p3)-[:BELONGS_TO]->(f1:Family{id:2})
MERGE (p4)-[:HAS_PARENT]->(p3)
MERGE (p5)-[:HAS_PARENT]->(p4)
MERGE (p6)-[:BELONGS_TO]->(f2:Family{id:3})
MERGE (p7)-[:BELONGS_TO]->(f3:Family{id:4})
MERGE (p8)-[:HAS_PARENT]->(p7)

 

If you then see the model it will look like this: 

Screenshot 2022-11-09 at 16.09.38.pngScreenshot 2022-11-09 at 16.11.10.png

 

 

 

 

 

 

 

 

While executing the query below: 

 

MATCH (family:Family)<-[:BELONGS_TO]-(person:Person)
CALL apoc.path.expandConfig(person, {
    relationshipFilter: "<HAS_PARENT>", 
    labelFilter: "+Person", 
    filterStartNode: true, 
    uniqueness: "RELATIONSHIP_PATH", 
    bfs: false
}) 
YIELD path
WHERE last(nodes(path)).tagged = true
RETURN last(nodes(path)).id

 

The result is nodes with id:  5, 7, 10, 12

The expected result would be 7, 10, 12

 

Here is a solution:

MATCH (family:Family)<-[:BELONGS_TO]-(person) 
CALL apoc.path.expandConfig(person, {
    relationshipFilter: "HAS_PARENT", 
    labelFilter: "+Person", 
    filterStartNode: true, 
    uniqueness: "RELATIONSHIP_PATH", 
    bfs: false
}) 
YIELD path 
WHERE last(nodes(path)).tagged = true 
WITH family, nodes(path) AS nodes 
WITH family, apoc.coll.indexOf(nodes, last(nodes)) AS index, last(nodes) AS node 
WITH family, apoc.coll.sortMaps(collect({index: index, node: node}), "index")[0] AS result 
RETURN family.id AS family, result.node.id AS node

Regards,
Cobra

Hi @Cobra,

Thanks for your answer. Logically this is the expected behavior but it is quite not performant for bigger use cases. I will have to use it in a graph that is going over through 100mil nodes more or less. I have also part of the query before this logic, and parts of it after the query. Any idea on speeding it up?

Thanks,

As said by @bennu_neo, you should change a bit your data model. For now, I don't have other ideas to speed-up the query ☹️

@busymo16 I did some testing with the new k-hop optimizations in v5. There are some requirements for it to kick in (hence the 0..999 and the distinct in my example), but this seems to work, if I'm understanding your requirements correctly:

match (f:Family)<-[:BELONGS_TO]-(:Person)-[:HAS_PARENT*0..999]->(p:Person {tagged:true})
return distinct p

@steggy Not quite what I would like to achieve according to the requirements. If you would create the graph with the above (also in this comment link: https://community.neo4j.com/t5/neo4j-graph-platform/multiple-hops-query/m-p/61907/highlight/true#M36...), the expected result would be 7, 10, 12 as ids to be returned. Meanwhile the query you suggest also in v.5.1 returns only 5, and 12 which is not what I expect.

I tried with neo4j:5.1.0-enterprise, as per the official docker image. Did you tune/add any parameters in the configuration?

@busymo16 Yes, you're right. I'm missing some nuance of the requirements that I cannot see no matter how many times I read the thread. I've been testing with your data - what am I missing that is eliminating node 7 from the results?

@steggy No worries. Yes, eliminating nodes 7 and 10 and adding 5 instead. So the result to be correct with the requirements should be  7, 10, 12.

Hi @busymo16 ,

Am I missing something? I thought the answer for your example was only 7 and  10. 5 is not the last one on a possible path.

Oh, y’all wanted a twist, ey?

Hi @bennu_neo

Sorry my mistake in fast typing, it should be: 7, 10, 12

Hi @busymo16 ,

Do you trust me? If you do, please put this jar (https://drive.google.com/drive/folders/1hC2PSUjvmcrdTfKcRSz15WnT4vG8DuRE?usp=sharing) on the plugin folder and execute the query as follows:

MATCH (f:Family)<-[BELONGS_TO]-(n) 
WITH f, collect(n) as nn
CALL bennu.test(nn) yield node
return f, node

  

Oh, y’all wanted a twist, ey?