Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
04-14-2020 02:47 PM
Hi, there.
I am trying to use the newest Game of Thrones dataset from the sandbox graph data science to get the shortest path only to season 3 nodes and relationships. I tried:
match (p:Person)-[r:INTERACTS]->(q:Person) where r.book=3 with p as a, q as b
match pa = shortestPath((a)-[:INTERACTS*]-(b)) where id(a) > id(b)
return length(pa) as len, extract(x IN nodes(pa) | x.name) as path order by len desc limit 5
I first filter the nodes and relationships to season 3 only but 'a' and 'b' are closely connected, so the result lengths are all 1. I wonder if there is a way to finish this task. I also try filter r in shortestPath but it took too long and the sandbox failed to execute.
Solved! Go to Solution.
04-14-2020 04:09 PM
You're kind of close, you just need to ensure you generate a cross product for all pairs of characters appearing in season 3 (which we can do be collecting every person with a season 3 appearance then UNWINDing that list twice to get the cross product, then do the id filter to deduplicate), then you can use shortestPath(), ensuring that all relationships in the pattern must be from season 3.
explain
MATCH (p:Person)-[r:INTERACTS]-()
WHERE r.book = 3
WITH collect(DISTINCT p) as book3Ppl
UNWIND book3Ppl as personA
UNWIND book3Ppl as personB
WITH personA, personB
WHERE id(personA) < id(personB)
MATCH path = shortestPath((personA)-[:INTERACTS*]-(personB))
WHERE all(rel in relationships(path) WHERE rel.book = 3)
RETURN length(path) as len, [x in nodes(path) | x.name] as path
ORDER BY len DESC
LIMIT 5
04-14-2020 04:09 PM
You're kind of close, you just need to ensure you generate a cross product for all pairs of characters appearing in season 3 (which we can do be collecting every person with a season 3 appearance then UNWINDing that list twice to get the cross product, then do the id filter to deduplicate), then you can use shortestPath(), ensuring that all relationships in the pattern must be from season 3.
explain
MATCH (p:Person)-[r:INTERACTS]-()
WHERE r.book = 3
WITH collect(DISTINCT p) as book3Ppl
UNWIND book3Ppl as personA
UNWIND book3Ppl as personB
WITH personA, personB
WHERE id(personA) < id(personB)
MATCH path = shortestPath((personA)-[:INTERACTS*]-(personB))
WHERE all(rel in relationships(path) WHERE rel.book = 3)
RETURN length(path) as len, [x in nodes(path) | x.name] as path
ORDER BY len DESC
LIMIT 5
04-14-2020 04:33 PM
Thank you, Andrew. This is a really good solution.
I tried:
match (a:Person), (b:Person) where id(a) > id(b)
match p = shortestPath((a)-[:INTERACTS*]-(b)) where all(r in relationships(p) where r.book = 3)
return length(p) as len, extract(x IN nodes(p) | x.name) as path order by len desc limit 5
this code can run the result, but since it doesn't filter the node first, it took 56000 ms compared to yours 12000 ms. However, I still wonder if there is a better way to filter the relationship by r.book = 3 only once. Thanks again.
04-14-2020 05:45 PM
You could create a fulltext schema index on :INTERACTS relationships by their book
property, that would allow you to use the fulltext schema query proc to do an index lookup on the relationships, that may be a faster way to initially retrieve the persons who interacted in book 3.
Or if you wanted you could run a similar MATCH and add labels to nodes based upon their appearance in the books, like :IN_BOOK_3 and so on for the other books.
Once those labels are in place, then to find your book 3 characters just MATCH to them using that label.
04-14-2020 05:57 PM
Thank you, I will try them.
All the sessions of the conference are now available online