Finding local node by property and shortestPath to it from a list

Hey,

I currently have an application that makes an individual Cypher call for each ID: it first finds a nearby node to path to and then calculates the shortestPath to it. This can be called anywhere from 1 to 100 times a minute and can take nearly half a second to iterate through from the application. I'd like to scale this up to thousands of calls eventually, so I'm looking at ways to optimise it by batching the IDs into a list and calling the query once with all of them.

The current single-call query is shown below. To clarify some of the points: SystemNodes are related to other SystemNodes by ROUTE, and a SystemObjectNode is related to a single SystemNode by IN. The type value 'Orange' that the query filters on is set on a SystemObjectNode.

MATCH (s:SystemNode {system_id: '" + str(start_sys) + "'})
CALL apoc.neighbors.byhop(s, 'IN|ROUTE', 3)
YIELD nodes
UNWIND nodes AS n
WITH n
WHERE n.type = 'Orange'
WITH n AS systemobj
LIMIT 1
MATCH (s1:SystemNode {system_id: '" + str(start_sys) + "'}), path = shortestPath((s1)-[:IN|ROUTE*..3]-(systemobj))
RETURN nodes(path) AS nodes, systemobj.systemobject_id

In the above, " + str(start_sys) + " is replaced by the application with the ID to look up. The return value is the list of nodes to route through and the ID of the target object.

I've got a fair bit of experience with SQL, but I am very new to Cypher and still getting my head around some of the concepts. My attempts so far have been to pass a list of IDs and unwind them. That works until I make the neighbours call, which turns the distinct rows of IDs I want to look up into one full list of all neighbouring nodes. If this were SQL I'd be looking at a subquery or JOIN, but I can't seem to find a pattern in Cypher that would help me achieve what I'm after, if it's possible at all.

My first attempt at using a list (hard-coded for testing) with shortestPath:

UNWIND ['80406', '80626'] AS sysid
MATCH p=shortestPath((s1:SystemNode {system_id: sysid})-[:ROUTE*]-(s2:SystemNode {system_id: '80629'}))
RETURN NODES(p) as nodes

This works as expected: I get two sets of nodes. If I try to lead into this with some of the original code, I just end up with a single merged list of nodes.

Any suggestions or hints in the right direction would be appreciated.

1 ACCEPTED SOLUTION

First, welcome aboard. I think you will find neo4j and cypher a lot of fun. I suggest the next topics you learn are 'call subqueries', 'list comprehension', and 'map projections'.

https://neo4j.com/docs/cypher-manual/current/clauses/call-subquery/

https://neo4j.com/developer/cypher/subqueries/

https://neo4j.com/docs/cypher-manual/current/syntax/lists/

https://neo4j.com/docs/cypher-manual/current/syntax/maps/

I refactored your query using list comprehension to show its use. I was also able to eliminate the second match, since it only fetched the SystemNode you had already found with your first match. Also, you will want to pass system_id as a parameter rather than concatenating it into the query text. This allows the query planner to reuse the same query plan for different values of system_id.

https://neo4j.com/docs/cypher-manual/current/syntax/parameters/

 

MATCH (s:SystemNode {system_id: $system_id})
CALL apoc.neighbors.byhop(s, 'IN|ROUTE', 3) YIELD nodes
WITH s, apoc.coll.flatten(collect(nodes), false) as sNodes
WITH s, [n in sNodes where n.type = 'Orange'][0] as systemobj
MATCH path = shortestPath((s)-[:IN|ROUTE*..3]-(systemobj))
RETURN nodes(path) AS nodes, systemobj.systemobject_id

Assuming I did not make a mistake with the above query, the following will work with a list of 'system_id' values sent as a parameter. You should get a row for each system_id, with the nodes from its shortest path and the systemobject_id of its systemobj node.

UNWIND $system_ids as system_id
MATCH (s:SystemNode {system_id: system_id})
CALL apoc.neighbors.byhop(s, 'IN|ROUTE', 3) YIELD nodes
WITH s, apoc.coll.flatten(collect(nodes), false) as sNodes
WITH s, [n in sNodes where n.type = 'Orange'][0] as systemobj
MATCH path = shortestPath((s)-[:IN|ROUTE*..3]-(systemobj))
RETURN s.system_id as system_id, nodes(path) AS nodes, systemobj.systemobject_id
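
On the application side, the batched call could look roughly like the sketch below. It is only an illustration under some assumptions: the application is Python (as the str(start_sys) in the original query suggests), system_id is stored as a string, and the names lookup_routes, chunked, run_query and batch_size are made up for this example and are not part of any Neo4j API. run_query stands for whatever function your application uses to send a query and a parameter map to the database.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

# The batched query from this answer. The text is a constant, so it never
# changes between calls; only the $system_ids parameter varies.
BATCH_QUERY = """
UNWIND $system_ids AS system_id
MATCH (s:SystemNode {system_id: system_id})
CALL apoc.neighbors.byhop(s, 'IN|ROUTE', 3) YIELD nodes
WITH s, apoc.coll.flatten(collect(nodes), false) AS sNodes
WITH s, [n IN sNodes WHERE n.type = 'Orange'][0] AS systemobj
MATCH path = shortestPath((s)-[:IN|ROUTE*..3]-(systemobj))
RETURN s.system_id AS system_id, nodes(path) AS nodes, systemobj.systemobject_id
"""

Row = Dict[str, Any]
# Hypothetical signature: takes query text and a parameter map, returns rows.
RunQuery = Callable[[str, Dict[str, Any]], List[Row]]


def chunked(ids: Iterable[Any], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` IDs, converted to strings."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for item in ids:
        # Assumption: system_id is stored as a string in the graph.
        batch.append(str(item))
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def lookup_routes(run_query: RunQuery, system_ids: Iterable[Any],
                  batch_size: int = 500) -> List[Row]:
    """Send one query per batch of IDs instead of one query per ID."""
    rows: List[Row] = []
    for batch in chunked(system_ids, batch_size):
        # The IDs travel as a parameter and are never spliced into the text.
        rows.extend(run_query(BATCH_QUERY, {"system_ids": batch}))
    return rows
```

With the official Python driver, run_query could be a small wrapper that calls session.run(query, params) and converts each record with record.data(). The batch size of 500 is an arbitrary starting point, not a measured value, so it is worth timing a few sizes against your own data.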

That's great, thank you. Your suggested update worked first time and has definitely helped me get my head around some of the syntax issues I've been hitting.

I've been making updates to it and have now spun out an alternative version for other functions, which I use just for routing between SystemNodes that are related by ROUTE (I dropped the neighbors call and the IN relationship). Are there any good documents on how Cypher evaluates indexes on relationships? I have indexes on the system_id and systemobject_id properties and I can see them being used in my searches, but my index on the ROUTE relationship doesn't appear to be invoked. From the docs and the blog post on it, I can only imagine it's being ignored because my relationships are fairly straightforward and the same relationship type is shared across all SystemNodes.

... that said, the search is returning in 5-15ms so I'm not sure I'm going to get much more performance out of it!