cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Yen's Shortest Path throws error when applied to projection without parallel edges

Jammer
Node Link

I'm trying to get the top x unique shortest node paths. If I use a simple projection:

CALL gds.graph.project("graphSingleType", "*", {TYPE: {type: "*"}})

Yen's will return the same node path multiple times having taken different parallel edges between the nodes.

So I make a graph projection without parallel edges:

CALL gds.graph.project("graphSingleType_NP", "*", {TYPE:{type:'*', aggregation:"SINGLE"}})

The problem is that Yen's then only works when k=1 (ie when Dijkstra's is used). For all k>1, The following error is thrown:

Failed to invoke procedure `gds.shortestPath.yens.stream`: Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0

 I found the issue in GDS version 2.1 so I updated to 2.3 but the problem is still there. 

Am I not understanding something or is this an actual bug? Alternatively, is there another way I can get the top x unique node paths?

 

To reproduce: 

CREATE (a:CITY),
		(b:CITY),
		(c:CITY),
		(d:CITY),
		(e:CITY),
		(f:CITY),
		(a)-[:ROAD]->(b),
		(a)-[:ROAD]->(b),
		(b)-[:ROAD]->(c),
		(b)-[:ROAD]->(d),
		(c)-[:ROAD]->(f),
		(d)-[:ROAD]->(e),
		(e)-[:ROAD]->(c),
		(e)-[:ROAD]->(f),
		(a)-[:PATH]->(b),
		(d)-[:PATH]->(e),
		(d)-[:PATH]->(e)
MATCH (source), (target) WHERE id(source)=0 AND id(target)=5
CALL gds.shortestPath.yens.stream("graphSingleType_NP",
                              {sourceNode:source, 
                               targetNode:target,
                               k:3})
4 REPLIES 4

Jammer
Node Link

I found a work around: use a cypher projection rather than a native projection. 

From the example above, I get the following graph:

2023-02-03_12h55_07.png

I then create my graph projection, "graphSingleNodeType_NP", a single graph without parallel edges.

CALL gds.graph.project.cypher(
  'graphSingleType_NP',
  'MATCH (n) RETURN id(n) AS id',
  'MATCH (n)-[r]->(m) RETURN DISTINCT id(n) AS source, id(m) AS target'
)

The projection removes all parallel edges, in the same way the native projection did.

I then run Yen's algorithm.

MATCH (source), (target) WHERE id(source)=0 AND id(target)=5
CALL gds.shortestPath.yens.stream("graphSingleType_NP",
                              {sourceNode:source, 
                               targetNode:target,
                               k:10}) YIELD nodeIds RETURN nodeIds

 And get the desired results: in this case the top 10 unique node paths. However, since no more than 3 unique node paths exist between nodes 0 and 5, only 3 paths are returned.

2023-02-03_12h59_40.png

 

It's probably worth mentioning that I found Native projections to be notably faster than Cypher Projections when used on larger Neo4j databases. 

Also, I found this reference to be very helpful in understanding GDS graph projections  https://tbgraph.wordpress.com/2020/03/30/analyzing-multigraphs-in-neo4j-graph-data-science-library for 

Thank you for the nice write up and reproducible example. This looks like a bug to me, one that is related to the parallel aggregation.  I will go ahead and report it.  

For future reference, we consider most  java.lang.* exceptions to be bugs. Feel free to report them here: https://github.com/neo4j/graph-data-science/issues

You are correct that native projections are significantly faster for large graphs than Cypher projections.

Hi @Jammer ,

Thank you for bringing this issue to our attention. We have identified the issue, we are working on a fix. We  will let you know when a patch with the fix is released.

Best regards,

Ioannis.

Jammer
Node Link

What I have actually ended up doing:

So for my real database, a cypher projection is too slow (I gave up after 40 minutes). However, GDS 2.3 has added a handy function "toUndirected". Since I want the option of finding the shortest directed or undirected paths, this function handles half the battle for me. For now, I'll just use Dijkstra's for the directed option.

#1) Make a single graph projection with one edge type (one rel type, no parallel edges)

CALL gds.graph.project("graphSingleType_NP", "*", {TYPE:{type:'*', aggregation:"SINGLE"}})

#2) Add and "undirected" edge type to the projection

CALL gds.beta.graph.relationships.toUndirected(
  'graphSingleType_NP',
  {relationshipType: 'TYPE', mutateRelationshipType: 'TYPE_UD'}
)
YIELD inputRelationships, relationshipsWritten
RETURN inputRelationships, relationshipsWritten

#3) Calculating Yen's on the undirected edge type works

MATCH (source), (target) WHERE id(source)=31 AND id(target)=26
CALL gds.shortestPath.yens.stream("graphSingleType_NP",
							  {sourceNode:source, 
							   targetNode:target,
							   k:3,
							   relationshipTypes:['TYPE_UD']
							}) YIELD nodeIds RETURN nodeIds
							
#4) Calculating Yen's on the original type still fails:

MATCH (source), (target) WHERE id(source)=31 AND id(target)=26
CALL gds.shortestPath.yens.stream("graphSingleType_NP",
							  {sourceNode:source, 
							   targetNode:target,
							   k:3,
							   relationshipTypes:['TYPE']
							}) YIELD nodeIds RETURN nodeIds

 

And thanks for the responses. Nice knowing this will probably be fixed in the future. Cheers!