Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
11-05-2020 11:19 AM
For example
MATCH path = shortestPath((s {
http://dbpedia.org/ontology/uri:"http://identifiers.org/uniprot/P0DTC2-PRO_0000449649"})-[*]-(humanprot)) WHERE humanprot.
http://dbpedia.org/ontology/uri IN ["http://identifiers.org/hgnc/15022"] RETURN path
finishes in about a second, while
MATCH (s {
http://dbpedia.org/ontology/uri:"http://identifiers.org/uniprot/P0DTC2-PRO_0000449647"})-[*]-(humanprot) WHERE humanprot.
http://dbpedia.org/ontology/uri IN ["http://identifiers.org/hgnc/15022"] RETURN * LIMIT 1
runs for ages and is still running while I am writing this.
Why if I remove the shortestPath() function from the query and put there LIMIT 1 instead, it becomes suddenly so much more complicated? Intuitively, it seems that finding any path should be faster than finding shortest path.
11-06-2020 02:28 PM
Actually neither of these is guaranteed, and I'd put it up to luck that one ran faster than another.
You're not using labels here, which is the major problem, so this is starting with an AllNodesScan, and in one of the queries it just happened to be the first to stumble upon a node that was the start of one of these paths, and (for shortest path) happened to find first via the cartesian product (another all nodes scan) another node for which the path exists.
You need to use labels in your patterns, and you need to create indexes or unique constraints to support even faster lookup.
All the sessions of the conference are now available online