11-24-2021 05:41 AM
Hello!
I feel like I'm encountering some unintuitive behavior that I'm trying to explain.
My database has about 50k nodes and 200k relationships. I have two queries with wildly different performance characteristics.
Query 1:
MATCH p=(x:Element {name: "Target"})<-[:Has|Belongs*]-(y) RETURN y
This completes with 5,200 total db hits in 65 ms.
Query 2:
MATCH p=(x:Element {name: "Target"})<-[:Has|Belongs*]-(y:Node) RETURN y
This completes with 8,813,850 db hits in 32,396 ms.
I would have expected the second query to take less time, since the set of source nodes is restricted to a specific label. Am I missing something?
11-24-2021 06:11 AM
Hi @rookuu !
Can you share the Explain of each query?
The second one certainly adds a Filter on the y nodes.
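For reference, a plan can be obtained by prefixing a query with EXPLAIN (shows the plan without running the query) or PROFILE (runs the query and reports rows and db hits per operator). A minimal sketch using the second query from the original post, with each statement run separately:

```cypher
// EXPLAIN: returns the plan only, the query is not executed
EXPLAIN
MATCH p=(x:Element {name: "Target"})<-[:Has|Belongs*]-(y:Node)
RETURN y;

// PROFILE: executes the query and reports db hits per operator
PROFILE
MATCH p=(x:Element {name: "Target"})<-[:Has|Belongs*]-(y:Node)
RETURN y;
```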
Bennu
11-24-2021 06:54 AM
Query 1:
NodeIndexSeek x:Element(name) WHERE name = $
VarLengthExpand (x)<-[anon_0:Has|Belongs*]-(y)
ProduceResults y
Query 2:
NodeIndexSeek x:Element(name) WHERE name = $ && NodeByLabelScan y:Node
CartesianProduct x, y
VarLengthExpand (x)<-[anon_0:Has|Belongs*]-(y)
ProduceResults y
Apologies for the notation. The difference is that Query 2 runs NodeByLabelScan y:Node at the same time as NodeIndexSeek x:Element(name), then feeds both into CartesianProduct.
Profiling both queries tells me that it's the VarLengthExpand that differs wildly in db hits, from about 4.5k to 9 million.
11-25-2021 02:07 AM
Hi @rookuu !
Clearly the problem is that the query planner is using a NodeByLabelScan plus a CartesianProduct instead of expanding from x and filtering on y afterwards. Which version of Neo4j are you using?
Can you try:
MATCH (x:Element {name: "Target"})
WITH x
MATCH p=(x)<-[:Has|Belongs*]-(y:Node)
RETURN y
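Another variant that may be worth trying, as an untested sketch: leave y unlabeled in the pattern and apply the label check after the expansion, so the filter is stated explicitly. Whether the planner keeps this shape depends on the Neo4j version and its statistics, so treat it as an assumption to verify with PROFILE, not a guaranteed fix.

```cypher
PROFILE
MATCH (x:Element {name: "Target"})<-[:Has|Belongs*]-(y)
WITH y
WHERE y:Node   // label check applied to the nodes reached by the expansion
RETURN y
```

If the resulting plan shows a VarLengthExpand from x followed by a Filter on y, with no CartesianProduct, the db hits should be comparable to Query 1.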
Bennu
PS: Next time, a screenshot of the query plan could be easier for both of us.