Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
01-05-2021 10:40 AM
Hey all, I'm running up against an unexpectedly slow Cypher query, and I've looked into several other posts but I'm still not sure what to do about my graph to make it more performant.
In my query, I'm trying to find "unreserved" siblings of the origin
node. Essentially, I want to walk up a :CHILD
relationship to the (one) parent node, then find all of that parent's children that are not reserved
.
My data represents the structure of a document, which has nested levels:
The data model is very simple:
(:Level)-[:CHILD]->(:Level)
And here's the query that's giving me trouble:
MATCH (origin { type: 'part', identifier: '182' })<-[:CHILD]-
(parent)-[:CHILD]->(siblings)
WHERE siblings.reserved = false
RETURN siblings
When siblings.reserved = false
is in the query, it takes > 5 seconds, even though there are only 12 nodes that need to be checked for whether or not they are reserved
. When set to true
, or when the WHERE
clause is removed, the query takes about 20ms.
My understanding from this other post is that the issue is having to scan all possible nodes to negate them.
There there are only 12 siblings that could match this relationship, given an origin node representing of Part 182. However, the query planner does the siblings.reserved
part of the query first, so it's checking ~52,000 nodes.
My question, then, is: what do I need to do to bound the query such that the 12 possible siblings are the only ones who are checked for reserved = false
?
And/or is there another way to make this lookup significantly more performant, with respect to the boolean attribute?
01-05-2021 11:08 AM
Happy to report I may have solved the issue!
To limit the siblings that get searched for reserved = false
, I set up a subquery in the CALL
block.
CALL {
MATCH (origin { type: 'part', identifier: '182' })<-[:CHILD]-(parent)-[:CHILD]->(siblings)
RETURN siblings
}
WITH siblings
WHERE siblings.reserved = false
RETURN siblings
The query planner shows this searching about ~475 nodes for the boolean attribute, instead of ~50,000.
01-05-2021 11:11 AM
[added]. Your solution is almost certainly better than mine. Very interesting.
[original suggestion]
One educated guess/suggestion I have, is to change the state of the property .reserved
Instead of .reserved being true
or false
, make the range of values to be true
and null
(doesn't exist) for false
Then do the test siblings.reserved is null
instead of siblings.reserved = false
My guess is that Neo4J needs to see if the property exists and if it exists, then extract the value if does (true or false). By making it null, then you only have to test if the property exists, which should be faster (but I'm not sure by how much.). You can also make the more likely property (true or false) to be the nullable value.
All the sessions of the conference are now available online