Does it matter if you put your `WHERE` predicates next to your `MATCH` clauses?

The other day a colleague asked me whether this is a reasonable Cypher query:

MATCH(p:Person {id:$person_id})
MATCH(doc:Document)
MATCH(p)-[:READ]->(doc)
WHERE doc.id IN $resource_ids
RETURN collect(doc.id) AS read_docs;

Now, to me this smells weird, since the WHERE and the MATCH clause that introduces doc are separated by another pattern MATCH. I'm basing that on this quote from the docs:


"MATCH is often coupled to a WHERE part which adds restrictions, or predicates, to the MATCH patterns, making them more specific. The predicates are part of the pattern description, and should not be considered a filter applied only after the matching is done. This means that WHERE should always be put together with the MATCH clause it belongs to."

As a good engineer, I set up a toy example, half expecting the query to produce a subpar plan, but the profile actually showed a very reasonable plan: "Look up p by id, expand its :READ relationships, and filter the resulting doc nodes by id".

So what gives? Does it actually matter where you put your WHERE clauses? Does it only matter sometimes? Or is it more that the guarantee of a good query plan gets weaker as you make your initial query less "strict"?
NB: The query I would have gone with is:


MATCH(p:Person {id:$person_id})-[:READ]->(doc)
WHERE doc.id IN $resource_ids
RETURN collect(doc.id) AS read_docs;
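
To check whether the two formulations really compile to the same plan on a given graph, each can be prefixed with PROFILE and the operator trees and db hits compared. This is only a sketch: the parameter values in the comments are placeholders, not values from the toy example above, and the result may differ with other data, indexes, or Neo4j versions.

```cypher
// Hypothetical parameters, set beforehand in Neo4j Browser or cypher-shell:
//   :param person_id => 1
//   :param resource_ids => [10, 11, 12]

// Variant 1: three separate MATCH clauses, with WHERE after the last one.
PROFILE
MATCH (p:Person {id: $person_id})
MATCH (doc:Document)
MATCH (p)-[:READ]->(doc)
WHERE doc.id IN $resource_ids
RETURN collect(doc.id) AS read_docs;

// Variant 2: a single pattern with WHERE directly attached.
// Note that doc carries no :Document label here, so the two variants
// are only equivalent if :READ always points at Document nodes.
PROFILE
MATCH (p:Person {id: $person_id})-[:READ]->(doc)
WHERE doc.id IN $resource_ids
RETURN collect(doc.id) AS read_docs;
```

If both plans start from a lookup of p, expand :READ, and then filter doc, the planner has treated the predicates the same way regardless of where they were written.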

 


glilienfield
Ninja

I don’t think it matters. Look at the EXPLAIN plan for these queries. You will notice that the planner uses whichever portion of the WHERE clause it needs as it evaluates a set of MATCH clauses.

The query plan does look correct, but my theory is that it depends on the graph, and that not grouping `MATCH` and `WHERE` is more likely to result in a sub-optimal plan. If not, then why would it be in the documentation?
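
One case where the placement changes the result itself, not just the plan, is OPTIONAL MATCH. This may or may not be what the documentation had in mind, so the following is only an illustration: it reuses the labels and parameters from the question and adds the `person` column purely for demonstration.

```cypher
// WHERE attached to OPTIONAL MATCH: the predicate is part of the pattern.
// If the person has read none of the listed documents, doc is null and the
// row is kept, so this returns one row with an empty read_docs list.
MATCH (p:Person {id: $person_id})
OPTIONAL MATCH (p)-[:READ]->(doc:Document)
WHERE doc.id IN $resource_ids
RETURN p.id AS person, collect(doc.id) AS read_docs;

// WHERE moved behind a WITH: the predicate is now a filter on finished rows.
// Rows where doc is null or not in the list are discarded, so the same
// person yields no rows at all instead of an empty list.
MATCH (p:Person {id: $person_id})
OPTIONAL MATCH (p)-[:READ]->(doc:Document)
WITH p, doc
WHERE doc.id IN $resource_ids
RETURN p.id AS person, collect(doc.id) AS read_docs;
```

For plain MATCH clauses like the ones in the question, both readings give the same rows, which is consistent with the planner being free to apply the predicate wherever it is cheapest.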