‎12-12-2019 07:16 PM
Hi all,
I am wondering about the performance of the following string comparison in where clause:
Case 1:
MATCH (p: Person)
WHERE NOT p.name='Person 1'
AND NOT p.name='Person 2'
RETURN p
Case 2:
MATCH (p: Person)
WHERE NOT (p.name='Person 1' OR p.name='Person 2')
RETURN p
Case 3:
MATCH (p: Person)
WHERE NOT p.name IN ['Person 1', 'Person 2']
RETURN p
I have tried on my database but the result is not united, sometimes case 3 gets be best response time and db hits, sometimes the best is case 1 and case 2.
Can you please make this clear for me?
Solved! Go to Solution.
- Labels:
-
Cypher
‎12-13-2019 08:27 AM
Hi @quan.nguyen,
Have you tried using the keyword EXPLAIN
and PROFILE
on before your query?
-
EXPLAIN
will not run the query, but show you the predicted cost of the query -
PROFILE
will run the query and show the cost of the query
In the your case, the Case 1 & Case 2 should perform with around the same cost. Basically it would scan all node label Person two times, one time for 'Person 1' the second time for 'Person 2'.
In Case 3 it will only scan all the node label Person once, checking both string 'Person 1' and 'Person 2' in one go.
The query Case 3 with the IN
keyword would perform even better in comparison with the other two when you actually have more strings to compare (e.g p.name IN ['Person 1', 'Person 2, 'Person 3']
Hope this helps you with your decision!
‎12-12-2019 10:15 PM
Hi @quan.nguyen
in this case i have one suggestion for you for the best performance .
if you want to searchinf for node on the basis of name then you should create index on this property
create index on :node(name)
this would resolve all your issues regarding db hits
please let me know if any other info required
Cheers
‎12-12-2019 10:55 PM
Thank you for your reply, @12kunal34
My case is more complicated and this is my example for the comparison. I have to consider this because my the property, that I need to filter out, is rarely used and it's not so selective (it has less then 10 different values).
Do you have any suggestions for this?
‎12-13-2019 08:27 AM
Hi @quan.nguyen,
Have you tried using the keyword EXPLAIN
and PROFILE
on before your query?
-
EXPLAIN
will not run the query, but show you the predicted cost of the query -
PROFILE
will run the query and show the cost of the query
In the your case, the Case 1 & Case 2 should perform with around the same cost. Basically it would scan all node label Person two times, one time for 'Person 1' the second time for 'Person 2'.
In Case 3 it will only scan all the node label Person once, checking both string 'Person 1' and 'Person 2' in one go.
The query Case 3 with the IN
keyword would perform even better in comparison with the other two when you actually have more strings to compare (e.g p.name IN ['Person 1', 'Person 2, 'Person 3']
Hope this helps you with your decision!