Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
12-12-2019 07:16 PM
Hi all,
I am wondering about the performance of the following string comparison in where clause:
Case 1:
MATCH (p: Person)
WHERE NOT p.name='Person 1'
AND NOT p.name='Person 2'
RETURN p
Case 2:
MATCH (p: Person)
WHERE NOT (p.name='Person 1' OR p.name='Person 2')
RETURN p
Case 3:
MATCH (p: Person)
WHERE NOT p.name IN ['Person 1', 'Person 2']
RETURN p
I have tried on my database but the result is not united, sometimes case 3 gets be best response time and db hits, sometimes the best is case 1 and case 2.
Can you please make this clear for me?
Solved! Go to Solution.
12-13-2019 08:27 AM
Hi @quan.nguyen,
Have you tried using the keyword EXPLAIN
and PROFILE
on before your query?
EXPLAIN
will not run the query, but show you the predicted cost of the queryPROFILE
will run the query and show the cost of the queryIn the your case, the Case 1 & Case 2 should perform with around the same cost. Basically it would scan all node label Person two times, one time for 'Person 1' the second time for 'Person 2'.
In Case 3 it will only scan all the node label Person once, checking both string 'Person 1' and 'Person 2' in one go.
The query Case 3 with the IN
keyword would perform even better in comparison with the other two when you actually have more strings to compare (e.g p.name IN ['Person 1', 'Person 2, 'Person 3']
Hope this helps you with your decision!
12-12-2019 10:15 PM
Hi @quan.nguyen
in this case i have one suggestion for you for the best performance .
if you want to searchinf for node on the basis of name then you should create index on this property
create index on :node(name)
this would resolve all your issues regarding db hits
please let me know if any other info required
Cheers
12-12-2019 10:55 PM
Thank you for your reply, @12kunal34
My case is more complicated and this is my example for the comparison. I have to consider this because my the property, that I need to filter out, is rarely used and it's not so selective (it has less then 10 different values).
Do you have any suggestions for this?
12-13-2019 08:27 AM
Hi @quan.nguyen,
Have you tried using the keyword EXPLAIN
and PROFILE
on before your query?
EXPLAIN
will not run the query, but show you the predicted cost of the queryPROFILE
will run the query and show the cost of the queryIn the your case, the Case 1 & Case 2 should perform with around the same cost. Basically it would scan all node label Person two times, one time for 'Person 1' the second time for 'Person 2'.
In Case 3 it will only scan all the node label Person once, checking both string 'Person 1' and 'Person 2' in one go.
The query Case 3 with the IN
keyword would perform even better in comparison with the other two when you actually have more strings to compare (e.g p.name IN ['Person 1', 'Person 2, 'Person 3']
Hope this helps you with your decision!
All the sessions of the conference are now available online