Neo4j

quan_nguyen · ‎12-12-2019

Hi all,

I am wondering about the performance of the following string comparison in where clause:
Case 1:

MATCH (p: Person)
WHERE NOT p.name='Person 1'
AND NOT p.name='Person 2'
RETURN p

Case 2:

MATCH (p: Person)
WHERE NOT (p.name='Person 1' OR p.name='Person 2')
RETURN p

Case 3:

MATCH (p: Person)
WHERE NOT p.name IN ['Person 1', 'Person 2']
RETURN p

I have tried on my database but the result is not united, sometimes case 3 gets be best response time and db hits, sometimes the best is case 1 and case 2.
Can you please make this clear for me?

astridparamita · ‎12-13-2019

Hi @quan.nguyen,

Have you tried using the keyword EXPLAIN and PROFILE on before your query?

EXPLAIN will not run the query, but show you the predicted cost of the query
PROFILE will run the query and show the cost of the query

In the your case, the Case 1 & Case 2 should perform with around the same cost. Basically it would scan all node label Person two times, one time for 'Person 1' the second time for 'Person 2'.

In Case 3 it will only scan all the node label Person once, checking both string 'Person 1' and 'Person 2' in one go.

The query Case 3 with the IN keyword would perform even better in comparison with the other two when you actually have more strings to compare (e.g p.name IN ['Person 1', 'Person 2, 'Person 3']

Hope this helps you with your decision!

View solution in original post

12kunal34 · ‎12-12-2019

Hi @quan.nguyen

in this case i have one suggestion for you for the best performance .

if you want to searchinf for node on the basis of name then you should create index on this property
create index on :node(name)

this would resolve all your issues regarding db hits

please let me know if any other info required

Cheers

quan_nguyen · ‎12-12-2019

Thank you for your reply, @12kunal34

My case is more complicated and this is my example for the comparison. I have to consider this because my the property, that I need to filter out, is rarely used and it's not so selective (it has less then 10 different values).

Do you have any suggestions for this?

astridparamita · ‎12-13-2019