Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
07-16-2021 04:54 AM
Hello Graph Community,
I have been recently working on tuning the performance of a full-text search query written in Cypher. Since the query contained text conditions such as CONTAINS, it is suggested to use a FULLTEXT index to speed up the performance of the query. Following the same, query written using FULLTEXT index, when profiled, showed far fewer DB hits than what was there in the original query I wrote. This query however could stand the performance test and performed 5 times slower than the original query.
Original Cypher Query
MATCH path = (a:TOP_LEVEL)-[:Child *1..]->(q)-[:Child *0..]->(d)
WHERE q.name CONTAINS 'A1' AND NOT EXISTS {
MATCH (p)-[:Child*]->(q)-[:Child*]->(r:LEAF)
WHERE p.name CONTAINS $text AND
$project_id in r.project_id
}
RETURN path
New Query
Now I went further and wrote another query using FULLTEXT indexes. The Cypher query generated was as follows -
CALL db.index.fulltext.queryNodes('full_text_idx', '*A1*') YIELD node
WITH node as q
MATCH path = (a:TOP_LEVEL)-[:Child*]->(q)-[:Child*0..]->(d:LEAF)
WHERE EXISTS {
MATCH (q)-[r:Child]->()
WHERE $project_id in r.project_ids} OR (('LEAF' IN labels(q) OR 'ROLLUP' IN labels(q)) AND $project_id in q.project_id)
RETURN path
The new query took around 700 DB hits. This provided strong evidence that the new query should be faster than the old query by manifolds.
Results
When tested the first query easily outperformed the new query. While the new query took 90ms-100ms on average, the old query was able to give the same results in 20ms-30ms.
I am not able to figure out why this can happen.
07-23-2021 12:34 PM
Best I can tell, the two queries as pasted here won't return the same result set. An apples to apples comparison would be useful to do first.
For example, the second query doesn't even do this test " WHERE p.name CONTAINS $text", and has other new conditions.
Have you used PROFILE on the queries to look at the row count difference? dbhits is interesting but row counts processed may be more important.
All the sessions of the conference are now available online