08-22-2018 07:43 PM
When a Cypher statement is first submitted, Neo4j checks whether the query is already in the plan cache before planning it.
By default Neo4j keeps 1000 query plans in the cache, as set by the conf/neo4j.conf parameter dbms.query_cache_size.
In fact this setting covers 2 query plan caches.
When Cypher is initially submitted, a hash is computed on the statement string as-is. Using the resulting
hash value, Neo4j checks whether the statement already exists in the plan cache; if it does, re-planning may not
be necessary.
Note however that statements that are logically the same but differ in case will produce different hashes. The following
2 statements, though semantically equivalent, will produce different hashes and a replan may be necessary.
match (n) return count(n);
MATCH (n) return COUNT(n);
Additionally, statements that are logically the same but differing in whitespace/carriage returns will produce different hash values.
The following 3 statements will produce a different hash and a replan may be necessary
MATCH (n) return COUNT(n);
MATCH (n) return COUNT(n);
MATCH (n)
return COUNT(n);
Cypher statements prefaced by PROFILE/EXPLAIN will have their PROFILE/EXPLAIN removed before the statement is hashed. The following
2 statements will hash to the same value
MATCH (n) return COUNT(n);
PROFILE MATCH (n) return COUNT(n);
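The behaviour of the first cache described above can be sketched in a few lines of Python. This is only an illustration, not Neo4j's code: the function name `string_cache_key` is hypothetical, and SHA-256 is an assumption, since the post does not say which hash function Neo4j uses.

```python
import hashlib
import re


def string_cache_key(statement: str) -> str:
    """Toy key for the first (string-based) plan cache."""
    # Drop a leading EXPLAIN or PROFILE, as the post describes.
    stripped = re.sub(r"^\s*(?:EXPLAIN|PROFILE)\s+", "", statement,
                      flags=re.IGNORECASE)
    # The rest of the string is hashed as-is, so case and whitespace matter.
    # SHA-256 is an assumption; any string hash shows the same effect.
    return hashlib.sha256(stripped.encode("utf-8")).hexdigest()


lower = string_cache_key("match (n) return count(n);")
upper = string_cache_key("MATCH (n) return COUNT(n);")
multiline = string_cache_key("MATCH (n)\nreturn COUNT(n);")
profiled = string_cache_key("PROFILE MATCH (n) return COUNT(n);")

print(lower == upper)      # differing case
print(upper == multiline)  # differing whitespace
print(upper == profiled)   # PROFILE prefix removed before hashing
```

Only the last comparison yields a match, which is why consistently formatted query strings are more likely to hit the first cache.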
If the Cypher statement's hash value is not found in the first cache, Neo4j will then check whether it is in the 2nd cache.
The Neo4j compiler parses the query from a string to an abstract syntax tree (AST), which is an object representation of the query.
The optimizer then normalizes the statement to make planning easier. For example
match (n:Person {id:101}) return n;
will be normalized to
match (n:Person) where n.id={param1} return n; {param1: 101}
In this case Neo4j has moved the predicate {id:101} from the MATCH pattern to the WHERE clause, and has
turned the literal value 101 into a parameter, e.g. n.id={param1}. Usage of parameters is further detailed
in the Neo4j Cypher documentation.
The AST doesn't store information such as white spaces and casing of keywords, and since it has been parameterized, literal values
can change but still produce the same AST.
This second query cache is keyed on this normalized AST. i.e. these queries will re-use the same query plan.
match (n:Person) where n.id=101 return n;
match (n:Person {id:101}) return n;
MATCH ( n:Person { id : 101 } )
RETURN n;
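To make the idea of a normalized key concrete, here is a deliberately simplified Python sketch. It is not Neo4j's normalizer and does not build a real AST: it only ignores whitespace and keyword case and lifts integer literals into parameters. It does not move predicates from the MATCH pattern into WHERE, so it cannot unify the inline `{id:101}` form with the WHERE form the way Neo4j does. The names `normalize` and `KEYWORDS` are hypothetical.

```python
import re

# Small, incomplete keyword list; enough for the examples in this post.
KEYWORDS = {"match", "where", "return", "order", "by", "desc", "count"}


def normalize(query: str):
    """Return (normalized_text, params) for a toy subset of Cypher."""
    # Tokens are identifiers, integer literals, or single punctuation marks.
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[^\sA-Za-z0-9_]", query)
    out, params = [], {}
    for tok in tokens:
        if tok.isdigit():
            # Replace the literal with a generated parameter name.
            name = f"param{len(params) + 1}"
            params[name] = int(tok)
            out.append("{" + name + "}")
        elif tok.lower() in KEYWORDS:
            out.append(tok.upper())
        else:
            out.append(tok)
    return " ".join(out), params


key_a, params_a = normalize("match (n:Person) where n.id=101 return n;")
key_b, params_b = normalize("MATCH  ( n:Person )\nWHERE n.id = 202\nRETURN n;")

print(key_a == key_b)      # same normalized key despite case and layout
print(params_a, params_b)  # the literal values live in the parameters
```

Because the literal is carried in the parameter map rather than in the key, queries that differ only in the value of `n.id` can share one cached plan.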
Finally, even if the Cypher statement is found in either the 1st or 2nd cache, the query may still be replanned
based on the conf/neo4j.conf parameters cypher.min_replan_interval and cypher.statistics_divergence_threshold.
cypher.min_replan_interval defines how long a cached plan must exist, 10 seconds by default, before it is eligible for replanning.
cypher.statistics_divergence_threshold defines how much the statistics for the underlying data used by the Cypher statement may change before the cached plan is considered stale.
The default value is 0.75, which means that if the statistics have changed by more than 75% since the
cached plan was generated, a new plan needs to be generated.
For example running
// remove all :Person nodes
match (n:Person) detach delete n;
// create 10 :Person nodes
foreach (x in range (1,10) | create (n:Person {id:x}));
// list the 10 :Person nodes created
match (n:Person) return n.id order by n.id desc;
// create 8 new :Person nodes
foreach (x in range (11,18) | create (n:Person {id:x}));
// list the 18 :Person nodes
match (n:Person) return n.id order by n.id desc;
Each of the 2 `match (n:Person) return n.id order by n.id desc;` statements would be planned. The 2nd instance has the same
hash value as the first, but the statistics on :Person changed from 10 nodes to 18 nodes, an 80% increase that exceeds the 75% threshold.
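The replan decision for this example can be sketched as follows. This follows the arithmetic used in this post (change measured relative to the statistics at plan time); the exact formula Neo4j applies internally is not given here and may differ, so treat `needs_replan` and its parameter names as hypothetical.

```python
def needs_replan(plan_age_seconds, count_at_plan_time, count_now,
                 min_replan_interval=10.0, divergence_threshold=0.75):
    """Toy staleness check combining the two settings described above."""
    # A plan younger than the minimum interval is never replanned.
    if plan_age_seconds < min_replan_interval:
        return False
    # Assumption: with no nodes at plan time, any new node counts as divergence.
    if count_at_plan_time == 0:
        return count_now > 0
    # Assumption: relative change is measured against the plan-time statistics.
    change = abs(count_now - count_at_plan_time) / count_at_plan_time
    return change > divergence_threshold


print(needs_replan(30, 10, 18))  # 80% change, plan older than 10 seconds
print(needs_replan(5, 10, 18))   # same change, but plan is only 5 seconds old
print(needs_replan(30, 10, 17))  # 70% change, under the threshold
```

Under these assumptions only the first call reports a stale plan, matching the 10 to 18 node example.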
If an existing plan needs to be replanned as a result of the above 2 parameters, the `logs/debug.log` will log:
2017-03-31 19:14:27.820+0000 INFO [o.n.c.i.ExecutionEngine] Discarded stale query from the query cache: match (n:Person) return n.id order by n.id desc;
2017-03-31 19:14:27.821+0000 INFO [o.n.c.i.EnterpriseCompatibilityFactory] Discarded stale query from the query cache: match (n:Person) return n.id order by n.id desc;
Additionally it should be noted that when a query plan has to be removed from the cache to make room for a new plan, a least frequently
used (LFU) algorithm is applied. So if the first query added to the plan cache is run every 1 second, and the 2nd query added to the query plan
cache is run every 2 minutes, then when we need to remove a query plan from the cache to make room for a new query, we will remove
the 2nd query before the 1st since the first is more frequently called upon.
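A minimal Python sketch of LFU eviction, mirroring the two-query example above. This is a toy illustration under simple assumptions (one hit counter per entry, ties broken by insertion order), not Neo4j's cache implementation; the class name `LFUPlanCache` is hypothetical.

```python
class LFUPlanCache:
    """Toy least-frequently-used cache for query plans."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.plans = {}  # query key -> plan
        self.hits = {}   # query key -> number of lookups that hit

    def get(self, key):
        if key in self.plans:
            self.hits[key] += 1
            return self.plans[key]
        return None

    def put(self, key, plan):
        if key not in self.plans and len(self.plans) >= self.capacity:
            # Evict the entry with the fewest hits.
            victim = min(self.hits, key=self.hits.get)
            del self.plans[victim]
            del self.hits[victim]
        self.plans[key] = plan
        self.hits.setdefault(key, 0)


cache = LFUPlanCache(capacity=2)
cache.put("q1", "plan for q1")
cache.put("q2", "plan for q2")
for _ in range(120):       # q1 is looked up far more often than q2
    cache.get("q1")
cache.get("q2")
cache.put("q3", "plan for q3")  # cache is full, so one plan must go

print(sorted(cache.plans))  # q2 was evicted; q1 and q3 remain
```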
Finally it should be noted that any schema changes, for example index/constraint creation/removal will flush the entire query plan
cache.
11-24-2019 11:21 PM
In version 3.5 it seems dbms.query_cache_size has been abandoned and dbms.memory.pagecache.size is used instead. Is it just a new name? What is the difference?
11-25-2019 03:33 AM
Abandoned? How so?
These are 2 distinct and largely unrelated parameters.
dbms.memory.pagecache.size represents the amount of RAM reserved for caching your graph data.
The other parameter sets the number of query plans that are cached, so that a query does not have to be replanned every time it is submitted.
11-26-2019 05:13 AM
But in version 3.5, conf/neo4j.conf does not have the dbms.query_cache_size parameter. How do I set a different value? Just add a new line with dbms.query_cache_size?
11-26-2019 05:37 AM
Yes. You can add/remove parameters from neo4j.conf.
Many parameters are not included in a default conf/neo4j.conf; if a parameter is not included, Neo4j uses its default, and per https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.query... the default for this parameter is 1000. Do you need to increase it above 1000?
11-26-2019 07:20 PM
Sorry to ask here for the Scientist data used
in "https://neo4j.com/docs/cypher-manual/3.5/query-tuning/using/", but I do not know where else to ask for it. I would like to test the examples with that data. Thanks a lot!
11-27-2019 10:02 AM
Here is the code for that query's data:
CREATE INDEX FOR (n:Scientist) ON (n.name);
CREATE INDEX FOR (n:Science) ON (n.name);
CREATE
  (liskov:Scientist {name: 'Liskov', born: 1939})-[:KNOWS]->(wing:Scientist {name: 'Wing', born: 1956})-[:RESEARCHED]->(cs:Science {name: 'Computer Science'})<-[:RESEARCHED]-(conway:Scientist {name: 'Conway', born: 1938}),
  (liskov)-[:RESEARCHED]->(cs),
  (wing)-[:RESEARCHED]->(:Science {name: 'Engineering'}),
  (chemistry:Science {name: 'Chemistry'})<-[:RESEARCHED]-(:Scientist {name: 'Curie', born: 1867}),
  (chemistry)<-[:RESEARCHED]-(:Scientist {name: 'Arden'}),
  (chemistry)<-[:RESEARCHED]-(:Scientist {name: 'Franklin'}),
  (chemistry)<-[:RESEARCHED]-(:Scientist {name: 'Harrison'});
Elaine
02-19-2021 10:10 AM
Are non-default values of cypher.min_replan_interval available in the community edition?
I changed the value to 60s, but when I check via CALL dbms.listConfig(), it still lists 10s.
Thanks
02-19-2021 10:35 AM
What version of Neo4j?
02-19-2021 10:46 AM
"Neo4j Kernel" "3.5.14" "community"
root@2062615c8f18:/var/lib/neo4j/conf# cat neo4j.conf | grep min
dbms.min_replan_interval=60s
I tried to configure it to 60s. neo4j.conf shows 60s, but CALL dbms.listConfig() still shows 10000ms.
02-19-2021 11:04 AM
Thank you for this detail and for confirming that it is 3.5.14. Per Configuration settings - Operations Manual, which lists all the neo4j.conf settings, there is no
dbms.min_replan_interval.
Rather, it was renamed and is now cypher.min_replan_interval.
02-19-2021 11:32 AM
Works!
Thanks Dana. Appreciate it.