Warm the cache to improve performance from cold start

From Neo4j 2.3 onward there is no object cache, so the techniques described here warm the page cache, which maps the Neo4j store files into memory.

You may find that some queries run much faster the second time they run.
This is because on cold boot, a server node has nothing cached yet, and needs to go to disk for all records.
Once some/all of the records are cached, you will see greatly improved performance.
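One way to observe the cold/warm difference is to time the same query several times in a row. The sketch below is a hypothetical helper, not part of Neo4j: ColdWarmTimer and timeRuns are illustrative names, and the Runnable stands in for whatever code executes your query and consumes its results.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: times repeated runs of one task to expose the cold/warm gap. */
final class ColdWarmTimer {

    /**
     * Runs the task `runs` times and returns each run's elapsed time in milliseconds.
     * The task is assumed to execute the query and consume all of its results.
     */
    static long[] timeRuns(Runnable query, int runs) {
        long[] elapsedMillis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            elapsedMillis[i] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }
        return elapsedMillis;
    }
}
```

On a freshly started server, the first element of the returned array would typically be the largest, since that run has to read records from disk.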

One technique that is widely employed is to "warm the cache".
At its most basic level, we run a query that touches each node and relationship in the graph.
Assuming the data store can fit into memory, this will cache the entire graph.
Otherwise, it will cache as much as it can.
Give it a try and see how it helps you!
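Whether the whole graph can be cached depends on how the store size compares with the configured page cache size. As a rough sketch, the following sums the sizes of the files in a store directory whose names start with "neostore". StoreSizeCheck and its methods are hypothetical names, the directory and the page cache size are values you would supply yourself, and the result is only an estimate: files that share the prefix but are not page-cached (transaction logs in some versions, for example) are counted too.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: rough estimate of whether the store fits in the page cache. */
final class StoreSizeCheck {

    /** Sums the sizes of regular files in storeDir whose names start with "neostore". */
    static long storeBytes(Path storeDir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(storeDir, "neostore*")) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    total += Files.size(file);
                }
            }
        }
        return total;
    }

    /** True if the estimated store size does not exceed the given page cache size in bytes. */
    static boolean fitsInPageCache(Path storeDir, long pageCacheBytes) throws IOException {
        return storeBytes(storeDir) <= pageCacheBytes;
    }
}
```

If the check returns false, a full warm-up will still run, but only part of the graph can stay cached at any one time.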

Cypher (Server, Shell):

MATCH (n)
OPTIONAL MATCH (n)-[r]->() 
RETURN count(n.prop) + count(r.prop);

In the example above, count(n.prop) + count(r.prop) forces the query to look for a property named 'prop' on every node and relationship, which loads their properties. Replacing it with count(*) would not be sufficient, because it would not load all of the node and relationship properties.

Embedded (Java):

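import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.core.Context;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.Transaction;
import org.neo4j.tooling.GlobalGraphOperations;

// Place this method inside your JAX-RS resource class (Neo4j 2.x API).
@GET @Path("/warmup")
public String warmUp(@Context GraphDatabaseService db) {
    // Touch every node and relationship so their records are read into the cache.
    try ( Transaction tx = db.beginTx()) {
        for ( Node n : GlobalGraphOperations.at(db).getAllNodes()) {
            n.getPropertyKeys();
            for ( Relationship relationship : n.getRelationships()) {
                relationship.getPropertyKeys();
                relationship.getStartNode();
            }
        }
    }
    return "Warmed up and ready to go!";
}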

From Neo4j 3.0 onward, with the APOC library installed, you can warm up the cache by running the stored procedure:

CALL apoc.warmup.run()

This can help in many ways.
Aside from the pure performance improvement, it can also help alleviate upstream issues resulting from lagging queries.
For example, if the server nodes are busy and your load balancer/proxy has a very short timeout, the cluster can initially appear unavailable while none of the graph is in memory yet.
If the cache is warmed, the short timeout shouldn't be a concern on a cold cluster start.
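One possible way to act on this is to keep a server out of the load balancer's rotation until its warm-up has finished. The sketch below is an assumption about how that could be wired up, not something Neo4j provides: WarmupGate is a hypothetical class, and the Runnable passed to it stands in for whichever warm-up method you use (the Cypher query, the embedded Java loop, or the APOC procedure).

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical readiness gate: reports ready only after the warm-up task has completed. */
final class WarmupGate {
    private final CompletableFuture<Void> warmup;

    /** Starts the warm-up task on a background thread as soon as the gate is created. */
    WarmupGate(Runnable warmupTask) {
        this.warmup = CompletableFuture.runAsync(warmupTask);
    }

    /** True once warm-up finished without an exception; false while running or after failure. */
    boolean isReady() {
        return warmup.isDone() && !warmup.isCompletedExceptionally();
    }

    /** Blocks until warm-up finishes; throws CompletionException if the task failed. */
    void awaitReady() {
        warmup.join();
    }
}
```

A health-check endpoint could return an error status while isReady() is false, so the load balancer only starts routing traffic once the cache is warm.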
