Neo4j

david_allen · ‎03-17-2019

Working with customers, sometimes we need to simulate various workloads on Neo4j, for many different reasons. Here is how you can do that.

For example, we want to measure the response time or latency of queries, or we want to see how many queries a given hardware configuration can handle.

To help out with that, I wrote a JavaScript module called graph-workload. I use this for some internal benchmark testing, to help verify that Neo4j’s cloud distributions are working properly, and for other debugging tasks.

This article describes how to use it, how to generate test data with it, and how to measure overall throughput. Questions? Comments? Come discuss this on the Neo4j Community Thread about graph-workload.

Generate light or heavy workloads with graph-workload!

Important tip: This tool writes test data to your database in order to simulate load. Do not run it against a production system, and keep in mind that it will modify your database!

Install and get started

We’ll use the JavaScript method. Head on over to the GitHub repo and clone it, then install the dependencies.

git clone https://github.com/moxious/graph-workload.git
cd graph-workload
npm install

To run it, use the following command to get usage info. Below, we’ll explain what these options mean, and how to use them.

node src/run-workload.js --help

What’s a graph workload?

With this tool, a workload is the combination of two things:

Query (or query table)
Run configuration

Query table: A single Cypher query can work on its own. Alternatively, a query table tells the app which strategies you want to run and how frequently you want each one run, expressed as probabilities.

Run configuration: This tells the workload generator how much to do. It includes several settings, but the most important are:

Connection info: username, password, and address of your Neo4j instance.
Concurrency (default 10): How many queries to have in-flight at the same time. This is a key way we control how much load we place on the database.
One of two possible termination conditions: Either terminate after N queries have been run, or terminate after X milliseconds of total runtime, no matter how many have run.
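To make the concurrency and termination settings concrete, here is a minimal sketch of a bounded worker pool. It is not graph-workload's actual implementation: the names runWorkload, runQuery, n, and stats are hypothetical, and runQuery stands in for "send one query to Neo4j and wait for the result".

```javascript
// Hypothetical sketch of a bounded-concurrency runner, NOT graph-workload's real code.
// runQuery is an assumed async function that sends one query and resolves when it finishes.
// If neither n nor ms is given, this loop never stops, so always set one of them.
async function runWorkload({ concurrency = 10, n = Infinity, ms = Infinity, runQuery }) {
  const deadline = Date.now() + ms;
  const stats = { started: 0, complete: 0, errors: 0 };

  // Each worker keeps one query in flight until a termination condition is met.
  async function worker() {
    while (stats.started < n && Date.now() < deadline) {
      stats.started += 1;
      try {
        await runQuery();
        stats.complete += 1;
      } catch (err) {
        stats.errors += 1;
      }
    }
  }

  // Starting `concurrency` workers caps the number of in-flight queries.
  await Promise.all(Array.from({ length: concurrency }, worker));
  return stats;
}
```

Raising concurrency in a loop like this adds load without changing the queries themselves, which is why it is the main dial for how hard the database gets pushed.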

Mixed workload example

If you start the program with defaults, it will choose a mixed workload that contains both reads and writes, and creates a fairly chaotic load pattern on your database at high levels of concurrency.

node src/run-workload.js -a my-neo4j-host.com -u neo4j -p secret \
     --ms 5000

This runs queries against my-neo4j-host.com for 5000 ms (5 seconds).

The output looks like this (with some benchmarking output omitted for space):

{ address: 'bolt://my-neo4j-host.com',
  username: 'neo4j',
  concurrency: 10,
  ms: 5000,
  checkpointFreq: 5000 }
Connecting to  bolt://my-neo4j-host.com
Creating session pool with  { min: 1, max: 10 }
Progress: 0.00% 0 completed; 0 running 0 error
Starting main promise pool
Starting timer at  2019-03-14T19:44:12Z  to expire at 2019-03-14T19:44:17Z after  5000 ms
Progress: 100.00% 1493 completed; 10 running 0 error
Timeout
Shutting down
{ complete: 1505, running: 0, errors: 0 }

On this run, we ended up running 1,505 queries in 5 seconds against this host.

Simple query example

If we want to run a single query over and over (for example to simulate new transactional records coming in), we can do that like this:

node src/run-workload.js -a localhost \
     -u neo4j \
     -p admin \
     --concurrency 55 \
     --ms 30000 \
     --query 'CREATE (order:ProductOrder { date: datetime(), someData: rand() });'

This is going to be a lot harder on our database. We’ll run 55 concurrent queries for 30 seconds, and each query creates a new node. On our sample test database, we were able to run this query 62,939 times in 30 seconds, or about 2,100 nodes per second, without really trying to optimize for speed.
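The throughput figures quoted in these examples are simple averages: completed queries divided by elapsed seconds. A small helper (the function name is my own, not part of graph-workload) reproduces the arithmetic for both runs:

```javascript
// Average throughput in operations per second over a run lasting `ms` milliseconds.
function throughputPerSecond(completed, ms) {
  if (ms <= 0) throw new RangeError("ms must be positive");
  return completed / (ms / 1000);
}

// Mixed workload run: 1,505 queries in 5 seconds.
console.log(throughputPerSecond(1505, 5000)); // 301

// Single-query run: 62,939 queries in 30 seconds.
console.log(Math.round(throughputPerSecond(62939, 30000))); // 2098
```

Keep in mind this is an average over the whole run; it says nothing about the latency of individual queries.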

By using Halin to monitor the Neo4j instance while the workload is running, we can see the traffic spike while those queries were active:

Monitoring our workload with Halin

(If you’d like to learn more about Halin, you can read this article)

Designing your own workload

Workloads can be specified as a simple JSON file that looks like this:

[
  [ 0.5, "randomLinkage" ],
  [ 1.0, "starWrite" ]
]

Graph-workload has a bunch of built-in strategies. The two strategy names here are just examples; the full list is described below. This strategy table tells the app to run the first strategy 50% of the time and the second strategy the other 50% of the time. For each query it wants to run, the app rolls a random number between 0 and 1.0, then looks through the strategy table and picks according to this distribution.
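As a rough illustration of that selection step, the sketch below treats each number in the table as a cumulative threshold, which is my assumption based on 0.5 and 1.0 producing a 50/50 split. The function name pickStrategy is hypothetical, and the table is written as an array of pairs; the tool's real code may differ.

```javascript
// Strategy table from the article; thresholds are assumed to be cumulative probabilities.
const table = [
  [0.5, "randomLinkage"],
  [1.0, "starWrite"],
];

// Return the first strategy whose threshold is greater than the roll.
// `roll` defaults to a random number in [0, 1).
function pickStrategy(table, roll = Math.random()) {
  for (const [threshold, name] of table) {
    if (roll < threshold) return name;
  }
  // Fallback for tables whose last threshold is below 1.0.
  return table[table.length - 1][1];
}

console.log(pickStrategy(table, 0.25)); // randomLinkage
console.log(pickStrategy(table, 0.75)); // starWrite
```

Under this reading, a table of 0.2, 0.5, 1.0 would run its three strategies 20%, 30%, and 50% of the time.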

Other available strategies

In your workload, you may specify any of these strategies, mixing and matching them to design any kind of workload you like.

The intent is to be able to use some combination of these strategies to simulate the kind of work that your database needs to do. If none of these fit, you can always either run your own custom queries as above, or implement a small JavaScript class that creates a new strategy. Strategies may have custom “setup actions” (for example, creating an index on an ID property prior to merging based on that property).
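To give a feel for what a custom strategy with a setup action might look like, here is a standalone sketch. Everything in it is hypothetical: the class name, the setupQueries and nextQuery methods, and the TestRecord label are my own illustration, not the interface graph-workload actually expects, so check the strategies in the repo before writing a real one.

```javascript
// Hypothetical strategy shape; graph-workload's real base class and method names may differ.
class MergeByIdStrategy {
  constructor({ idRange = 1000 } = {}) {
    // Number of distinct IDs to merge against; smaller ranges mean more updates than creates.
    this.idRange = idRange;
  }

  // Setup action: statements to run once before the workload starts.
  // This is Neo4j 3.x index syntax; newer versions use CREATE INDEX ... FOR ... ON.
  setupQueries() {
    return ["CREATE INDEX ON :TestRecord(id)"];
  }

  // Produce one parameterized query for a runner to execute.
  // `random` is injectable so the output can be made deterministic.
  nextQuery(random = Math.random) {
    const id = Math.floor(random() * this.idRange);
    return {
      cypher: "MERGE (r:TestRecord { id: $id }) SET r.updated = timestamp()",
      params: { id },
    };
  }
}

const strategy = new MergeByIdStrategy({ idRange: 10 });
console.log(strategy.setupQueries());
console.log(strategy.nextQuery());
```

Separating setup from the per-query step means the index is created once, so the MERGE lookups stay fast for the rest of the run.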

Read strategies

aggregateRead: This matches a semi-random collection of nodes and calculates min, max, and count using Cypher aggregate functions.
longPathRead: Matches and returns data which requires traversing long paths in the graph.
randomAccessRead: This matches a truly random subset of nodes and returns them. It is used to force the database to load things that aren’t in the page cache, and helps test I/O to and from disks.

Write strategies

fatNodeAppend: Creates batches of 10 nodes each time, where each node contains a property called “data” with a very long value.
indexHeavy: This strategy creates new nodes under a configuration where many different fields are indexed, forcing the database to continue maintaining the index as it goes.
mergeWrite: Does simple writes to the database using a common pattern of merging records based on an ID field.
nAryTree: Creates and expands an n-ary tree, creating long paths and branching structures with test data.
randomLinkage: Selects two random nodes and creates a -[:randomlinkage]-> relationship between them, making the graph more densely connected.
rawWrite: A simple CREATE statement that creates new nodes with data and relationships.
starWrite: Creates hub/spoke patterns in the graph.
writeProperty: MATCHes nodes in the graph and updates their properties with new values.

Happy graph hacking!

Generating Test Workloads on Neo4j was originally published in neo4j on Medium.