Neo4j

adam_cowley · 08-27-2020

Cypher provides helpful aggregation operations, such as averages, sums, percentiles, minimum/maximum, and counts. Many of these have syntax similar to their counterparts in other query languages, but Cypher handles grouping slightly differently.

In Cypher, you do not need to specify a grouping key. Every non-aggregate expression in the RETURN (or WITH) clause implicitly becomes part of the grouping key. This can be more concise than the explicit grouping syntax in other languages, but opinions may vary.
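As a minimal sketch of implicit grouping, the query below reuses the Person label and IS_FRIENDS_WITH relationship from the examples later in this post. The aliases person and friendCount are only illustrative names.

```cypher
// p.name is the only non-aggregate expression in RETURN,
// so Cypher groups the rows by it without any explicit grouping clause.
// Roughly comparable SQL would need "GROUP BY name".
MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
RETURN p.name AS person, count(friend) AS friendCount
ORDER BY friendCount DESC
```

Adding a second non-aggregate expression to the RETURN clause would make the grouping key the combination of both values.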

Aggregating by Count

Sometimes you only need to return a count of the results found in the database, rather than returning the objects themselves. The count() function in Cypher allows you to count the number of occurrences of nodes, relationships, or results returned.

There are two ways to count the results of your query. The first is count(n), which counts the number of occurrences of n and skips null values. You can put a node, relationship, or property inside the parentheses for Cypher to count. The second is count(*), which counts the number of result rows returned, including rows that contain null values.

In our data set, some of our Person nodes have a Twitter handle, but others do not. If you run the first example query below, you will see that the twitter property has a value for four people and is null for the other five. The second and third queries show how to use the two count options.

//Query1: see the list of Twitter handle values for Person nodes
MATCH (p:Person)
RETURN p.twitter

//Query2: count the Person nodes that have a twitter value (null values are not counted)
MATCH (p:Person)
RETURN count(p.twitter)

//Query3: count all result rows, one per Person node (rows with null values are counted)
MATCH (p:Person)
RETURN count(*)
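The two counting styles can also be compared side by side in a single query. This is only a sketch; the column aliases totalRows, withTwitter, and withoutTwitter are made up for illustration.

```cypher
// No non-aggregate expression appears in RETURN,
// so all Person rows are aggregated into a single result row.
MATCH (p:Person)
RETURN count(*) AS totalRows,                        // every matched row
       count(p.twitter) AS withTwitter,              // rows where twitter is not null
       count(*) - count(p.twitter) AS withoutTwitter // rows where twitter is null
```

With the data described above (four people with a handle, five without), the three columns should come out as 9, 4, and 5.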

Aggregating Values

The collect() function in Cypher gives you the capability to aggregate values into a list. You can use this to group a set of values based on a particular starting node, relationship, or property.

For instance, if we listed each person in our example data with each of their friends, you would see duplicate names in the left column, because each Person might have multiple friends and the query returns one row for each relationship from the starting person. To aggregate all of a person’s friends by the starting person, you can use collect(), as in the Cypher below. This will group the friend values by the non-aggregate field (in our case, p.name).

//Query4: collect each person's friends into a list, grouped by p.name
MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
RETURN p.name, collect(friend.name) AS friends
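One consequence of the MATCH above is that people with no outgoing IS_FRIENDS_WITH relationship do not appear in the results at all. If you want them listed too, a possible variation (a sketch, not part of the original guide) uses OPTIONAL MATCH. Because collect() ignores null values, those people end up with an empty list.

```cypher
// Keep every Person, even those without friends.
// For a person with no match, friend is null; collect() skips nulls,
// so the friends column is [] for that person.
MATCH (p:Person)
OPTIONAL MATCH (p)-[:IS_FRIENDS_WITH]->(friend:Person)
RETURN p.name, collect(friend.name) AS friends
```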

Counting Values in a List

If you have a list of values, you can find the number of items in that list with the size() function. size() can also be applied to a pattern expression, in which case it returns the number of matches found for that pattern. The examples below show both uses.

//Query5: find number of items in collected list
MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
RETURN p.name, size(collect(friend.name)) AS numberOfFriends

//Query6: find friends who have other friends
//The pattern is undirected, so it also matches p itself; "> 1" requires at least one other friend
MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
WHERE size((friend)-[:IS_FRIENDS_WITH]-(:Person)) > 1
RETURN p.name, collect(friend.name) AS friends, size((friend)-[:IS_FRIENDS_WITH]-(:Person)) AS numberOfFoFs
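Applying size() directly to a pattern, as Query6 does, works in the Neo4j 4.x versions that were current when this was posted. Neo4j 5 dropped that form in favor of COUNT { } subqueries. A possible rewrite, assuming Neo4j 5 or later, might look like this:

```cypher
// Sketch of Query6 for Neo4j 5+: count pattern matches with a COUNT subquery.
MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
WITH p, friend, COUNT { (friend)-[:IS_FRIENDS_WITH]-(:Person) } AS numberOfFoFs
WHERE numberOfFoFs > 1
// p.name and numberOfFoFs are both non-aggregate, so together they form the grouping key.
RETURN p.name, collect(friend.name) AS friends, numberOfFoFs
```

Computing the count once in WITH also avoids repeating the pattern in both the WHERE and RETURN clauses.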

This is a companion discussion topic for the original entry at https://neo4j.com/developer/cypher/controlling-query-processing/
