About db hits of Neo4j

Hi all,
I am quite confused about DB hits in Neo4j.
I have the following two queries:

Query 1:

profile match (com:Company)<-[:IS_CUSTOMER]-(cust:Customer)
with cust
return sum(cust.sysid)

Query 2:

profile match (com:Company)<-[:IS_CUSTOMER]-(cust:Customer)
with cust
with
    case when cust.createdYear=2017 then cust.sysid else 0 end as year_2017,
    case when cust.createdYear=2018 then cust.sysid else 0 end as year_2018,
    case when cust.createdYear>2018 then cust.sysid else 0 end as current_year
return sum(year_2017) as year_2017, sum(year_2018) as year_2018, sum(current_year) as present

For query 1: there are 15 total DB hits.
For query 2: there are 27 total DB hits.

As I understand it, DB hits measure the work the storage layer does when I retrieve data from the Neo4j database. In query 2, all the Customer nodes are returned by the MATCH and passed on in the second line of the query, so I assumed their data had already been retrieved from storage.

According to the execution plan, the Projection stage accounts for 16 DB hits, and this is what confuses me. If all the Customer nodes have already been retrieved from the database, why are DB hits still being produced? The CASE expressions work only on the returned data, so they should not need to fetch anything more from storage.

2 REPLIES

When we work with nodes within Cypher, we use a lightweight object to represent the node with minimal information (such as the graph id of the node), because the query may not need to access properties of that node at all, and because property access can at times be expensive. So property access is lazy. When the MATCH finishes, we have not yet accessed the node's properties; that happens only when the properties are actually used, such as in your CASE expressions.
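
One way to act on this, offered as a sketch rather than a tuned query: read each property once per row in a WITH, then let the CASE expressions work on the projected values instead of on the node. The labels and property names are the ones from the question. The aliases createdYear and sysid are only illustrative, and the actual DB hit counts depend on the Neo4j version and runtime, so it is worth checking with PROFILE.

```cypher
// Sketch only: labels and properties are taken from the queries in this thread.
PROFILE
MATCH (:Company)<-[:IS_CUSTOMER]-(cust:Customer)
// Read each property once per row; later clauses use plain values, not the node.
WITH cust.createdYear AS createdYear, cust.sysid AS sysid
RETURN
  sum(CASE WHEN createdYear = 2017 THEN sysid ELSE 0 END) AS year_2017,
  sum(CASE WHEN createdYear = 2018 THEN sysid ELSE 0 END) AS year_2018,
  sum(CASE WHEN createdYear > 2018 THEN sysid ELSE 0 END) AS present
```

Projecting plain values keeps only what the aggregation needs in each row, so nothing further has to be looked up on the node when the CASE expressions are evaluated.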

Hi Andrew,
Thank you so much for your reply.
So in this case, I can collect the properties I need into a map up front and then use them later without incurring new DB hits:

profile match (com:Company)<-[:IS_CUSTOMER]-(cust:Customer)
with {createdYear: cust.createdYear, sysid: cust.sysid} as cust
with
    case when cust.createdYear=2017 then cust.sysid else 0 end as year_2017,
    case when cust.createdYear=2018 then cust.sysid else 0 end as year_2018,
    case when cust.createdYear>2018 then cust.sysid else 0 end as current_year
return sum(year_2017) as year_2017, sum(year_2018) as year_2018, sum(current_year) as present

In this case, the total is 19 DB hits. But does this approach consume more memory than the previous queries?
If it does, in which cases should we use the previous query, and in which cases this one?