

Fetching relationship count between two specific node types using countstore

I want to fetch the count of a specific relationship type between two specific node types, i.e.

MATCH (n:Person)-[r:Wrote]->(a:Movie)
RETURN count(r)

However, this query does not use the counts store.

I need this count for every relationship in my graph, and the same relationship type can exist between different node types, so the results from

CALL apoc.meta.stats do not solve the problem.

Is there any way I can get these counts from the counts store?

4 Replies

Hi @himanshu.kapoor ,

I'm not sure I know exactly what you are looking for, however this article does a really good job of detailing the count and countstore functions.

As mentioned in the linked article, the counts store does not store relationship information with respect to labels on both the start and end nodes, only with respect to one.

If a :Person can write things other than movies, then you can't use the counts store to get that count from a :Person node's :Wrote relationships.

If a :Movie can only be written by a :Person, then maybe that count would be useful, and you could get the count from the counts store. Otherwise, if some other node can write a movie, then the counts store won't be able to supply what you're looking for.
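As a rough sketch of that check, the three queries below have the shapes that, to my knowledge, are answered from the counts store: a relationship type with a label on at most one end. The labels and relationship type are taken from the question. Prefixing each with EXPLAIN should confirm whether your Neo4j version actually plans it as a counts store lookup.

```cypher
// All :Wrote relationships, regardless of the labels at either end
MATCH ()-[r:Wrote]->() RETURN count(r) AS total;

// :Wrote relationships that start at a :Person
MATCH (:Person)-[r:Wrote]->() RETURN count(r) AS fromPerson;

// :Wrote relationships that end at a :Movie
MATCH ()-[r:Wrote]->(:Movie) RETURN count(r) AS toMovie;
```

If total equals both fromPerson and toMovie, then every :Wrote relationship runs from a :Person to a :Movie, so total is also the count for the two-label pattern. If the numbers differ, only the full MATCH on both labels gives the exact count.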

Hi @andrew.bowman , @himanshu.kapoor

I'm a little confused and maybe need a little clarification. Is the issue with using the below query that an incorrect result is given or only that the counts store is not used to produce the results? From the question 'I want to fetch the count of a specific relations between two specific node types' it seems this would do the trick?

MATCH (n:Person)-[r:Wrote]->(a:Movie)
RETURN count(r)

Sure, the query you wrote will give a correct answer, even though it won't use the counts store, as the counts for that pattern are not cached within it. It will just have to begin with a label scan on n or a, expand, filter on the label for the other node, and count the number of relationships that pass.
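One way to see the difference is to prefix the queries with EXPLAIN and compare the plans. This is only a sketch; the operator names in the comments are what I would expect, and they can vary between Neo4j versions.

```cypher
// Labels on both ends: expect something along the lines of
// NodeByLabelScan -> Expand(All) -> Filter -> EagerAggregation
EXPLAIN
MATCH (:Person)-[r:Wrote]->(:Movie)
RETURN count(r);

// Label on one end only: expect a single
// RelationshipCountFromCountStore operator
EXPLAIN
MATCH (:Person)-[r:Wrote]->()
RETURN count(r);
```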

At this time there's no way around that; this is what needs to be done to get the correct answer.
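If the goal is a count for every combination of start label, relationship type, and end label, a single full scan along these lines would produce it. This is a sketch, not a tuned query: it does not use the counts store and touches every relationship, so it may be slow on a large graph.

```cypher
// Group every relationship by start labels, type, and end labels.
// labels() returns a list, so a node with several labels is grouped
// under its full label list rather than once per label.
MATCH (a)-[r]->(b)
RETURN labels(a) AS startLabels,
       type(r)   AS relType,
       labels(b) AS endLabels,
       count(*)  AS relCount
ORDER BY relCount DESC;
```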

As to why the counts store doesn't track patterns with labels on both the start and end nodes, that comes down to the work needed to keep the counts store transactionally consistent as labels are added or removed.

Currently, all the info needed for a counts store update is present on the node itself: its labels, and its relationships (type and direction). Based on that, any addition or removal of a label just needs to find the counts store entries for that label and the corresponding relationships (type/direction) on the node. Nothing beyond the node needs to be looked at, and the node is already locked in order to apply the label changes anyway.

But if we were storing counts store information with labels on both nodes, extra work would be needed in order to update the counts store. We would need to expand all relationships to all connected nodes, lock those nodes, and get their labels, in order to update the appropriate entries. So each label change would now include extra work proportional to the number of relationships on each node, and the additional locking could introduce far more lock contention between the currently executing queries.
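To make that extra cost concrete, here is a rough sketch of the lookup that a label change on a single node would imply if counts were keyed on both labels. The name property and the $name parameter are assumptions for illustration only. This query only reads, so it does not reproduce the locking, but it shows that the work grows with the number of relationships on the node.

```cypher
// Hypothetical: before adding or removing a label on this node, every
// neighbour would have to be visited (and locked) to learn its labels,
// so that the matching two-label entries could be adjusted.
MATCH (n:Person {name: $name})-[r]-(m)
RETURN type(r)          AS relType,
       startNode(r) = n AS outgoing,
       labels(m)        AS neighbourLabels,
       count(*)         AS affectedRelationships;
```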