cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Group by on nodes vs node-properties

intouch_vivek
Graph Steward

Hi,

Scenario:
(n:Person)-[Lends_Money]->(m:Person)
Each Node and relationship has several properties.

Requirement:
I need to know all the properties of both the nodes and relationship and number of times relationship made.

Tried:
Match(n:Person)-[rel:Lends_Money]->(m:Person)
With n.name as ni,m.name as mi, count(rel) as countBenefitTo
Where countBenefitTo>1
Return ni,mi,countBenefitTo

Result.
A B 2
B C 1
A Z 17
This result is half baked. I am unable to get all the other properties of Person and relationship.

When I tried:
Match(n:Person)-[rel:Lends_Money]->(m:Person)
With n as ni,m as mi, count(rel) as countBenefitTo
Where countBenefitTo>1
Return ni,mi,countBenefitTo
It gives no result. although there are similar records as I had copied same data into multiple rows .

Is it possible to group by on nodes and relationships. If yes then how? If no then how to meet the requirement.

Regards
Vivek

5 REPLIES 5

Yes, this is possible, and it is in fact recommended and more performant than grouping on properties, as property access can be more expensive.

When performing node or relationship comparisons (as is used when you use grouping by nodes or relationships) the graph id of the node is used for the comparison, not its properties.

There's also no need to alias the nodes when you do the grouping. This query should work:

Match(n:Person)-[rel:Lends_Money]->(m:Person)
With n, m, count(rel) as countBenefitTo
Where countBenefitTo>1
Return n,m,countBenefitTo

That said, if you're not getting results, that likely means that you have multiple nodes with the same names, so you may not actually have two distinct nodes like this where there are multiple relationships (in that particular direction) going between them.

You might want to double check your data.

I verified my data once again. Infact I just copy pasted a row several time to get the result.
It is working fine for countBenefitTo>0 i.e. least grain of data(it gives me all the records and I verified that those copied rows are also coming with countBenefitTo as 1 . But now working for countBenefitTo>1 (no results)

Provided that the names in your example are what's actually in the graph, you are expecting that there is a node with the name "A" and a node with the name "B" with two outgoing :Lends_Money relationships between them (from A to B).

See what this gives you:

Match (n:Person)-[rel:Lends_Money]->(m:Person)
Where n.name = "A" AND m.name = "B"
Return id(n) as aId, id(m) as bId, count(rel) as count

If the expected pattern and count exists in your graph, you should see a row with a > 1 count.

But if you have other nodes with the same names in your graph, then you may see multiple rows with different ids (meaning different nodes, but with those same names) each having 1 count, but none having more than one.

I think you gave me a clue of id property. Although i had copied the same row multiple times but as there is no merge applied each row is an individual entity so for an example Node with name A has id 1 and other node A will be has id 2 and when I am trying to group by label it is not working as id is an implicit property of the label.
When I tried to group by all the explicit properties of Person I got the good result.

So it sounds like you have a duplicate node problem. You typically do not want to have multiple nodes with identical properties (note that the graph id isn't a property, but an intrinsic identifier that is always unique per node).

You would likely want to ensure you're using MERGE when importing data that might already exist, and you might want to look into cleaning up your graph such that you don't have unnecessary duplicates.