Get a given node's related distinct node count

coolyrat
Node Clone

I need help with this problem: is there a way to get all users related to a given start user? I searched and read through the docs, and the best match I found is (u)-[*1..3]-(t:User). But the result is not what I want. Can someone help me?

I started with a query like:

MATCH (u:User {uid: $uid}),
(u)-[*1..3]-(t:User)
RETURN count(DISTINCT t)

It returns a huge number that doesn't match what I expect.

So I narrowed it down to three queries:

// a: match 1 hop, the two-way FOLLOWING relationships
MATCH (u:User {uid: $uid})--(t:User)
RETURN count(DISTINCT t)
// b: match 2 hops, through the start user's Post relationships
MATCH (u:User {uid: $uid})-->(:Post)<--(t:User)
RETURN count(DISTINCT t)
// c: match 3 hops, through the Chat and Post relationships
MATCH (u:User {uid: $uid})-->(:Chat)<--(:Post)<--(t:User)
RETURN count(DISTINCT t)

But I don't know how to get a single distinct count across the target users of a, b, and c.

5 REPLIES

Copying my answer from the other thread where you asked about this: the variable-length count() query you tried is correct. It returns the number of distinct :User nodes reachable from the given user.

If the numbers are not what you expect, then you need to be more specific about your requirements and restrictions and ensure those are added to your query.

For example, the three queries you gave later on have restrictions on labels of nodes in the path that were not a part of your earlier variable-length query.
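To illustrate, here is a minimal sketch of how those three restricted patterns could be combined into one distinct count. It assumes Neo4j 4.0 or later, where CALL { } subqueries with UNION are available, and it reuses the labels and the $uid parameter from the queries above without having been run against the actual graph:

```cypher
// Collect the target users of each pattern, then count them once.
CALL {
  // a: 1 hop, users directly connected to the start user
  MATCH (u:User {uid: $uid})--(t:User)
  RETURN t
  UNION
  // b: 2 hops, users connected through a shared Post
  MATCH (u:User {uid: $uid})-->(:Post)<--(t:User)
  RETURN t
  UNION
  // c: 3 hops, users connected through a Chat and a Post
  MATCH (u:User {uid: $uid})-->(:Chat)<--(:Post)<--(t:User)
  RETURN t
}
RETURN count(DISTINCT t)
```

UNION (as opposed to UNION ALL) already drops duplicate rows, so the DISTINCT in the final count is only a safeguard. The total can be smaller than the sum of the three separate counts, since a user reachable by more than one pattern is counted once.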

And if you need an expansion to stop when a :User node is found, rather than traversing past it to find other reachable user nodes, then you also need to add that restriction to the query. That won't be done for you:

MATCH path = (u:User {uid: $uid})-[*1..3]-(t:User) 
WHERE single(node in tail(nodes(path)) WHERE t:User)
RETURN count(DISTINCT t)

The query is a bit complex for me. I tried to verify it against those three queries, and it doesn't seem right.

Results of the three queries:
╒═══════════════════╕
│"count(distinct t)"│
╞═══════════════════╡
│5150               │
├───────────────────┤
│12544              │
├───────────────────┤
│717017             │
└───────────────────┘

Result of the suggested query:
╒═══════════════════╕
│"count(DISTINCT t)"│
╞═══════════════════╡
│5150               │
└───────────────────┘

It seems the suggested query only matches the first query's result.

Are you able to share your graph or provide a sample graph creation query so we can test it out and see what's going on? Also, just in the interest of sanity checking, can you edit the above post to pair each query with its result?

Ah, so I goofed in that suggested query (specifically on which node variable we're checking for the :User label), that's why you're getting bad answers. Give this a try:

MATCH path = (u:User {uid: $uid})-[*1..3]-(t:User) 
WHERE single(node in tail(nodes(path)) WHERE node:User)
RETURN count(DISTINCT t)
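A note on why the variable matters, offered as a reading of the two queries rather than an authoritative account: single() is true only when exactly one element of the list satisfies the predicate. In the earlier version the predicate was t:User, which does not depend on the list element and is always true, so single() could only hold when tail(nodes(path)) had exactly one node, that is, for 1-hop paths. That would explain a result equal to the 1-hop count. With node:User, the check becomes "exactly one :User after the start node", and since t is itself a :User, no intermediate node may be one. An equivalent formulation, sketched with a list comprehension (usersAfterStart is just an illustrative name):

```cypher
MATCH path = (u:User {uid: $uid})-[*1..3]-(t:User)
// tail() drops the start node u; keep only the :User nodes that remain
WITH t, [n IN tail(nodes(path)) WHERE n:User] AS usersAfterStart
// exactly one :User after the start means t is the first user reached
WHERE size(usersAfterStart) = 1
RETURN count(DISTINCT t)
```

This should return the same count as the single() version. It is mainly useful for inspection, for example by returning usersAfterStart for a few paths to see which ones are filtered out.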

Thank you very much, it works like a charm!