List the nodes forming unrelated clusters

masavini

Hi everybody!

I have a graph with a few thousand nodes that form unrelated clusters. I'd like to get a list of lists, where each inner list holds the nodes with a certain label that belong to one cluster, regardless of the relationship types linking them.

Let's consider this simple example:

 

CREATE
    (:Person {name: "Andrea"})-[:LOVES]->(:Person {name: "Bob"})-[:LIKES]->(:Food {name: "pizza"}),
    (n:Person {name: "Mike"})<-[:KNOWS]-(:Person {name: "Paul"})-[:KNOWS]->(:Person {name: "Jen"})<-[:LOVES]-(n)

 

If I wanted to know which persons are somehow related, I would expect to get:

[
    ["Andrea", "Bob"],
    ["Mike", "Paul", "Jen"]
]

 

Could you please help me build such a query? Thanks!

7 REPLIES

cuneyttyler

You might need to extract connected components. Check the Weakly Connected Components (WCC) documentation: https://neo4j.com/docs/graph-data-science/current/algorithms/wcc/

masavini

Thank you very much, that is exactly what I was looking for.

So I first projected a new graph:

CALL gds.graph.project(
  'myGraph',
  'Person',
  '*'
)
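
A side note on the projection step: gds.graph.project raises an error if an in-memory graph with the same name already exists, so re-running the call above fails. A minimal sketch for clearing the old projection first, assuming a GDS 2.x installation where gds.graph.drop accepts a failIfMissing flag as its second argument:

```cypher
// Drop the in-memory graph 'myGraph' if it exists. The second argument
// (failIfMissing = false) avoids an error when there is nothing to drop.
CALL gds.graph.drop('myGraph', false)
YIELD graphName
RETURN graphName
```

Note also that this projection contains only Person nodes, so two persons connected solely through a non-Person node (for example, the same Food) would land in different components.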

 

Then I retrieved the component ID of each node in myGraph (each cluster corresponds to one component):

CALL gds.wcc.stream('myGraph')
YIELD nodeId, componentId
RETURN gds.util.asNode(nodeId).name AS name, componentId
ORDER BY componentId, name

╒════════╤═════════════╕
│"name"  │"componentId"│
╞════════╪═════════════╡
│"Andrea"│0            │
├────────┼─────────────┤
│"Bob"   │0            │
├────────┼─────────────┤
│"Jen"   │2            │
├────────┼─────────────┤
│"Mike"  │2            │
├────────┼─────────────┤
│"Paul"  │2            │
└────────┴─────────────┘

 

Could you suggest how to group names inside a list of lists?

Can you give an example of a 'list of lists'? What kind of data is this?

In the example above, something like this:

[
    ["Andrea", "Bob"],
    ["Mike", "Paul", "Jen"]
]

A list of components, where each component is represented by a list of node names.

Cypher has no explicit GROUP BY clause. Grouping is implicit: when a RETURN or WITH contains an aggregate function, the non-aggregated expressions act as the grouping keys. I will use count() to show this. I suppose something similar to this would work for you.

WITH [
    ["Andrea", "Bob", "Andrea"],
    ["Mike", "Paul", "Jen", "Bob"]
] AS myList
UNWIND myList AS subList
UNWIND subList AS name
RETURN name, count(name)

 

This returns:

"Andrea"	2
"Bob"	2
"Mike"	1
"Paul"	1
"Jen"	1

 

This groups the names by componentId. Do you want further grouping?

CALL gds.wcc.stream('myGraph')
YIELD nodeId, componentId
RETURN collect(gds.util.asNode(nodeId).name) AS names, componentId
ORDER BY componentId
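
The query above returns one row per component. To fold those rows into the single list of lists asked for in the original question, one option is a second collect() with no grouping key. This is a sketch under the assumption that 'myGraph' is projected as shown earlier in the thread; the alias clusters is made up for illustration:

```cypher
CALL gds.wcc.stream('myGraph')
YIELD nodeId, componentId
// First aggregation: one row per component, holding its member names.
WITH componentId, collect(gds.util.asNode(nodeId).name) AS names
ORDER BY componentId
// Second aggregation: no grouping key remains, so all rows fold into one list.
RETURN collect(names) AS clusters
```

The result is a single row whose clusters value is a list of name lists, one per component. The order of names inside each inner list is not guaranteed unless the rows are sorted by name before the first collect().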

 

masavini

What can I say? OUTSTANDING job, guys. I've never seen such good support before. Congratulations.