Find the largest clusters of a database

Hello, I'm new to Neo4j and Cypher.

I have a database of transactions between people, which I load from a CSV file:



// Read each CSV row as a map keyed by the header names
LOAD CSV WITH HEADERS FROM 'file:///low.csv' AS line FIELDTERMINATOR ','
// Reuse or create the sender and the beneficiary, keyed by full name
MERGE (O:Ordenante {nombre: line.NOMBRE_COMPLETO_ORDENANTE})
MERGE (B:Beneficiario {nombre: line.NOMBRE_COMPLETO_BENEFICIARIO})
// One relationship per transaction row
CREATE (O)-[R:Envió]->(B)
SET O.estado_de_ordenante = line.ESTADO_ORDENANTE
SET B.estado_de_beneficiario = line.ESTADO_BENEFICIARIO
SET B.estado_de_apertura = line.SUC_APERTURA
SET R.monto_enviado = line.MONTO_EN_PESOS

For each row I merge two nodes and create one relationship between them. The graph looks like this:

Captura.PNG

I want to return only the clusters that contain 5 or more nodes.

Captura2.PNG

I'm using this query, but I get all the nodes back. What am I doing wrong?

 

match (O)-[R]->(B)
with count(O) as mnt   
where mnt > 2
match (O)-[R]->(B)
return R, O, B
1 REPLY

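When you use an aggregate function without a grouping key, it aggregates over all the rows. In your case, 'count(O)' counts every path the first match returns. Because the 'with' clause carries forward only 'mnt', the variables O, R, and B are no longer bound, so the second match starts from scratch and returns every relationship in the graph. If you want to know how many outgoing relationships each O node has (one per related B, unless the same pair appears in several transactions), add O as the grouping key, something like the following: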

match (O)-[R]->(B)
with O, count(O) as mnt   
where mnt > 2
match (O)-[R]->(B)
return R, O, B
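
To see the difference between the two aggregations, here is a minimal sketch that can be run against the same data. It uses only the labels and properties from the question; the aliases `total`, `ordenante`, and `mnt` are illustrative, and counting `R` instead of `O` is a readability choice that yields the same number per row.

```cypher
// No grouping key: the whole result collapses into a single row
MATCH (O)-[R]->(B)
RETURN count(O) AS total;

// O.nombre is the grouping key: one row per sender with its own count
MATCH (O:Ordenante)-[R]->(B)
RETURN O.nombre AS ordenante, count(R) AS mnt
ORDER BY mnt DESC;
```

Any non-aggregated expression in a WITH or RETURN clause acts as a grouping key, which is why adding O changes the result.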

To avoid the second match after filtering, you can collect R and B for each O and unwind them afterwards, something like the following:

match (O)-[R]->(B)
with O, collect({R:R, B:B}) as items, count(O) as mnt
where mnt > 2
unwind items as item
return O, item.R, item.B
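
Note that the queries above filter senders by their number of outgoing relationships, which is not the same as filtering clusters by size. If "cluster" means a connected component, one possible approach is the weakly connected components (WCC) algorithm from the Graph Data Science library. The following is only a sketch: it assumes the GDS plugin (2.x procedure names) is installed, and the projection name 'transacciones' and the aliases `miembros` and `tamano` are hypothetical. Run the two statements one after the other.

```cypher
// Assumption: the Graph Data Science (GDS) plugin, version 2.x, is installed.
// 'transacciones' is a hypothetical name for the in-memory projection.
CALL gds.graph.project(
  'transacciones',
  ['Ordenante', 'Beneficiario'],
  'Envió'
);

// Group nodes by component and keep only components with 5 or more members
CALL gds.wcc.stream('transacciones')
YIELD nodeId, componentId
WITH componentId, collect(gds.util.asNode(nodeId)) AS miembros
WHERE size(miembros) >= 5
RETURN componentId, size(miembros) AS tamano, miembros
ORDER BY tamano DESC;
```

The 5-node threshold comes from the question; the earlier queries use 'mnt > 2', so adjust whichever value matches what you actually need.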

 
