Listing the community size of different community detection algorithms (already calculated)

I have a dataset of 941 persons with the label PERSOON.

For these persons I have already computed Label Propagation, Weakly Connected Components (WCC) and Louvain communities.
For WCC I ran separate computations on a native projection and on a Cypher projection.
As a result, each PERSOON node has 4 new properties: lpa, wcc, wcc_cypher and louvain.

The PERSOON node looks like:

{
  "identity": 606,
  "labels": [
    "PERSOON"
  ],
  "properties": {
    "louvain": 596,
    "neo4jImportId": "18349390",
    "wcc": 47,
    "lpa": 596,
    "naam": "Jansen, J",
    "wcc_cypher": 549
  }
}

For every PERSOON node I want to show, on a single row of a table, the size of the community it belongs to under each of these properties.

Question: what is the most efficient way to query these statistics?

I tried the following queries. They work, but PROFILE shows a large number of hits.

PROFILE MATCH (p:PERSOON)
WITH p
CALL {
    WITH p
    MATCH (l:PERSOON {louvain: p.louvain})
    RETURN count(*) AS louvain
}
CALL {
    WITH p
    MATCH (wcc:PERSOON {wcc: p.wcc})
    RETURN count(*) AS wcc
}
CALL {
    WITH p
    MATCH (wcc_cypher:PERSOON {wcc_cypher: p.wcc_cypher})
    RETURN count(*) AS wcc_cypher
}
CALL {
    WITH p
    MATCH (lpa:PERSOON {lpa: p.lpa})
    RETURN count(*) AS lpa
}
RETURN p.naam, lpa,  wcc, wcc_cypher, louvain

When using 2 MATCH statements and querying 3 properties:

PROFILE MATCH (p:PERSOON)
MATCH (l:PERSOON {louvain: p.louvain}),
(wcc:PERSOON {wcc: p.wcc}),
(wcc_cypher:PERSOON {wcc_cypher: p.wcc_cypher})
RETURN p.naam, count(wcc), count(wcc_cypher), count(l) AS louvain

When using 2 MATCH statements and querying 4 properties:

PROFILE MATCH (p:PERSOON)
MATCH (l:PERSOON {louvain: p.louvain}),
(wcc:PERSOON {wcc: p.wcc}),
(wcc_cypher:PERSOON {wcc_cypher: p.wcc_cypher}),
(lpa:PERSOON {lpa: p.lpa})
RETURN p.naam, count(lpa), count(wcc), count(wcc_cypher), count(l) AS louvain

This query didn't finish, so there is no profile.

1 REPLY

I would probably compute the community sizes beforehand and then just look them up per person.

// Build a map from lpa community id (as a string key) to community size
MATCH (p:PERSOON)
WITH p.lpa AS lpa, count(*) AS sizeLpa
WITH apoc.map.fromPairs(collect([toString(lpa), sizeLpa])) AS map_lpa
// Same for wcc, carrying map_lpa along
MATCH (p:PERSOON)
WITH map_lpa, p.wcc AS wcc, count(*) AS sizeWcc
WITH map_lpa, apoc.map.fromPairs(collect([toString(wcc), sizeWcc])) AS map_wcc
...
// Look up the precomputed sizes for each person
MATCH (p:PERSOON)
RETURN p.naam, map_lpa[toString(p.lpa)] AS lpaSize, map_wcc[toString(p.wcc)] AS wccSize

I added an APOC issue here to make this easier in the future.
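
If APOC is not available, the same "compute the sizes once, then attach them to each person" idea can probably be written in plain Cypher. The following is only a sketch under that assumption: it groups the nodes by one property at a time, collects the group members, and unwinds them again so that each row carries its group size. The aliases (members, rows, lpaSize, wccSize, wccCypherSize, louvainSize) are made up for illustration, and it has not been profiled against the dataset from the question.

```cypher
// Group by lpa, then fan the group size back out to every member.
MATCH (p:PERSOON)
WITH p.lpa AS lpa, collect(p) AS members
UNWIND members AS p
WITH p, size(members) AS lpaSize

// Regroup by wcc, carrying the sizes computed so far in a map per node.
WITH p.wcc AS wcc, collect({p: p, lpaSize: lpaSize}) AS rows
UNWIND rows AS row
WITH row.p AS p, row.lpaSize AS lpaSize, size(rows) AS wccSize

// Regroup by wcc_cypher.
WITH p.wcc_cypher AS wccCypher,
     collect({p: p, lpaSize: lpaSize, wccSize: wccSize}) AS rows
UNWIND rows AS row
WITH row.p AS p, row.lpaSize AS lpaSize, row.wccSize AS wccSize,
     size(rows) AS wccCypherSize

// Regroup by louvain and return one row per person.
WITH p.louvain AS louvain,
     collect({p: p, lpaSize: lpaSize, wccSize: wccSize,
              wccCypherSize: wccCypherSize}) AS rows
UNWIND rows AS row
RETURN row.p.naam AS naam,
       row.lpaSize AS lpaSize,
       row.wccSize AS wccSize,
       row.wccCypherSize AS wccCypherSize,
       size(rows) AS louvainSize
```

Each stage touches every node once, so the work should grow roughly linearly with the number of persons and properties instead of multiplying the community sizes together, as the comma-separated MATCH patterns in the question do.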