08-14-2021 02:24 AM
I have a dataset of 941 persons with the label PERSOON.
For these persons I have already computed Label Propagation, Weakly Connected Components (WCC) and Louvain.
For WCC I ran the computation twice: once on a native projection and once on a Cypher projection.
As a result, each person has 4 new properties: lpa, wcc, wcc_cypher and louvain.
The PERSOON node looks like:
{
"identity": 606,
"labels": [
"PERSOON"
],
"properties": {
"louvain": 596,
"neo4jImportId": "18349390",
"wcc": 47,
"lpa": 596,
"naam": "Jansen, J",
"wcc_cypher": 549
}
}
For every person node I want to show, as one table row, the size of the community that person belongs to under each of these four properties.
Question: what is the most efficient way to query these statistics?
I tried the following queries. The ones that finish return results, but the PROFILE shows a very large number of db hits.
PROFILE MATCH (p:PERSOON)
WITH p
CALL {
WITH p
MATCH (l:PERSOON {louvain: p.louvain})
RETURN count(*) AS louvain
}
CALL {
WITH p
MATCH (wcc:PERSOON {wcc: p.wcc})
RETURN count(*) AS wcc
}
CALL {
WITH p
MATCH (wcc_cypher:PERSOON {wcc_cypher: p.wcc_cypher})
RETURN count(*) AS wcc_cypher
}
CALL {
WITH p
MATCH (lpa:PERSOON {lpa: p.lpa})
RETURN count(*) AS lpa
}
RETURN p.naam, lpa, wcc, wcc_cypher, louvain
When using 2 MATCH statements and querying 3 properties:
PROFILE MATCH (p:PERSOON)
MATCH (l:PERSOON {louvain: p.louvain}),
(wcc:PERSOON {wcc: p.wcc}),
(wcc_cypher:PERSOON {wcc_cypher: p.wcc_cypher})
RETURN p.naam, count(wcc), count(wcc_cypher), count(l) AS louvain
When using 2 MATCH statements and querying 4 properties:
PROFILE MATCH (p:PERSOON)
MATCH (l:PERSOON {louvain: p.louvain}),
(wcc:PERSOON {wcc: p.wcc}),
(wcc_cypher:PERSOON {wcc_cypher: p.wcc_cypher}),
(lpa:PERSOON {lpa: p.lpa})
RETURN p.naam, count(lpa), count(wcc), count(wcc_cypher), count(l) AS louvain
This query didn't finish, so there is no profile for it.
08-20-2021 02:27 AM
I would probably compute the community sizes once up front and then just look them up per person.
// Count the members of each lpa community and collect the counts into a map (map keys must be strings)
MATCH (p:PERSOON)
WITH p.lpa AS lpa, count(*) AS sizeLpa
WITH apoc.map.fromPairs(collect([toString(lpa), sizeLpa])) AS map_lpa
// Same for wcc, carrying map_lpa along
MATCH (p:PERSOON)
WITH map_lpa, p.wcc AS wcc, count(*) AS sizeWcc
WITH map_lpa, apoc.map.fromPairs(collect([toString(wcc), sizeWcc])) AS map_wcc
...
// Look up the precomputed sizes for each person
MATCH (p:PERSOON)
RETURN p.naam, map_lpa[toString(p.lpa)] AS lpaSize, map_wcc[toString(p.wcc)] AS wccSize
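If APOC is not available, the same "count once per community, then attach the count to each member" idea can be sketched in plain Cypher with collect and UNWIND. This is only an illustration for two of the properties, not something tested against the dataset in the question; the aliases members, rows, lpaSize and wccSize are made up for the example.

```cypher
// Group persons by lpa; the length of each group is the community size
MATCH (p:PERSOON)
WITH p.lpa AS lpa, collect(p) AS members
UNWIND members AS p
WITH p, size(members) AS lpaSize
// Group again by wcc, carrying each person's lpaSize along in a map
WITH p.wcc AS wcc, collect({person: p, lpaSize: lpaSize}) AS rows
UNWIND rows AS row
RETURN row.person.naam AS naam, row.lpaSize AS lpaSize, size(rows) AS wccSize
```

Each grouping step passes over the persons once, so the work should grow with the number of persons rather than with persons times community size. Further properties can be added by repeating the collect and UNWIND step and adding the new size to the carried map.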
I added an APOC issue here to make this easier in the future.
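If the per-person CALL subqueries from the question are kept instead, indexes on the four community properties may let the planner replace the repeated label scans with index seeks. This is a sketch assuming Neo4j 4.1 or later; the index names are hypothetical, and PROFILE would need to be rerun to confirm that the plan actually changes.

```cypher
// One index per community property; index names are made up for this example
CREATE INDEX persoon_lpa IF NOT EXISTS FOR (p:PERSOON) ON (p.lpa);
CREATE INDEX persoon_wcc IF NOT EXISTS FOR (p:PERSOON) ON (p.wcc);
CREATE INDEX persoon_wcc_cypher IF NOT EXISTS FOR (p:PERSOON) ON (p.wcc_cypher);
CREATE INDEX persoon_louvain IF NOT EXISTS FOR (p:PERSOON) ON (p.louvain);
```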