

Count number of Graphs in a DB / match result

As a Graph DB is a collection of Graphs (1-N), is there a simple way to return the number of Graphs? There are many ways to count nodes, properties and relationships, but I can't seem to find anything on counting the Graphs themselves, or on metadata about those Graphs. E.g. returning that there are 5 Graphs of 10 nodes and 15 relationships and 5 Graphs of 5 nodes and 7 relationships ... which would be a table if there are many Graphs. On a second level, the same counting on a match result: that is, match some condition, and count the number of Graphs that the resulting nodes are part of.

7 Replies

ameyasoft
Graph Maven
Assuming you have a node 'Node1' with a property named 'key' that is connected to multiple nodes, say 'Node2', ..., 'NodeX'.
Try this:
match (a:Node1)
match (a)-[r]-(b)
with distinct a.key as pk, "Node1" as name, collect(distinct labels(b)) as lbla, collect(distinct type(r)) as rels
return name, pk, lbla, size(lbla) as cntb, rels, size(rels) as cntr

The result displays pk, 'Node1' as name, the distinct label sets of the nodes directly connected to Node1 (lbla) with their count (cntb), and the distinct relationship types (rels) with their count (cntr). Note that cntb counts distinct label sets, not individual nodes.

Indeed, thanks, that gives insights for one level. What about 2 hops, 3 hops ... and further?

Maybe a little screenshot can help ...

This represents three 'Graphs' that are not connected to each other. The one at the bottom right has 5 nodes and 4 relationships. So I would like to return the number 3 (main objective), and 3 rows giving, for each Graph, its number of nodes and relationships (secondary objective).

Bennu
Graph Fellow

Hi @jan.doumen!

I will split this into a couple of queries. You can skip some of them, but I suggest studying them individually first before trying to collapse any steps.

1 - Create an in-memory projection of your graph.

CALL gds.graph.create('myGraph', '*', '*')

2 - Mutate it to add a componentId property by applying WCC (weakly connected components).

CALL gds.wcc.mutate('myGraph', { mutateProperty: 'componentId' })
YIELD nodePropertiesWritten, componentCount;

3 - Then add some magic.

CALL gds.graph.streamNodeProperty('myGraph', 'componentId')
YIELD nodeId, propertyValue as component
with collect(gds.util.asNode(nodeId)) as nodes, component
call apoc.path.subgraphAll(nodes, {}) YIELD relationships
return component, size(nodes) as countNodes, size(relationships) as countRels
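
If only the counts are needed, the relationship expansion can be skipped. Here is a minimal sketch, assuming the 'myGraph' projection from step 1 and a GDS version that offers the stats and stream modes of WCC. The first query returns the number of disconnected Graphs. The second groups the components by size, so each row reads as "numGraphs Graphs of countNodes nodes". The column names numGraphs and countNodes are only illustrative.

```cypher
// Number of disconnected Graphs (weakly connected components)
CALL gds.wcc.stats('myGraph')
YIELD componentCount
RETURN componentCount;

// Size distribution: how many components have a given node count
CALL gds.wcc.stream('myGraph')
YIELD nodeId, componentId
WITH componentId, count(*) AS countNodes
RETURN countNodes, count(*) AS numGraphs
ORDER BY countNodes DESC;
```

This does not return relationship counts per component; those still need a pass over the relationships.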

Lemme know if it works as expected in your graph!

Bennu

PS:

This is the one-shot (anonymous graph) version of it:

CALL gds.wcc.stream({
    nodeProjection : '*',
    relationshipProjection: '*'
})
YIELD nodeId, componentId
with collect(gds.util.asNode(nodeId)) as nodes, componentId
call apoc.path.subgraphAll(nodes, {}) YIELD relationships
return componentId, size(nodes) as countNodes, size(relationships) as countRels

Thanks! It works smoothly on a smaller Graph, but I haven't waited long enough for it to finish on a large Graph.

Now, I think this is assuming each starting node is part of a unique component, right? What if some of your starting nodes are part of the same component?

Hi @jan.doumen!

I'm using every node as a starting node, so yes, I do include nodes from the same cluster for the relationship count. Can you tell me which step is the extremely slow one?
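
For the "second level" of the original question (how many Graphs a match result touches), one possible approach is to store the component id on the nodes and count its distinct values over the matched nodes. This is a sketch under assumptions: it uses the 'myGraph' projection and the write mode of WCC, and the label :Person and property name are hypothetical placeholders for your own match condition.

```cypher
// Persist the component id on each node (property name componentId is an assumption)
CALL gds.wcc.write('myGraph', { writeProperty: 'componentId' })
YIELD componentCount
RETURN componentCount;

// Hypothetical match condition: replace :Person and the WHERE clause with your own
MATCH (n:Person)
WHERE n.name STARTS WITH 'A'
RETURN count(DISTINCT n.componentId) AS graphsTouched;
```

Matched nodes that share a component count only once, because the distinct component ids are counted rather than the nodes.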

Bennu

I'm not sure which step. The Graph DB contains 70M nodes and 88M relationships, with 23 node types and 37 relationship types ... quite heterogeneous ...
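
On a graph of this size, an alternative worth trying is to count nodes and relationships per component in plain Cypher instead of expanding a subgraph per component. The sketch below assumes every node already carries a componentId property (for example written with the write mode of WCC). It has not been timed on a 70M-node database. Since both ends of a relationship lie in the same weakly connected component, counting each relationship at its start node attributes it to the right component exactly once.

```cypher
// Assumes n.componentId was written to the database beforehand (e.g. via gds.wcc.write)
MATCH (n)
OPTIONAL MATCH (n)-[r]->()          // outgoing only, so each relationship is seen once
RETURN n.componentId AS component,
       count(DISTINCT n) AS countNodes,
       count(r) AS countRels        // rows where r is null are not counted
ORDER BY countNodes DESC;
```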

Hi @jan.doumen,

Does this query run in a reasonable time?

CALL gds.wcc.stream({
    nodeProjection : '*',
    relationshipProjection: '*'
})
YIELD nodeId, componentId
with collect(gds.util.asNode(nodeId)) as nodes, componentId
Return *

Bennu