Cypher query to measure connectedness across labels

In order to set some team targets I want a simple metric (comprehensible to stakeholders unfamiliar with graph tech) that measures the connectedness of my graph.

My first attempt is the following query:

MATCH (a)
WITH count(a) AS allrecords
MATCH (b)
WHERE ((b)-[]->()) 
RETURN count(b) * 100.0 / allrecords

This measures the proportion of nodes that have at least one outgoing relationship. Not a bad start, but it has two failings:

  • if a few nodes account for most of the outgoing relationships, each having a lot of them, this underestimates the connectedness
  • I won't go into detail why, but for us relationships between differently labelled nodes are the most valuable, so I'd like to ignore relationships between nodes with the same label

I have come up with the following:

MATCH (b)-[]-(c) 
WHERE labels(b)[0] <> labels(c)[0]  
WITH DISTINCT b
WITH count(b) as connected
MATCH (a)
RETURN connected * 100.0 / count(a)

This aims to count all records connected, in either direction, to at least one record with a different label (all nodes in my graph have exactly one label).

Does this look right? I'm getting 62% as the value, whereas my first query gave 35%. Intuitively, 35% feels closer to what I'd expect.
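One way to sanity-check the gap between the two numbers is to replay both definitions on a tiny in-memory graph. The sketch below is plain Python rather than Cypher, and the node names, labels, and edges are made up for illustration; they are not taken from the real graph. It assumes, as above, that every node has exactly one label.

```python
# Hypothetical toy graph: node -> single label, plus directed edges (source, target).
labels = {
    "alice": "Person",
    "bob": "Person",
    "carol": "Person",
    "acme": "Company",
    "globex": "Company",
}
edges = [
    ("alice", "acme"),  # Person -> Company (cross-label)
    ("bob", "acme"),    # Person -> Company (cross-label)
    ("alice", "bob"),   # Person -> Person (same label)
]


def pct_with_outgoing(labels, edges):
    """Mirrors the first query: share of nodes with at least one outgoing relationship."""
    sources = {src for src, _ in edges}
    return len(sources) * 100.0 / len(labels)


def pct_cross_label(labels, edges):
    """Mirrors the second query: share of nodes connected, in either
    direction, to at least one node carrying a different label."""
    connected = set()
    for src, dst in edges:
        if labels[src] != labels[dst]:
            connected.update((src, dst))  # both endpoints count, direction ignored
    return len(connected) * 100.0 / len(labels)


first = pct_with_outgoing(labels, edges)
second = pct_cross_label(labels, edges)
print(first, second)  # 40.0 60.0
```

In this toy graph the second figure is higher because `acme` only has incoming relationships: the directed check in the first query skips it, while the undirected pattern `(b)-[]-(c)` counts both endpoints. So the two queries differ in two ways at once (direction and the label filter), and the second can come out larger even though it ignores same-label relationships.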
