02-04-2021 04:18 PM
Hi everyone.
I have this subgraph and I would like to count the number of isolated clusters.
I have tried using the weakly connected components (WCC) algorithm from the GDS library, but it returns a high number of components. I don't know if I am making a mistake in the Cypher projection. Basically, I want to return a subgraph of transactions between entities.
CALL gds.graph.create.cypher(
'my-cypher-graph',
'MATCH path = (t1:Payment)-[:HOP*3..10]->(t2:Payment)
WHERE (t1)<-[:MAKES]-(:User)<-[:AT]-(t2)
AND datetime(t1.createdAt)<datetime(t2.createdAt)
AND (((toFloat(t1.amount)-toFloat(t2.amount))/toFloat(t1.amount))<0.25)
AND datetime(t1.createdAt) > datetime("2020-01-01")
AND datetime(t2.createdAt) > datetime("2020-01-01")
UNWIND nodes(path) as t MATCH (e)-[:MAKES]->(t)
WHERE e:Client OR e:Commerce WITH t LIMIT 500
RETURN distinct id(t) as id',
'MATCH (t:Payment)-[:HOP]->(t2:Payment)
WHERE (t)<-[:MAKES]-(:User)<-[:AT]-(t2)
AND datetime(t.createdAt)<datetime(t2.createdAt)
AND (((toFloat(t.amount)-toFloat(t2.amount))/toFloat(t.amount))<0.25)
AND datetime(t.createdAt) > datetime("2020-01-01")
AND datetime(t2.createdAt) > datetime("2020-01-01")
RETURN distinct id(t) AS source, id(t2) AS target', {validateRelationships:false})
I thought that using the GDS library could be the right approach, but I am open to a different solution.
Thanks!
02-09-2021 07:04 AM
Hi,
I think something along these lines will give you the group count for a subgraph (the output of a query). I usually mark isolated subgraphs across the entire graph (more on that later), but here is a quick sketch, tested on one of my own graphs, that should give you a simple count of the groups in a query/subgraph. The componentCount value is what you are looking for, I think.
call gds.wcc.stats(
{
nodeQuery: 'match (n:EnzymeClass) return id(n) as id',
relationshipQuery:'MATCH (a:EnzymeClass)-->(b:EnzymeClass) RETURN id(a) as source, id(b) as target'
}
)
YIELD componentCount,
createMillis,
computeMillis,
postProcessingMillis,
componentDistribution,
configuration
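For the named projection from the question, a minimal sketch of the same check might look like the following. It assumes GDS 1.x, that the graph 'my-cypher-graph' has already been created in the graph catalog, and that the default WCC settings (an empty configuration map) are acceptable.

```cypher
// Assumes 'my-cypher-graph' already exists in the GDS graph catalog.
CALL gds.wcc.stats('my-cypher-graph', {})
YIELD componentCount, componentDistribution
RETURN componentCount, componentDistribution
```

If componentCount comes out close to the number of projected nodes, most nodes have no relationships inside the projection, which would point at the projection queries rather than at WCC itself.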
Background. I always want to keep a careful eye on this aspect in the complete graph, so I mark nodes with a group id, like this.
call gds.wcc.write(
{
nodeQuery: 'match (n) return id(n) as id',
relationshipQuery:'MATCH (a)-->(b) RETURN id(a) as source, id(b) as target',
writeProperty:'group',
consecutiveIds:true
}
)
YIELD nodePropertiesWritten
return nodePropertiesWritten;
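Then, to determine the number of isolated clusters, count the distinct group ids (with consecutiveIds the ids should start at 0, so max(n.group) on its own would come out one short):
match (n)
return count(distinct n.group)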
The follow-up question I'm curious about is what the clusters look like: are most of the nodes in one group? So I may also run a few follow-up queries like
match (n)
return n.group, count(n) as group_size
order by group_size desc
limit 50
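A related sketch, in case it helps, summarizes how many groups exist at each size. It assumes the 'group' property written by the gds.wcc.write call above; the aliases grp, group_size, and number_of_groups are only illustrative names.

```cypher
// Assumes nodes carry the 'group' property written by gds.wcc.write above.
MATCH (n)
WHERE n.group IS NOT NULL
// First aggregate: number of nodes in each group.
WITH n.group AS grp, count(*) AS group_size
// Second aggregate: number of groups that share each size.
RETURN group_size, count(*) AS number_of_groups
ORDER BY group_size
```

A large number_of_groups at group_size 1 would mean many single-node components, which is one way a component count can end up unexpectedly high.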