Neo4j

andy_hegedus · 07-06-2021

Hi,
The result of a simple query shows three groups of green nodes. Group A is connected only to the left blue node, Group B is connected only to the right blue node, and Group C is connected to both.

MATCH p = (c:cpc)-[:Classified_as]-(a:patent {num: 9698021})-[:Classified_as]-(n:cpc)-[:Classified_as]-(b:patent {num: 7950348})-[:Classified_as]-(x:cpc)
RETURN p

What is the most straightforward way of defining the three groups? (Note: ignore the green-green connection for this question.)
The c variable should contain all nodes connected to the left blue node, including those in n.
The n variable should contain the shared nodes.
The x variable should contain all nodes connected to the right blue node, including those in n.

What I am interested in (think Venn diagram):
n
(c-n)
(x-n)
Andy

david_allen · 07-07-2021

To define these groups node-for-node exactly how you like may be very difficult to do without simply hand-labeling them.

But the general problem space you're in is made for community detection algorithms. I recommend taking a look at this page and playing with these approaches, because taking a big graph and partitioning it into multiple different groups or "communities" is just fundamentally what these algorithms do.

Will they do it exactly as you want? Questionable -- but if you read the algorithm descriptions, you can make the right algorithm choice and probably get very close to what you want on big datasets.
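
For the specific two-patent case in the question, the three Venn regions (n, c-n, x-n) can probably also be computed directly with list operations, without a community detection algorithm. The query below is a minimal sketch, not a tested solution: it reuses the labels, relationship type, and num values from the question, while the names leftCpcs, rightCpcs, shared, leftOnly, and rightOnly are made up for illustration. It assumes both patents have at least one cpc neighbour; if either has none, the query returns no rows.

```cypher
// Collect every cpc node attached to the left patent (c, including n).
MATCH (a:patent {num: 9698021})-[:Classified_as]-(c:cpc)
WITH collect(DISTINCT c) AS leftCpcs

// Collect every cpc node attached to the right patent (x, including n).
MATCH (b:patent {num: 7950348})-[:Classified_as]-(x:cpc)
WITH leftCpcs, collect(DISTINCT x) AS rightCpcs

// Split the two lists into the three Venn regions.
RETURN
  [node IN leftCpcs  WHERE node IN rightCpcs]       AS shared,    // n
  [node IN leftCpcs  WHERE NOT (node IN rightCpcs)] AS leftOnly,  // c - n
  [node IN rightCpcs WHERE NOT (node IN leftCpcs)]  AS rightOnly  // x - n
```

Matching each patent separately, rather than in one long path pattern, keeps the two neighbour sets independent, so the overlap can be worked out afterwards with plain list comprehensions.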


Separating Returned Nodes