cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Network graph clustering

Hello

I have a large-scale network consisting of 62,578 nodes. It represents the network topology of an SDN network (Software Defined Networking). 

- I want to partition the graph into a number of clusters, each cluster should be controlled by an SDN controller. 

- I tried to use the k-means algorithm, but it doesn't take into account the relationships between the nodes. This algorithm relies on the nodes and their properties. 

- Then I tried the similarity algorithm, but it calculates the similarity score between 2 nodes and creates a new relationship that holds this value between these 2 nodes. As a result, I couldn't benefit from this way in k-means algorithm. 

- Louvain and Leiden don't allow the number of clusters in advance. Is there any way to do that with these algorithms? 

Suggestions Please. Any idea may be of great help for me. Many thanks. 

1 ACCEPTED SOLUTION
7 REPLIES 7

Hi , it would be good if you share the database schema , but you could try :

1.- try to create the features property for you node including the sdnID .. for example If you have :
(Node)-[:CONTROLLED_BY]->(SDN)
include the sdnID for each Node and other properties in one called "features" then provide this to the Kmean algorithm

2.- the latest version of bloom have some GDS Algorithms , try to invesigate with it.

 

 

 

samasamaan_0-1667136128750.pngsamasamaan_1-1667136178174.png

Thanks for your reply.

The graph is homogeneous. No SDN nodes are their. The 1st picture is a part of the whole graph. 

Hello @samasamaan 🙂

Are there any disconnected subgraphs? If yes, then you should have a look at Weakly Connected Components (WCC) algorithm.

Regards,
Cobra

Hello @Cobra 

No there are no disconnected subgraphs. No islands!  It is a network topology that consists of nodes (switches and hosts). 

Any suggestion please.

 

Can the double relations be replaced by a single relation or do they have an importance?

Here they don't have importance. I used the double relations for path-finding algorithms. 

What about using FastRP for node embeddings and then using these embeddings in the K-means algorithm? 

If you like to take a look at this  Suggestion in stackoverflow.

Yeah it's a good solution 🙂