Neo4j

samasamaan · ‎10-30-2022

Hello

I have a large-scale network consisting of 62,578 nodes. It represents the network topology of an SDN network (Software Defined Networking).

- I want to partition the graph into a number of clusters, each cluster should be controlled by an SDN controller.

- I tried to use the k-means algorithm, but it doesn't take into account the relationships between the nodes. This algorithm relies on the nodes and their properties.

- Then I tried the similarity algorithm, but it calculates the similarity score between 2 nodes and creates a new relationship that holds this value between these 2 nodes. As a result, I couldn't benefit from this way in k-means algorithm.

- Louvain and Leiden don't allow the number of clusters in advance. Is there any way to do that with these algorithms?

Suggestions Please. Any idea may be of great help for me. Many thanks.

@alicia_frame1

@koji

@michael_hunger

Cobra · ‎10-31-2022

Yeah it's a good solution 🙂

View solution in original post

ponceortiz · ‎10-30-2022

Hi , it would be good if you share the database schema , but you could try :

1.- try to create the features property for you node including the sdnID .. for example If you have :
(Node)-[:CONTROLLED_BY]->(SDN)
include the sdnID for each Node and other properties in one called "features" then provide this to the Kmean algorithm

2.- the latest version of bloom have some GDS Algorithms , try to invesigate with it.

samasamaan · ‎10-30-2022

Thanks for your reply.

The graph is homogeneous. No SDN nodes are their. The 1st picture is a part of the whole graph.

Cobra · ‎10-30-2022

Hello @samasamaan 🙂

Are there any disconnected subgraphs? If yes, then you should have a look at Weakly Connected Components (WCC) algorithm.

Regards,
Cobra

samasamaan · ‎10-30-2022

Hello @Cobra

No there are no disconnected subgraphs. No islands! It is a network topology that consists of nodes (switches and hosts).

Any suggestion please.

Cobra · ‎10-31-2022

Can the double relations be replaced by a single relation or do they have an importance?

samasamaan · ‎10-31-2022

Here they don't have importance. I used the double relations for path-finding algorithms.

What about using FastRP for node embeddings and then using these embeddings in the K-means algorithm?

If you like to take a look at this Suggestion in stackoverflow.

Cobra · ‎10-31-2022

Yeah it's a good solution 🙂

Neo4j

Network graph clustering