Neo4j

andreperez · 07-02-2021

There is a graph with multiple node labels (ip, device, url, etc) and a "main" node (main_id) that always has a connection to the previously mentioned nodes (all relationships are directed from the main_id to the other nodes).

(:ip)<-[:HAS_IP]-(:main_id)-[:USING_DEVICE]->(:device), etc

Some main_id nodes will also have a specific relationship to a specially labelled node (special_id).

(:main_id)-[:HAS_SPECIAL_ID]->(:special_id)

I'm trying to discover other main_id nodes that are similar to the main_id nodes connected to a special_id.

There's no need for ALL connections to match, so I thought about assigning a different weight to each relationship (ip - 0.6, device - 0.2, etc) and triggering the merge of the relationship between the main_id and the special_id if the total is higher than X (haven't decided this yet, maybe > 0.8).
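
To make the idea concrete, here is a minimal Python sketch of that scoring rule outside the database. It is only one possible reading of the plan: a relationship type contributes its full weight when the two main_id nodes share at least one target of that type. The `VISITED_URL` relationship name, its 0.2 weight, and all sample values are hypothetical; only the 0.6 / 0.2 weights and the "> 0.8" threshold come from the figures above.

```python
# Weights per relationship type. HAS_IP and USING_DEVICE use the figures from
# the post; VISITED_URL and its weight are hypothetical placeholders.
WEIGHTS = {"HAS_IP": 0.6, "USING_DEVICE": 0.2, "VISITED_URL": 0.2}
THRESHOLD = 0.8  # tentative cutoff mentioned in the post


def weighted_match_score(candidate, reference, weights):
    """Sum the weights of relationship types where both nodes share a target."""
    score = 0.0
    for rel_type, weight in weights.items():
        shared = candidate.get(rel_type, set()) & reference.get(rel_type, set())
        if shared:
            score += weight
    return score


def should_link(candidate, reference, weights=WEIGHTS, threshold=THRESHOLD):
    """True when the weighted score is strictly above the threshold."""
    return weighted_match_score(candidate, reference, weights) > threshold


# A main_id that already has a special_id (sample data, hypothetical).
reference = {
    "HAS_IP": {"10.0.0.1"},
    "USING_DEVICE": {"dev-a"},
    "VISITED_URL": {"example.com"},
}
# Shares every kind of target with the reference.
candidate_a = {
    "HAS_IP": {"10.0.0.1"},
    "USING_DEVICE": {"dev-a"},
    "VISITED_URL": {"example.com"},
}
# Shares only the device.
candidate_b = {
    "HAS_IP": {"10.0.0.9"},
    "USING_DEVICE": {"dev-a"},
    "VISITED_URL": {"other.org"},
}

link_a = should_link(candidate_a, reference)
link_b = should_link(candidate_b, reference)
```

With these sample values, `candidate_a` clears the threshold and `candidate_b` does not. Note that a strict "> 0.8" rule means sharing only an ip and a device (0.6 + 0.2) would not qualify, which may or may not be what is wanted.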

I thought about using similarity, but I'm not sure if it can use this weight property or even compare multiple relationships.

Is there a better algo to compare these relationships?

alicia_frame1 · 07-02-2021

NodeSimilarity seems like the right place to start: it supports weights and can run over multiple relationship and node types. It calculates the similarity between source nodes based on the overlap in their target nodes. See Node Similarity - Neo4j Graph Data Science.
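
As a rough illustration of what "overlap in their target nodes" means, the Python sketch below computes a plain Jaccard score and a weighted variant (sum of per-target minimum weights over sum of per-target maximum weights) for two source nodes. This is a standalone approximation for building intuition, not the GDS implementation; in GDS the weights would come from a relationship property selected with `relationshipWeightProperty`. The target names and weights here are made up.

```python
def jaccard(targets_a, targets_b):
    """Unweighted overlap: |A intersect B| / |A union B|."""
    union = set(targets_a) | set(targets_b)
    if not union:
        return 0.0
    return len(set(targets_a) & set(targets_b)) / len(union)


def weighted_jaccard(weights_a, weights_b):
    """Weighted overlap: sum of min weights / sum of max weights per target.

    A target missing from one side counts as weight 0.0 on that side.
    """
    keys = weights_a.keys() | weights_b.keys()
    numerator = sum(min(weights_a.get(k, 0.0), weights_b.get(k, 0.0)) for k in keys)
    denominator = sum(max(weights_a.get(k, 0.0), weights_b.get(k, 0.0)) for k in keys)
    return numerator / denominator if denominator else 0.0


# Two main_id nodes mapped to their targets and relationship weights
# (hypothetical sample data: same ip, different devices).
main_1 = {"ip:10.0.0.1": 0.6, "device:dev-a": 0.2}
main_2 = {"ip:10.0.0.1": 0.6, "device:dev-b": 0.2}

plain = jaccard(main_1, main_2)              # one shared target out of three
weighted = weighted_jaccard(main_1, main_2)  # the shared ip dominates the score
```

In this sample the two nodes share one of three targets, so the plain score is low, while the weighted score is higher because the shared target is the heavily weighted ip.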

andreperez · 07-02-2021

I was using it wrong lol.

CALL gds.nodeSimilarity.stream('graph-undirected', {nodeLabels: ['main_id']})
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).domain AS node1, gds.util.asNode(node2).domain AS node2, similarity
ORDER BY similarity DESC

This gave me something really close to what I was looking for. Now I'll apply weights to the relationships and see how they change my results.
Thank you

Using similarity to find nodes connected to more than one relationship? Or other algo?