Neo4j

tony156 · ‎12-01-2022

Hi, I'm new to Neo4j. I'm trying to use KNN in GDS to calculate similarities. I understand that KNN calculates similarities between all pairs of nodes in the graph and finds the k most similar nodes for each. However, what I'm looking for is this: given a node N, I need to find the node in the database that is most similar to N. How can I achieve this? Thank you for your help.

glilienfield · ‎12-01-2022

Have you tried to use the gds KNN algorithm? If so, what went wrong?

tony156 · ‎12-01-2022

Yes, I used KNN, but it was taking a long time (2 minutes) to calculate nearest neighbors for the 40,000 nodes in my database. What I'm hoping for is to calculate the nearest neighbor for one given node only. I tried to find out whether KNN has such functionality but couldn't find any.

glilienfield · ‎12-01-2022

There looks to be a filtered version of KNN, where you can specify the source and/or target nodes. The filter can be for specific nodes or labels. With this, you should be able to specify your single node as the filtered source node, so it finds the K nearest neighbors for that node only.

It looks to be in the alpha state:

https://neo4j.com/docs/graph-data-science/current/algorithms/alpha/filtered-knn/
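
For readers who want to try this, here is a rough sketch of how the filtered call might be put together from Python. It assumes the alpha-tier procedure name `gds.alpha.knn.filtered.stream` and its `sourceNodeFilter` option as described on the page linked above; the procedure name may differ in other GDS versions. The graph name `myGraph`, the property `embedding`, and the helper `build_filtered_knn_query` are all hypothetical and only illustrate the idea.

```python
def build_filtered_knn_query(graph_name, node_property, source_node_id, top_k=1):
    """Build a Cypher query and parameter map for a filtered KNN run.

    Assumption: the graph has already been projected under `graph_name`
    and every node carries the numeric property `node_property`.
    """
    query = (
        "CALL gds.alpha.knn.filtered.stream($graph_name, {\n"
        "  nodeProperties: [$node_property],\n"
        "  topK: $top_k,\n"
        "  sourceNodeFilter: $source_node_id\n"
        "})\n"
        "YIELD node1, node2, similarity\n"
        "RETURN node1, node2, similarity\n"
        "ORDER BY similarity DESC"
    )
    params = {
        "graph_name": graph_name,
        "node_property": node_property,
        "top_k": top_k,
        # Restricting the source side to one node id means neighbors are
        # only reported for that node, not for every node in the graph.
        "source_node_id": source_node_id,
    }
    return query, params


# Hypothetical values, for illustration only.
query, params = build_filtered_knn_query("myGraph", "embedding", 42)
```

The resulting `query` and `params` could then be passed to `session.run(query, params)` with the official Neo4j Python driver. With `topK` set to 1, only the single most similar node should come back for the given source node.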

tony156 · ‎12-02-2022

Ah, I see, thank you. I guess I'd have to wait for the next GDS release. In the meantime I'll find some other ways.

glilienfield · ‎12-02-2022

As I understand it, the alpha and beta versions are accurate. They just may change before being fully promoted to non-alpha or non-beta versions.

tony156 · ‎12-02-2022

I just tried the filtered KNN algorithm and it worked! But it takes around 1 minute with only 40,000 nodes compared. Is that common?

glilienfield · ‎12-02-2022

Sorry, I am not a user of GDS, so I can’t comment.

glilienfield · ‎12-02-2022

Are there target nodes you can filter out to speed up the calculation?

tony156 · ‎12-08-2022

Not really. I had to search against all the nodes to find the closest in similarity. I have this implementation in a SQL database, so I thought moving to a graph database would speed it up, but it looks like there's not much improvement.
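
As a point of comparison, searching one node against all others can also be done as a single linear pass outside GDS. The sketch below is not the GDS algorithm. It assumes each node's features are available as a numeric vector and that cosine similarity is the metric, neither of which the thread states. The names `cosine_similarity` and `most_similar` are made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query_id, vectors):
    """Return (node_id, score) of the node most similar to `query_id`.

    `vectors` maps node id -> feature vector. This makes one pass over
    all nodes, so it does n - 1 comparisons instead of comparing all pairs.
    """
    query = vectors[query_id]
    best_id, best_score = None, -math.inf
    for node_id, vec in vectors.items():
        if node_id == query_id:
            continue  # skip the query node itself
        score = cosine_similarity(query, vec)
        if score > best_score:
            best_id, best_score = node_id, score
    return best_id, best_score


# Toy data, for illustration only.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
best_id, best_score = most_similar("a", vectors)
```

In the toy data, `b` points in nearly the same direction as `a`, so it is returned as the closest node. For a single query node, the work grows with the number of nodes rather than the number of node pairs.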

Find similarity of given node with entire graph