Graph Algorithms Online Training- Similarity Algorithm Clarification

I am currently going through the Category Hierarchy - Overall Similarity Algorithm exercise. While I was able to complete the query for the similarity algorithm, I could not understand the intuition behind removing the transitive relationships.

As I understood the similarity algorithm, the whole idea of calculating the similarity coefficient (with a cutoff of 0.75) was to measure the similarity between categories based on the business nodes within those categories. Having done that, why do we further need to check whether relationships are adjacent in the hierarchy or not?

Appreciate your help.

Regards,

Ammar

1 REPLY

The goal is to build a hierarchy around the categories that we could then use to help users search for things. So if we have, say:

Category 1 <- Category 2 <- Category 3
Category 1 <- Category 3

We can remove the direct link between Category 1 and Category 3 since there is already a relationship through Category 2. We could leave the direct link there if we wanted; removing it is mostly to make the hierarchy cleaner.
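
To make the idea concrete, here is a minimal Python sketch of that cleanup step. It is not the query from the training exercise: the function name `remove_transitive_links` and the `(narrower, broader)` edge representation are assumptions made for illustration. A direct link is dropped whenever the broader category can still be reached through at least one intermediate category.

```python
from collections import defaultdict


def remove_transitive_links(edges):
    """Drop shortcut links from a category hierarchy.

    `edges` is an iterable of (narrower, broader) pairs, so
    "Category 1 <- Category 2" is written ("Category 2", "Category 1").
    This is a hypothetical helper, not part of any Neo4j library.
    """
    edges = set(edges)
    broader_of = defaultdict(set)
    for narrower, broader in edges:
        broader_of[narrower].add(broader)

    def reachable_via_intermediate(start, target):
        # Depth-first search that ignores the direct start -> target link,
        # so only paths of length two or more count.
        stack = [b for b in broader_of[start] if b != target]
        seen = set(stack) | {start}
        while stack:
            node = stack.pop()
            for nxt in broader_of[node]:
                if nxt == target:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    # Keep a link only if there is no longer path that already implies it.
    return {(n, b) for n, b in edges if not reachable_via_intermediate(n, b)}


links = [
    ("Category 2", "Category 1"),
    ("Category 3", "Category 2"),
    ("Category 3", "Category 1"),  # shortcut implied by the two links above
]
print(sorted(remove_transitive_links(links)))
# prints [('Category 2', 'Category 1'), ('Category 3', 'Category 2')]
```

The similarity scores are not touched here. The step only prunes links that the remaining hierarchy already implies, which is why it is optional.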

This technique was suggested by my colleague @jesus.barrasa. You can read more about it in a blog post that he wrote a few years ago - https://jbarrasa.com/2017/03/31/quickgraph5-learning-a-taxonomy-from-your-tagged-data/
