03-31-2020 04:59 PM
I am currently going through the Category Hierarchy - Overall Similarity Algorithm exercise. While I was able to complete the query for the similarity algorithm, I could not understand the intuition behind removing the transitive relationships.
As I understood the similarity algorithm, the whole idea of calculating the similarity coefficient (with a cutoff of 0.75) was to measure the similarity between categories based on the business nodes within those categories. Having done that, why do we also need to check whether the relationships are adjacent in the hierarchy?
Appreciate your help.
Regards,
Ammar
08-06-2020 07:57 AM
The goal is to build a hierarchy of the categories that we could then use to help users search for things. So if we have, say:
Category 1 <- Category 2 <- Category 3
Category 1 <- Category 3
We can remove the direct link between Category 1 and Category 3, since the two are already connected through Category 2. We could leave the direct link in place if we wanted; removing it is mostly to make the hierarchy cleaner.
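To make the idea concrete outside of Cypher, here is a minimal Python sketch of the same cleanup. It is only an illustration, not the query used in the exercise: the function name `remove_transitive_edges` and the `(narrower, broader)` edge tuples are assumptions made for this example, and it assumes the hierarchy has no cycles.

```python
from collections import defaultdict


def remove_transitive_edges(edges):
    """Drop every direct link whose endpoints are also joined by a longer path.

    `edges` is an iterable of (narrower, broader) pairs, so the pair
    ("Category 2", "Category 1") stands for Category 1 <- Category 2.
    Assumes the graph is acyclic.
    """
    edges = set(edges)
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)

    def reachable_indirectly(src, dst):
        # Depth-first search that starts from src's other neighbours,
        # so the direct src -> dst link itself is never used.
        stack = [n for n in successors[src] if n != dst]
        seen = set(stack)
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            for nxt in successors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return {(s, d) for (s, d) in edges if not reachable_indirectly(s, d)}


example = [
    ("Category 2", "Category 1"),
    ("Category 3", "Category 2"),
    ("Category 3", "Category 1"),  # redundant: already implied via Category 2
]
print(sorted(remove_transitive_edges(example)))
# [('Category 2', 'Category 1'), ('Category 3', 'Category 2')]
```

The direct Category 3 link to Category 1 is dropped because the search finds the two-step path through Category 2, while the other two links have no alternative path and are kept.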
This technique was suggested by my colleague @jesus.barrasa. You can read more about it in a blog post that he wrote a few years ago - https://jbarrasa.com/2017/03/31/quickgraph5-learning-a-taxonomy-from-your-tagged-data/