Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
06-05-2019 03:51 PM
In Graph_Algorithms_Neo4j book, I am not finding the Similarity Algorithms like Jaccard topic in the Index page. Am I missing something? Please let me know.
Thanks
Solved! Go to Solution.
06-05-2019 04:31 PM
I don't believe these are covered in the book. I believe the explanation was that although they can be useful in a graph context, they aren't really graph algorithms (they work on vector inputs). This omission is mentioned in the Other Algorithms section:
Other Algorithms
Many algorithms can be used with graph data. In this book, we’ve focused on those that are most representative of classic graph algorithms and those of most use to application developers. Some algorithms, such as coloring and heuristics, have been omitted because they are either of more interest in academic cases or can be easily derived.
Other algorithms, such as edge-based community detection, are interesting but have yet to be implemented in Neo4j or Apache Spark. We expect the list of graph algo‐ rithms used in both platforms to increase as the use of graph analytics grows.
There are also categories of algorithms that are used with graphs but aren’t strictly graphy in nature. For example, we looked at a few algorithms used in the context of machine learning in Chapter 8. Another area of note is similarity algorithms, which are often applied to recommendations and link prediction. Similarity algorithms work out which nodes most resemble each other by using various methods to com‐ pare items like node attributes.
So there's not really anything special about Jaccard or other similarity algorithms in a graph vs any other usage, I think, you just match on two nodes and use property vectors for those nodes in the similarity algorithm. The only really graphy thing about it is the similarity relationships you would create between the nodes being compared (provided you want to do so...it would probably be best to only add those relationships when they're above a certain similarity threshold), and subsequent querying for similar nodes using the relationships that were added as a consequence of running the similarity algorithms previously.
Or, more succinctly:
More info in our graph algo documentation, and here's the section for Jaccard similarity
06-05-2019 04:31 PM
I don't believe these are covered in the book. I believe the explanation was that although they can be useful in a graph context, they aren't really graph algorithms (they work on vector inputs). This omission is mentioned in the Other Algorithms section:
Other Algorithms
Many algorithms can be used with graph data. In this book, we’ve focused on those that are most representative of classic graph algorithms and those of most use to application developers. Some algorithms, such as coloring and heuristics, have been omitted because they are either of more interest in academic cases or can be easily derived.
Other algorithms, such as edge-based community detection, are interesting but have yet to be implemented in Neo4j or Apache Spark. We expect the list of graph algo‐ rithms used in both platforms to increase as the use of graph analytics grows.
There are also categories of algorithms that are used with graphs but aren’t strictly graphy in nature. For example, we looked at a few algorithms used in the context of machine learning in Chapter 8. Another area of note is similarity algorithms, which are often applied to recommendations and link prediction. Similarity algorithms work out which nodes most resemble each other by using various methods to com‐ pare items like node attributes.
So there's not really anything special about Jaccard or other similarity algorithms in a graph vs any other usage, I think, you just match on two nodes and use property vectors for those nodes in the similarity algorithm. The only really graphy thing about it is the similarity relationships you would create between the nodes being compared (provided you want to do so...it would probably be best to only add those relationships when they're above a certain similarity threshold), and subsequent querying for similar nodes using the relationships that were added as a consequence of running the similarity algorithms previously.
Or, more succinctly:
More info in our graph algo documentation, and here's the section for Jaccard similarity
06-05-2019 10:20 PM
Thanks for your reply.
All the sessions of the conference are now available online