11-09-2018 01:10 AM
Hi all, I'm new to Neo4j.
I am working through the guide at
https://guides.neo4j.com/sandbox/recommendations/index.html
In the section "Collaborative Filtering – Similarity Metrics", under Cosine Distance, the guide uses this Cypher query:
MATCH (p1:User {name: "Cynthia Freeman"})-[x:RATED]->(m:Movie)<-[y:RATED]-(p2:User)
WITH COUNT(m) AS numbermovies, SUM(x.rating * y.rating) AS xyDotProduct,
SQRT(REDUCE(xDot = 0.0, a IN COLLECT(x.rating) | xDot + a^2)) AS xLength,
SQRT(REDUCE(yDot = 0.0, b IN COLLECT(y.rating) | yDot + b^2)) AS yLength,
p1, p2 WHERE numbermovies > 10
RETURN p1.name, p2.name, xLength, yLength, xyDotProduct / (xLength * yLength) AS sim
ORDER BY sim DESC LIMIT 100;
I then tried another version that drops REDUCE/COLLECT and just uses SUM:
MATCH (p1:User {name: "Cynthia Freeman"})-[x:RATED]->(m:Movie)<-[y:RATED]-(p2:User)
WITH COUNT(m) AS numbermovies, SUM(x.rating * y.rating) AS xyDotProduct,
SQRT(SUM(x.rating^2)) AS xLength,
SQRT(SUM(y.rating^2)) AS yLength,
p1, p2 WHERE numbermovies > 10
RETURN p1.name, p2.name, xLength, yLength, xyDotProduct / (xLength * yLength) AS sim
ORDER BY sim DESC LIMIT 100;
I compared the two results and they are the same.
So I am confused about why the guide uses the more complex REDUCE/COLLECT syntax above and not just the simpler SUM.
11-09-2018 02:17 AM
If you want to simplify it even more, check out the Graph Algorithms library, where cosine similarity is exposed as a procedure. See the docs for more: https://neo4j.com/docs/graph-algorithms/current/algorithms/similarity-cosine/
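On the original question: the two queries should agree, because folding a collected list of ratings with REDUCE and aggregating the same rows with SUM both produce the sum of squares for each (p1, p2) group. Here is a minimal sketch of that arithmetic in plain Python rather than Cypher. The rating values and function names are made up for illustration and are not taken from the sandbox data or the guide.

```python
import math
from functools import reduce

# Hypothetical ratings for the movies both users have rated (made-up values).
x_ratings = [4.0, 3.5, 5.0, 2.0]
y_ratings = [5.0, 3.0, 4.5, 1.0]


def length_via_reduce(ratings):
    # Mirrors SQRT(REDUCE(acc = 0.0, a IN COLLECT(r) | acc + a^2)):
    # fold over the collected list, adding each squared rating.
    return math.sqrt(reduce(lambda acc, a: acc + a ** 2, ratings, 0.0))


def length_via_sum(ratings):
    # Mirrors SQRT(SUM(r^2)): aggregate the squared ratings directly.
    return math.sqrt(sum(a ** 2 for a in ratings))


def cosine(xs, ys, length):
    # Dot product divided by the product of the two vector lengths.
    dot = sum(x * y for x, y in zip(xs, ys))
    return dot / (length(xs) * length(ys))


sim_reduce = cosine(x_ratings, y_ratings, length_via_reduce)
sim_sum = cosine(x_ratings, y_ratings, length_via_sum)

# Both formulations compute the same similarity.
print(math.isclose(sim_reduce, sim_sum))
```

So the SUM version is a simpler way to write the same calculation. Why the guide chose the REDUCE/COLLECT form is not stated in the guide text quoted above.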