Why do the examples use REDUCE/COLLECT instead of just SUM?

Hi all, I'm a newbie.

I am learning from the guide at

https://guides.neo4j.com/sandbox/recommendations/index.html

In the section: Collaborative Filtering – Similarity Metrics - Cosine Distance

I see the guide uses the following Cypher query:

MATCH (p1:User {name: "Cynthia Freeman"})-[x:RATED]->(m:Movie)<-[y:RATED]-(p2:User)
WITH COUNT(m) AS numbermovies, SUM(x.rating * y.rating) AS xyDotProduct,
SQRT(REDUCE(xDot = 0.0, a IN COLLECT(x.rating) | xDot + a^2)) AS xLength,
SQRT(REDUCE(yDot = 0.0, b IN COLLECT(y.rating) | yDot + b^2)) AS yLength,
p1, p2 WHERE numbermovies > 10
RETURN p1.name, p2.name, xLength, yLength, xyDotProduct / (xLength * yLength) AS sim
ORDER BY sim DESC LIMIT 100;

I then tested another version that removes REDUCE/COLLECT and uses SUM instead:

MATCH (p1:User {name: "Cynthia Freeman"})-[x:RATED]->(m:Movie)<-[y:RATED]-(p2:User)
WITH COUNT(m) AS numbermovies, SUM(x.rating * y.rating) AS xyDotProduct,
SQRT(SUM(x.rating^2)) AS xLength,
SQRT(SUM(y.rating^2)) AS yLength,
p1, p2 WHERE numbermovies > 10
RETURN p1.name, p2.name, xLength, yLength, xyDotProduct / (xLength * yLength) AS sim
ORDER BY sim DESC LIMIT 100;

I compared the two results and they are the same.

So I am confused about why the guide uses the more complex syntax above rather than the simpler version with SUM.
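
The equivalence observed above can be sketched outside the database. The following is a plain Python analogue of the two queries, not the guide's code: the rating values are hypothetical, and the helper names `length_via_reduce`, `length_via_sum`, and `cosine` are made up for illustration. One helper collects the ratings into a list and folds over it (as REDUCE over COLLECT does), the other sums the squares directly (as SUM does).

```python
import math
from functools import reduce

# Hypothetical ratings for movies that both users rated
# (illustrative values, not taken from the sandbox data).
x_ratings = [4.0, 3.5, 5.0, 2.0]
y_ratings = [5.0, 3.0, 4.5, 1.0]


def length_via_reduce(ratings):
    # Analogue of SQRT(REDUCE(acc = 0.0, a IN COLLECT(rating) | acc + a^2)):
    # materialize the list first, then fold over it with an accumulator.
    collected = list(ratings)
    return math.sqrt(reduce(lambda acc, a: acc + a ** 2, collected, 0.0))


def length_via_sum(ratings):
    # Analogue of SQRT(SUM(rating^2)): aggregate the squares directly.
    return math.sqrt(sum(a ** 2 for a in ratings))


def cosine(xs, ys, length):
    # Dot product divided by the product of the two vector lengths.
    dot = sum(x * y for x, y in zip(xs, ys))
    return dot / (length(xs) * length(ys))


sim_reduce = cosine(x_ratings, y_ratings, length_via_reduce)
sim_sum = cosine(x_ratings, y_ratings, length_via_sum)
print(sim_reduce, sim_sum)
```

In this sketch both helpers add up the same squared values, so the two similarities agree, which is consistent with the identical query results reported above. It does not explain why the guide chose the REDUCE/COLLECT form.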

1 REPLY

If you want to simplify it even more, you can check out the graph algorithms library, where cosine similarity is exposed as a procedure. Check the docs for more: https://neo4j.com/docs/graph-algorithms/current/algorithms/similarity-cosine/
