05-08-2020 04:12 AM
I'm trying to find similar images using the approximate nearest neighbour (ANN) algorithm with cosine similarity. This is the query:
MATCH (p:Image)
WITH {item:id(p), weights: p.vec} AS userData
WITH collect(userData) AS data
CALL gds.alpha.ml.ann.write({
nodeProjection: '*',
relationshipProjection: '*',
data: data,
topK:20,
algorithm: 'cosine',
writeRelationshipType:"SIMILAR_APPROX",
similarityCutoff: 0.1,
p:0.5,
maxIterations:50
})
YIELD nodes, similarityPairs, computations
RETURN nodes,
apoc.number.format(similarityPairs) AS similarityPairs,
apoc.number.format(computations) AS computations
But when I search for images similar to one specific image, none of the results are from the same category as the input image (dolphin). I have 9119 nodes in my database. Here's the query for finding images similar to one specific image:
MATCH (r:Image) WHERE id(r)=1932
WITH r,
[(r)-[:SIMILAR_APPROX]->(i)| i.path ] AS similarNodes
RETURN similarNodes
input image:
one example of output images:
Am I missing some parameters in the algorithm? Why am I getting results from other categories when I clearly have more similar images in the database?
Thank you in advance!
05-18-2020 03:58 PM
What are you passing to ANN to measure similarity on? The node property in p.vec?
I would check the similarity of the two images by running cosine similarity on them directly (see https://neo4j.com/docs/graph-data-science/current/alpha-algorithms/cosine/). It's possible that something is off in your image embedding and is causing the two vectors to come out quite similar. The categories you're referencing aren't available to ANN, so the result is based solely on the values in p.vec.
You're also returning the top 20 most similar images with a similarity cutoff of 0.1 (10%), which could let through some fairly dissimilar images. If you return the similarity scores of the pairs, what are they? And do you get the same value when you run cosine similarity over that pair?
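As a rough sketch of that check: the query below lists each ANN neighbour of the image from the question next to a cosine similarity recomputed from the stored vectors. It assumes the write step stored its similarity in a relationship property named score (the question sets no writeProperty, so adjust the name if yours differs) and that the gds.alpha.similarity.cosine function from the linked page is available in your GDS version.

```cypher
// Assumption: SIMILAR_APPROX relationships carry the ANN similarity in a
// property called `score`; change s.score if a different name was used.
// directCosine is recomputed from the two stored vectors for comparison.
MATCH (r:Image)-[s:SIMILAR_APPROX]->(i:Image)
WHERE id(r) = 1932
RETURN i.path AS path,
       s.score AS annScore,
       gds.alpha.similarity.cosine(r.vec, i.vec) AS directCosine
ORDER BY annScore DESC
```

If annScore and directCosine agree and are high for images from other categories, the embedding is the more likely culprit than the ANN parameters.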