09-11-2020 02:12 AM
Hi all,
I am new to Cypher and Neo4j. We are running Neo4j version 3.5.14. We have articles as nodes and their metadata as properties. I am trying to apply some graph algorithms to the nodes to create a similar link between articles, based on the cosine similarity between the articles' embeddings property. I am able to do it in the browser, but I am having issues when I run the Cypher query from Python. Here is my query:
session.run(
"""
MATCH (a:article)-[r_0:has]-(k:keyword)-[q:has]-(b:article)
WHERE a._id = {article_id}
AND b.published > {window_left}
AND b.published < {window_right}
AND b.embeddings IS NOT NULL"
WITH a,b
algo.similarity.cosine(a.embeddings,b.embeddings) as similarity
WHERE similarity > 0.8
MERGE (a)-[r:similar]-(b)
""" % (article['_id'],similarity_threshold,window_left,window_right)
)
We have keywords as nodes as well. Since the data volume is above 100k, I start from the first article's id, match articles that share a common keyword with it, and then compare their embeddings. If the cosine similarity of the embeddings is above 0.8, I create a similar link between the two articles and store the weight.
Is there any better way to do it? I am able to run this in the browser client but not through neobolt from Python.
Versions:
Neo4j - 3.5.14-enterprise
neobolt - 1.7.4
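For reference, the comparison step described in the question can be sketched in plain Python. This is only an illustration of the standard cosine similarity formula and the 0.8 cutoff, not the code of the graph algorithms library. The names `cosine_similarity` and `similar_candidates` are hypothetical, and treating a zero vector as "not similar" is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a zero vector is treated as not similar to anything.
        return 0.0
    return dot / (norm_u * norm_v)


def similar_candidates(source_embedding, candidates, threshold=0.8):
    """Return (candidate_id, score) pairs whose score exceeds the threshold.

    `candidates` maps an article id to its embedding, or to None when the
    article has no embedding (mirrors the IS NOT NULL check in the query).
    """
    matches = []
    for candidate_id, embedding in candidates.items():
        if embedding is None:
            continue
        score = cosine_similarity(source_embedding, embedding)
        if score > threshold:
            matches.append((candidate_id, score))
    return matches
```

Each returned pair corresponds to one similar link, with the score as the weight to store on it.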
09-11-2020 09:59 AM
But what is the error?
09-13-2020 10:07 PM
Hi gabriel,
this is the error
09-14-2020 12:35 AM
I think you are missing a comma (",") after "WITH a,b", so it should read "WITH a,b,algo..."
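A sketch of how the corrected call might look from Python, assuming `session` is a session from the 1.7 Python driver. Apart from the comma, two other things in the posted snippet look suspicious: the stray double quote after IS NOT NULL, and the `%` formatting, which does not fill `{...}` placeholders (Python raises a TypeError when a string with no `%s` markers is formatted with a tuple). Here the values are passed as query parameters instead. The function name `link_similar_articles` and the `weight` property are hypothetical, chosen only for illustration.

```python
SIMILAR_QUERY = """
MATCH (a:article)-[:has]-(k:keyword)-[:has]-(b:article)
WHERE a._id = {article_id}
  AND b.published > {window_left}
  AND b.published < {window_right}
  AND b.embeddings IS NOT NULL
WITH a, b, algo.similarity.cosine(a.embeddings, b.embeddings) AS similarity
WHERE similarity > {similarity_threshold}
MERGE (a)-[r:similar]-(b)
SET r.weight = similarity
"""


def link_similar_articles(session, article_id, window_left, window_right,
                          similarity_threshold=0.8):
    """Run the similarity query for one article.

    The {name} placeholders are Cypher parameters resolved by the server,
    so the values are handed to session.run() as keyword arguments rather
    than being formatted into the query string.
    """
    return session.run(
        SIMILAR_QUERY,
        article_id=article_id,
        window_left=window_left,
        window_right=window_right,
        similarity_threshold=similarity_threshold,
    )
```

It would be called as `link_similar_articles(session, article['_id'], window_left, window_right)`, with `similarity_threshold` left at 0.8 unless a different cutoff is wanted.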
09-14-2020 03:53 AM
Hi gabriel, thank you. Kinda embarrassed to be honest.
Is there any way to optimize the query? As my data grows, it is getting slower. Would using cosine.stream make a difference in speed?
Thank you
09-14-2020 04:49 AM
It's always better to think about ways to optimise your graph so that you re-use calculations that have already been done. But have you tried running your query in parallel?
Do take a look here, it's an interesting discussion: How best to do parallel processing
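One possible way to run the per-article query in parallel from Python is a thread pool in which every task opens its own session, since the driver object can be shared between threads but a session should not be. This is a rough sketch, not a tuned solution: `link_one` and `link_many` are hypothetical helper names, `driver` is assumed to be a driver instance from the Python driver, and each entry in `param_sets` is assumed to be a dict of query parameters that includes `article_id`.

```python
from concurrent.futures import ThreadPoolExecutor


def link_one(driver, query, params):
    """Run the query for a single article in its own session."""
    with driver.session() as session:
        # consume() waits for the statement to finish and discards any rows.
        session.run(query, **params).consume()
    return params["article_id"]


def link_many(driver, query, param_sets, max_workers=4):
    """Run the query once per parameter set, using a pool of threads.

    Returns the processed article ids in the same order as param_sets.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(link_one, driver, query, params)
                   for params in param_sets]
        # result() re-raises any exception raised inside a worker thread.
        return [future.result() for future in futures]
```

Whether this helps depends on the data. Queries that write relationships onto the same article nodes may wait on each other's locks, so it is worth measuring with a small `max_workers` first.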