How to calculate multiple similarity scores and sort them based on the sum of similarity scores?

11li
Node Clone

Hi! 

My graph is like:

11li_0-1674702346062.png

The purple Patent nodes are the ones I want to compute similarity for.

I wrote a query that takes one patent and computes its Jaccard score against other patents, based on the Device nodes they share.

11li_1-1674702463876.png

'''

// Patents that share at least one Device with the chosen patent
MATCH (n:Patent{patent_title:'一种机器人建图定位方法'})-[:has]->(c:Device)<-[:has]-(other:Patent)
with n,other,count(c) as intersection,collect(c.name) as collection
// s1: Device names of the chosen patent
match (n)-[:has]->(nc:Device)
with n,other,intersection,collection,collect(nc.name) as s1
// s2: Device names of the other patent
match (other)-[:has]->(oc:Device)
with n,other,intersection,collection,s1,collect(oc.name) as s2
// s21: names in s2 that are not in s1, so s1+s21 is the union
with n,other,intersection,s1,s2,[x IN s2 where not x IN s1] as s21
with n,other,intersection,s1+s21 as uni,s1,s2
// Jaccard = |intersection| / |union|
return n.patent_title,other.patent_title,s1,s2,((1.0*intersection)/SIZE(uni)) as jaccard
order by jaccard DESC
limit 20

'''

As the schema shows, Patent nodes are also linked to other node types. Besides [Device], I want to compute the score for [Field], [Imp], [Env], [Algorithm], [topic00], [topic01]... and every other node type.

So I want to compute one Jaccard score per node type, j1 (jaccard1), j2, j3, j4..., and add them all up into jsum.

then,

order by jsum DESC

My explanation may be a little confusing...

I can only compute one Jaccard score and order by it. I don't know how to compute all the other scores in the same query and order by jsum.

Thank you!

---

running:

Neo4j Browser version: 4.4.3

Neo4j Server version: 4.4.5 (community)

 

5 REPLIES

Try this...

MATCH (n:Patent{patent_title:'一种机器人建图定位方法'})-[:has]->(c)<-[:has]-(other:Patent)
with n,other,head(labels(c)) as type, count(c) as intersection
match (n)-[:has]->(nc) where head(labels(nc)) = type
with n,other,type,intersection,collect(nc.name) as s1
match (other)-[:has]->(oc) where head(labels(oc)) = type
with n,other,type,intersection,s1,collect(oc.name) as s2
with n,other,type,intersection,s1,s2,[x IN s2 where not x IN s1] as s21
with n,other,type,intersection,s1+s21 as uni,s1,s2
return n.patent_title,other.patent_title,type,s1,s2,((1.0*intersection)/SIZE(uni)) as jaccard
order by jaccard DESC
limit 20

Note: I assumed each node has just one label, since head(labels(c)) only looks at the first label.

Thank you for replying!

Yes, each node has just one label.

11li_0-1674795089388.png

I don't know what the "by zero" error means.

 

 

Oh, I found that I gave the [Env] nodes the wrong property key!

11li_0-1674796922943.png

 

I've renamed it to name, and your code works now.
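A possible explanation for the "by zero" error, not confirmed in the thread: collect() skips null values, so if the nodes of one type have no name property, s1 and s2 come back empty, the union has size 0, and the division fails. A quick check along these lines could show which node types are affected. This is only a sketch; the alias nodes_missing_name is made up for illustration, and it assumes one label per node as above.

```cypher
// List node types linked from patents whose nodes lack a `name` property.
// Assumes every node has exactly one label (as stated in this thread).
MATCH (:Patent)-[:has]->(x)
WHERE x.name IS NULL
RETURN head(labels(x)) AS type, count(DISTINCT x) AS nodes_missing_name
ORDER BY nodes_missing_name DESC
```

If this returns no rows, every linked node has a name and the per-type query should not hit an empty union for that reason.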

11li_1-1674801382015.png

I may not have expressed myself clearly.

I want to sum the Jaccard scores of all node types and order by that sum.

For each pair of patents there will be a [device] Jaccard score, an [algorithm] Jaccard score, a [navtech] score, and so on; these should all be added together.

Finally, order by the summed value.
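One way this could be sketched is to keep the per-type grouping from the reply above and then aggregate once more per pair of patents with sum(). Treat it as an untested sketch under a few assumptions: each node has exactly one label, sets are compared by node identity rather than by the name property (so the union size is size1 + size2 - intersection), and the aliases size1, size2, per_type and jsum are names introduced here. Node types with no shared node contribute 0 and simply do not appear in per_type.

```cypher
// Step 1: shared nodes per (other patent, node type)
MATCH (n:Patent {patent_title: '一种机器人建图定位方法'})-[:has]->(c)<-[:has]-(other:Patent)
WITH n, other, head(labels(c)) AS type, count(DISTINCT c) AS intersection
// Step 2: how many nodes of this type the chosen patent has
MATCH (n)-[:has]->(nc) WHERE head(labels(nc)) = type
WITH n, other, type, intersection, count(DISTINCT nc) AS size1
// Step 3: how many nodes of this type the other patent has
MATCH (other)-[:has]->(oc) WHERE head(labels(oc)) = type
WITH n, other, type, intersection, size1, count(DISTINCT oc) AS size2
// Step 4: Jaccard per type; the union is never empty because intersection >= 1
WITH n, other, type,
     1.0 * intersection / (size1 + size2 - intersection) AS jaccard
// Step 5: collapse the per-type rows into one row per pair of patents
WITH n, other,
     sum(jaccard) AS jsum,
     collect({type: type, jaccard: jaccard}) AS per_type
RETURN n.patent_title, other.patent_title, per_type, jsum
ORDER BY jsum DESC
LIMIT 20
```

Because the second WITH groups only by n and other, sum(jaccard) adds up the [device], [algorithm], [navtech] and other scores for that pair, and per_type keeps the individual values for inspection.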