algo.nodeSimilarity returns wrong min, max and mean in the result

shan
Graph Buddy

I am going to replace algo.similarity.jaccard with the new algo.nodeSimilarity. In my experiments with nodeSimilarity, I noticed that min, max, and mean are all -1.0 in the result. I then wrote a query to calculate those values directly from the similarity scores written back to the graph, and that worked, so I am not sure why algo.nodeSimilarity returns -1.0 for them. Here are the params I pass to algo.nodeSimilarity:
{graph: "cypher", write:true, writeRelationshipType: "SIMILAR", similarityCutoff: 0.3, direction:"OUTGOING"}
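A check like the one described can be sketched as follows. This is only an illustration, not the poster's actual query: it assumes the scores were written to a `score` property on the `SIMILAR` relationships (no `writeProperty` is passed in the params, so the library default is assumed here), and the aliases `minScore`, `maxScore`, and `meanScore` are made up for the example.

```cypher
// Assumption: each SIMILAR relationship carries its similarity in a `score` property.
MATCH ()-[s:SIMILAR]->()
RETURN min(s.score) AS minScore,
       max(s.score) AS maxScore,
       avg(s.score) AS meanScore
```

If the property name differs in your graph, replace `score` accordingly.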

1 ACCEPTED SOLUTION

@shan

Thank you for pointing this out.
If you add any of the percentile properties (p1, for instance) to the YIELD/RETURN statements, min, max and mean are populated in the result.
We have prepared a fix for this which will be included in the next patch.
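A minimal sketch of that workaround, reusing the reproduction query from this thread with `p1` added to both `YIELD` and `RETURN`. It assumes the docs sample graph (`Person`, `Instrument`, `LIKES`) and the 3.5.13.0 library version mentioned in the thread; it has not been checked against other versions.

```cypher
CALL algo.nodeSimilarity(
  'MATCH (n) WHERE n:Person OR n:Instrument RETURN id(n) AS id',
  'MATCH (n:Person)-[:LIKES]->(m:Instrument) RETURN id(n) AS source, id(m) AS target',
  {graph: "cypher", write: true, writeRelationshipType: "SIMILAR", similarityCutoff: 0.3, direction: "OUTGOING"}
)
// Yielding a percentile field (p1 here) is the workaround described above.
YIELD min, max, mean, p1, nodesCompared, relationships
RETURN min, max, mean, p1, nodesCompared, relationships
```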


4 REPLIES

I tried this using the sample graph in the docs, and wasn't able to reproduce your results:

CREATE (alice:Person {name: 'Alice'})
CREATE (bob:Person {name: 'Bob'})
CREATE (carol:Person {name: 'Carol'})
CREATE (dave:Person {name: 'Dave'})
CREATE (eve:Person {name: 'Eve'})
CREATE (guitar:Instrument {name: 'Guitar'})
CREATE (synth:Instrument {name: 'Synthesizer'})
CREATE (bongos:Instrument {name: 'Bongos'})
CREATE (trumpet:Instrument {name: 'Trumpet'})

CREATE (alice)-[:LIKES]->(guitar)
CREATE (alice)-[:LIKES]->(synth)
CREATE (alice)-[:LIKES]->(bongos)
CREATE (bob)-[:LIKES]->(guitar)
CREATE (bob)-[:LIKES]->(synth)
CREATE (carol)-[:LIKES]->(bongos)
CREATE (dave)-[:LIKES]->(guitar)
CREATE (dave)-[:LIKES]->(synth)
CREATE (dave)-[:LIKES]->(bongos)

CALL algo.nodeSimilarity(
'MATCH (n) WHERE n:Person or n:Instrument RETURN id(n) as id',
'MATCH (n:Person)-[:LIKES]->(m:Instrument) RETURN id(n) as source, id(m) as target', 
{graph: "cypher", write:true, writeRelationshipType: "SIMILAR", similarityCutoff: 0.3, direction:"OUTGOING"}
)

I get min = 0.33, max = 1.00, mean = 0.6. I also tested invalid input (a first Cypher query that does not specify both the source and target nodes, and a second query with no matches), and I just get 0. Can you share which version of the graph algorithms library you're using, and anything more about your data model, input data, etc.?

Thank you @alicia.frame

I think I found out when that problem happens. If I YIELD those values and then RETURN, I get -1.0 for min, max, and mean, but the other values are right:

CALL algo.nodeSimilarity(
'MATCH (n) WHERE n:Person or n:Instrument RETURN id(n) as id',
'MATCH (n:Person)-[:LIKES]->(m:Instrument) RETURN id(n) as source, id(m) as target', 
{graph: "cypher", write:true, writeRelationshipType: "SIMILAR", similarityCutoff: 0.3, direction:"OUTGOING"}) 
YIELD min, max, writeMillis, loadMillis, nodesCompared, relationships
RETURN min, max, writeMillis, loadMillis, nodesCompared, relationships

Am I missing something?
I am using version 3.5.13.0 of the graph algorithms library.
