

Node similarity

Hello!

I noticed that Neo4j does not return all pairwise similarities between nodes. I know that nodes with 0 connections are not taken into consideration, but for 76.000 nodes I receive back only 702.718 records. I manually checked for scores between some nodes that share connections, and those pairs were absent. Could you please tell me what I should change so that I get back all pairwise comparisons? Thank you!

6 REPLIES

Can you provide the query you are using, or what operations are you executing that is not working for you?

Hello! I am copy-pasting what's in the documentation:

CALL gds.graph.project(
  'myGraph',
  ['Gene', 'Pfam', 'Panther', 'Ko', 'Go', 'Ec', 'Kog'],  // node labels
  ['GO', 'PANTHER', 'KO', 'EC', 'KOG', 'PFAM']           // relationship types
);
I am interested in finding similarities between Genes:

CALL gds.nodeSimilarity.stream('myGraph')
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).name AS Gene1, gds.util.asNode(node2).name AS Gene2, similarity
ORDER BY similarity DESCENDING, Gene1, Gene2

Thank you!

Does every gene node have the same set of outgoing relationships?

Yes! I was investigating scores for some particular genes which are actually identical (as in, they are linked to exactly the same nodes, so they should have obtained a similarity score of 1). However, one of the genes does not appear at all in the whole CSV file with results. It exists in the graph and it has the connections; it's just that not all possible comparisons are being reported in the file...

How are you generating the file?

In the Neo4j Browser I run:

CALL gds.nodeSimilarity.stream('myGraph')
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).name AS Gene1, gds.util.asNode(node2).name AS Gene2, similarity
ORDER BY similarity DESCENDING, Gene1, Gene2
which outputs the CSV file with the similarity scores.
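
One possible explanation, not confirmed anywhere in this thread: gds.nodeSimilarity accepts a topK configuration parameter that defaults to 10, so each node is reported with at most its 10 most similar neighbours. With 76.000 nodes that caps the output at roughly 760.000 rows, which is in the range of the 702.718 records reported above. Several identical genes tied at similarity 1 could then push one another out of each other's top 10. The query below is a minimal sketch under that assumption. The value 100 is an arbitrary illustration, not a recommendation, and very large topK values may need considerably more memory and time.

```cypher
// Sketch only: assumes the missing pairs are caused by the default topK of 10.
// topK: 100 is an illustrative value chosen for this example.
CALL gds.nodeSimilarity.stream('myGraph', {
  topK: 100
})
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).name AS Gene1,
       gds.util.asNode(node2).name AS Gene2,
       similarity
ORDER BY similarity DESCENDING, Gene1, Gene2
```

If the previously missing gene shows up once topK is raised, the limit was the cause. If it is still absent, the other configuration parameters that filter results (similarityCutoff, degreeCutoff, topN) would be the next things to check.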