12-18-2019 01:33 PM
I noticed that the triangleCount algorithm sometimes throws ArrayIndexOutOfBoundsException. This seems to be fixed in 3.5.13.0, so I moved to 3.5.13. But then I noticed that the Jaccard similarity algorithm now sometimes throws ArithmeticException. I was going to look at the code to understand why that happens, but it looks like the GitHub repo does not have the code for 3.5.13. The latest version in the repo is 3.5.4. Does that mean the source code of the newer versions is not open?
12-19-2019 01:48 AM
Can you create a GH issue for the exceptions you see?
We are working on making the code available again; it's currently undergoing some internal restructuring and modifications.
12-19-2019 07:38 AM
@michael.hunger Are algo.nodeSimilarity and algo.similarity.jaccard using the same libraries behind the scenes? I am thinking that if I use algo.nodeSimilarity instead of algo.similarity.jaccard, it may not give me that ArithmeticException anymore.
12-20-2019 01:38 AM
Hi Shan!
We're in the process of moving the labs code into a product-supported library, which should be released in the next month or two. We're deprecating Jaccard in favor of nodeSimilarity, which uses the Jaccard similarity scoring function but is a much more performant implementation.
Look for open sourced code in the next few weeks as we get ready for a major release - I'll post on the forums as soon as it's available!
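For readers unfamiliar with the Jaccard scoring function mentioned above, here is a minimal Python sketch of the score itself: the size of the intersection of two neighbor sets divided by the size of their union. This is not the library's implementation; the function name `jaccard`, the sample data, and the choice to return 0.0 when both sets are empty are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets are empty, so the ratio would be 0/0.
        # Returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical purchase data: each person maps to the item types they bought.
purchases = {
    "alice": {"book", "laptop", "pen"},
    "bob": {"book", "pen", "phone"},
}

# Two shared item types out of four distinct ones.
score = jaccard(purchases["alice"], purchases["bob"])
```

Guarding the empty-union case explicitly avoids a division by zero in this sketch; whether that is related to the ArithmeticException reported earlier in the thread is not known.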
12-20-2019 07:34 AM
Hi Alicia,
Thanks for your reply. I am glad to hear lab graph algorithms are going to be officially supported. Thanks for letting us know.
Looking forward to the release.
Seyed
01-21-2020 10:59 AM
Hi Alicia,
I noticed that nodeSimilarity does not support sourceId and targetId whereas jaccardSimilarity does. Is there any workaround for that?
Thanks,
Seyed
01-22-2020 11:04 AM
@shan - when using a cypher projection? The syntax is source/target, eg:
CALL algo.nodeSimilarity.stream(
'MATCH(n) WHERE n:Person OR n:ItemType RETURN id(n) as id',
'MATCH (p:Person)-[:PURCHASED]->(e:Item)-[:INSTANCE_OF]->(m:ItemType) RETURN id(p) as source, id(m) as target',
{graph:'cypher', direction:'outgoing'})
If you're looking for something equivalent to the sourceIds and targetIds parameters, where you could pass a vector specifying which nodes you want to compare, we don't explicitly support that input in nodeSimilarity. You'll want to specify the node labels for source and target either directly or via the cypher loader.
Hope that helps!
01-23-2020 07:50 AM
Thanks very much @alicia.frame.
Yes, I am using cypher projection, and I meant sourceIds and targetIds.
Just as feedback: the good thing about having those parameters is that sometimes you have a graph, find similarities between nodes, then add some new nodes/edges, and now want to calculate similarity only between the newly added nodes and the old ones. Recalculating all those similarities every time a new node is added could be inefficient if you have a large graph.
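The incremental use case described here can be sketched outside the database in a few lines of Python. This is only an illustration of the idea, not something nodeSimilarity offers: the function names `jaccard_score` and `similarities_for_new_nodes`, the `min_score` parameter, and the sample neighbor sets are all hypothetical.

```python
def jaccard_score(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty (a convention)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarities_for_new_nodes(neighbors, new_ids, old_ids, min_score=0.0):
    """Score only (new node, old node) pairs instead of every pair in the graph.

    `neighbors` maps a node id to the set of its neighbors.
    """
    results = []
    for new_id in new_ids:
        for old_id in old_ids:
            score = jaccard_score(neighbors[new_id], neighbors[old_id])
            if score >= min_score:
                results.append((new_id, old_id, score))
    return results


# Hypothetical data: nodes 1 and 2 were already scored, node 3 was just added.
neighbors = {
    1: {"a", "b"},
    2: {"b", "c"},
    3: {"a", "b", "c"},
}
new_pairs = similarities_for_new_nodes(neighbors, new_ids=[3], old_ids=[1, 2])
```

With n old nodes and k new ones this compares k * n pairs, rather than all pairs among the n + k nodes.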
As another difference between the new nodeSimilarity and the old jaccardSimilarity, the former adds two edges between every pair of nodes (a-->b and a<--b) whereas the latter was smart enough to add just one edge. Two similarity edges with the same score that differ only in direction do not carry much additional information.
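If the results are streamed to a client rather than written directly, one way to keep a single edge per pair is to collapse the symmetric results before writing them back. The Python sketch below assumes the results arrive as (source, target, score) triples with comparable node ids; the function name `collapse_symmetric` and the sample edges are hypothetical.

```python
def collapse_symmetric(edges):
    """Keep one (source, target, score) triple per unordered node pair."""
    unique = {}
    for source, target, score in edges:
        # Order the endpoints so that a-->b and b-->a share the same key.
        key = (min(source, target), max(source, target))
        # Keep the first score seen; both directions carry the same score.
        unique.setdefault(key, score)
    return [(a, b, score) for (a, b), score in sorted(unique.items())]


# Hypothetical streamed results with both directions present.
edges = [(1, 2, 0.5), (2, 1, 0.5), (1, 3, 0.25), (3, 1, 0.25)]
single = collapse_symmetric(edges)
```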
01-24-2020 09:21 AM
@shan - thanks for the feedback! I've added it to our backlog so we keep track of it when we talk about enhancements.
WRT your first question, we just open sourced the code for the graph data science library, ahead of our preview release in February: https://github.com/neo4j/graph-data-science. It's still a work in progress, but if you want to see the underlying code or open issues etc, this will be the place for it.
01-24-2020 10:30 AM
That's awesome. Thanks a lot @alicia.frame
Looking forward to its official release.