10-19-2020 12:10 PM
Background:
Neo4j Community edition 4.0.0
APOC 4.0.0.16
GDS 1.3.4
I am looking for information on how to run a similarity analysis between two 'lists' of nodes all at once, rather than one at a time.
My schema looks similar to this:
(Node1 {type:'A'})-[:rel1]->(Node2)-[:rel2]->(Node3)-[:rel3]->(Node4)-[:rel4]->(Node5 {name:'xxx'})
(Node1 {type:'B'})-[:rel1]->(Node2)-[:rel2]->(Node3)-[:rel3]->(Node4)-[:rel4]->(Node5 {name:'xxx'})
I can do a one-off similarity analysis using gds.alpha.similarity.jaccard to see how similar the Node4 contents are. The problem is that I have about 100 different Node1s of type 'A' to compare with about 100 Node1s of type 'B'. I would like to do this as "one" procedure, with the results output to a table to visualize, or possibly saved back to the database.
Try to think of this problem as comparing 2 different Bills of Material (Node1) used to manufacture an assembly (Node5) at different revisions.
Can someone please advise? Thanks.
UPDATE: Found it here: https://neo4j.com/docs/graph-data-science/current/alpha-algorithms/jaccard/ by Table 5.279. I was missing the WHERE p1 <> p2 clause, which was causing the query to run forever.
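For anyone landing here later, the pattern from that docs page could be adapted to the schema above roughly as follows. This is only a sketch, not a tested query: it assumes Node1 and Node4 are node labels (in the post they are placeholders), that the Node4 nodes reachable from each Node1 are the items being compared, and that internal node ids are acceptable as set members. The WHERE a <> b guard mirrors the p1 <> p2 clause from the docs; when the two types never overlap it is redundant but harmless.

```cypher
// Collect the Node4 ids reachable from every type 'A' Node1 (labels are assumed).
MATCH (a:Node1 {type: 'A'})-[:rel1]->()-[:rel2]->()-[:rel3]->(n4a:Node4)
WITH a, collect(DISTINCT id(n4a)) AS aItems
// Do the same for every type 'B' Node1, keeping each (a, b) pair.
MATCH (b:Node1 {type: 'B'})-[:rel1]->()-[:rel2]->()-[:rel3]->(n4b:Node4)
WHERE a <> b
WITH a, aItems, b, collect(DISTINCT id(n4b)) AS bItems
// One row per pair, scored with the Jaccard function.
RETURN id(a) AS fromId, id(b) AS toId,
       gds.alpha.similarity.jaccard(aItems, bItems) AS similarity
ORDER BY similarity DESC
```

With about 100 nodes on each side this returns up to roughly 10,000 rows, one per A/B pair, which can be viewed as a table in Neo4j Browser.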