Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
12-01-2019 07:06 AM
Hello,
I want to import 2 separate graphs into Neo4j and then to find the similarity between them. How to do that?
I had a look at the Jaccard Similarity procedure and it seems that the 2 graphs should be connected in some way.
But what if they are not?
Also, I have found that one of the algorithms used for graph similarity/ isomorphism is Maximum Common Subgraph. Is there a way to use it in Neo4j?
Thanks in advance.
12-01-2019 07:31 AM
You need to share some elements between the two graph for example if you're talking about customers and e-costumers there are some properties that share like gender, zip, contact number. etc
maybe you could share a little more of info about the graphs and the model that you want to do.
kind regards, Roberto
12-02-2019 04:44 PM
Thanks Roberto for the reply and for the document.
My goal is to have two separate graphs (and I know previously that there similar ) and I want to calculate the overall similarity or change that happens from one to another.
Example:
Hope that my idea is clear.
Thanks in advance.
Ahmed.
12-02-2019 07:56 AM
Can you explain a bit more about what you mean by similarity between 2 graphs?
Those OOTB similarity algorithms (e.g., jaccard) are meant to be used to find similarity between nodes. You might be able to use them for other purposes though.
12-02-2019 04:49 PM
Thanks shan for your reply.
By the similarity between 2 graphs: can be thoughts also as is I have two identical standalone graphs and then I tweaked some nodes/edges here and there. So, I want to measure how similar this new (tweaked) version of the graph to the original one.
I have also mentioned an example in my second reply above.
Hope this will help in clarifying my idea.
Thanks in advance.
Ahmed
12-03-2019 12:01 AM
If you want to compare the degree of overlap in relationship sequences, then you could check out the BLAST algorithm from bioinformatics (https://en.wikipedia.org/wiki/BLAST_(biotechnology))
You can also represent your graphs as contingency matrices and compare those using a singular value decomposition of the contingency matrices for example. That doesn't take into account the relationship sequences however.
Yet another idea is to do a graph 2 vec (https://neo4j.com/blog/deepwalk-implementing-graph-embeddings-in-neo4j/). This allows you to compare nodes based on their contextual information (i.e. relationship neighborhood similarity).
Just some ideas...
12-03-2019 12:55 AM
Great ideas @niclasko
Thank you very much.
I will try them and will put my feedback here.
12-04-2019 08:01 AM
A new graph algorithm was released yesterday that might help you with your usecase:
https://neo4j.com/docs/graph-algorithms/3.5/algorithms/node-similarity/
The Node Similarity algorithm compares a set of nodes based on the nodes they are connected to. Two nodes are considered similar if they share many of the same neighbors. Node Similarity computes pair-wise similarities based on the Jaccard metric.
12-05-2019 02:01 AM
Use geometric deep learning to correctly classify sub-graph patterns.
https://www.dgl.ai/pages/about.html
https://docs.dgl.ai/tutorials/basics/4_batch.html
Once classified, you just search for any sort of subgraph pattern and you are effectively done.
Beyond that, you can perform link/relation prediction with DGL to figure out which molecule is most likely to change state e.g. changing one H node to C node int order to bind to another molecule depending on its context.
https://docs.dgl.ai/tutorials/basics/1_first.html
In case you are short on structures to classify, you can just generate molecules structures similar but slightly different to your target structure by using GAN's.
https://pubs.rsc.org/en/content/articlelanding/2019/me/c9me00039a#!divAbstract
http://ceur-ws.org/Vol-2350/paper18.pdf
12-06-2019 08:50 AM
Awesome!
Very useful resources and ideas.
Thank you very much marvin-hansen.
I will back with my feedback after trying them.
12-09-2019 08:14 AM
@ahmedmelmoselhy - if you're working in the chemistry space, check out the neo4j/rdkit integration project: https://github.com/rdkit/neo4j-rdkit
That should let you do structure/substructure and similarity searches using RDKit's framework.
Neo4j doesn't ship any out-of-the-box graph isomorphism or graph edit distance algorithms, but the RDKit tools should get you where you need to be
12-10-2019 07:19 AM
okay.
I am trying the other mentioned techniques and may use it as well.
thanks alicia.frame
I may also try to implement some common subgraph algorithm and hope it will work.
12-14-2019 07:14 PM
Also you could try put the properties of the atoms in the relationships and use it as a weights. And use something like page rank
All the sessions of the conference are now available online