06-06-2021 12:39 PM
Hello,
I need to apply a linear regression machine learning method to an existing Neo4j database to find node importance and correlation.
Any examples to learn from?
Thanks.
06-07-2021 05:46 AM
If you're looking to measure node importance, a good place to start is probably the centrality algorithms - Centrality. They use the structure of the graph itself to measure the importance of nodes.
Similarly, the similarity algorithms - Neo4j Graph Data Science algorithms - use standard metrics for correlation (Jaccard, cosine, Pearson, etc.) to measure the similarity of nodes.
GDS doesn't currently offer node regression, but there are some APOC procedures you can take a look at.
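To make the similarity metrics named above concrete, here is a minimal sketch in plain Python. It is not the GDS implementation; the function names are hypothetical, and it assumes the node data has already been exported as sets (for Jaccard) or as equal-length numeric lists (for cosine and Pearson).

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(x, y):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    spread_x = math.sqrt(sum((p - mean_x) ** 2 for p in x))
    spread_y = math.sqrt(sum((q - mean_y) ** 2 for q in y))
    # A constant vector has no defined correlation; return 0.0 in that case.
    return cov / (spread_x * spread_y) if spread_x and spread_y else 0.0
```

For example, `jaccard({1, 2, 3}, {2, 3, 4})` compares the two shared items against the four distinct items overall.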
06-07-2021 08:59 AM
Thanks a lot for the info @alicia.frame1. The APOC procedures are a great start, especially apoc.math.regr(). However, it is missing a lot of regression parameters. Lauren Shin put a great effort into this in https://towardsdatascience.com/graphs-and-ml-multiple-linear-regression-c6920a1f2e70
However, it doesn't work in Neo4j 4.x. It would be great if it were integrated there.
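For reference, the basic quantities of a simple (single-predictor) least-squares regression can be computed by hand once the two property values are out of the database. The following is a minimal sketch in plain Python; it is not APOC's implementation, the function name is hypothetical, and it assumes `xs` and `ys` are equal-length lists of numbers read from node properties.

```python
def simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For one predictor, R^2 equals the squared Pearson correlation.
    # If y is constant, the fitted line is exact, so report 1.0.
    r2 = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return {"slope": slope, "intercept": intercept, "r2": r2}
```

Multiple linear regression, as in the linked article, generalizes this to several predictors and is usually done with a numerical library rather than by hand.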
06-08-2021 02:59 AM
Thanks for your reply. Maybe that's why I couldn't apply the regression example on Neo4j: I'm using the latest version. Does that mean that if I change the version I will be able to use the regression functions?
Is there any example about this topic using Python connected to the Neo4j graph?
06-08-2021 09:50 AM
Check out the tutorials under our developer guide pages, here: Link Prediction with GDSL and scikit-learn - Developer Guides
We walk through using Neo4j with scikit-learn and SageMaker, and through training models inside Neo4j.
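As a rough idea of what the Python side can look like, here is a minimal sketch that pulls rows out of Neo4j with the official `neo4j` Python driver and reshapes them into a feature matrix and target list that a library such as scikit-learn could fit. It is not taken from the tutorials; the helper names, the connection details, and the property keys are all hypothetical.

```python
def fetch_rows(uri, user, password, query):
    """Run a Cypher query and return each record as a plain dict."""
    # Imported lazily so the reshaping helper below works without the driver.
    from neo4j import GraphDatabase  # pip install neo4j

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            return [record.data() for record in session.run(query)]
    finally:
        driver.close()


def rows_to_xy(rows, feature_keys, target_key):
    """Split row dicts into a feature matrix X and a target list y.

    Rows with a missing feature or target value are skipped.
    """
    X, y = [], []
    for row in rows:
        values = [row.get(key) for key in feature_keys] + [row.get(target_key)]
        if any(value is None for value in values):
            continue
        X.append([float(value) for value in values[:-1]])
        y.append(float(values[-1]))
    return X, y


# Hypothetical usage (label and property names are placeholders):
# rows = fetch_rows("bolt://localhost:7687", "neo4j", "password",
#                   "MATCH (n:Item) RETURN n.size AS size, n.age AS age, n.price AS price")
# X, y = rows_to_xy(rows, ["size", "age"], "price")
# from sklearn.linear_model import LinearRegression
# model = LinearRegression().fit(X, y)
```

Keeping the query and the reshaping separate means the regression step can be tested on plain lists without a running database.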
06-09-2021 10:48 AM
If you use Neo4j 3.5.x, you can use this package: https://towardsdatascience.com/graphs-and-ml-multiple-linear-regression-c6920a1f2e70