07-29-2022 08:45 AM - last edited on 08-01-2022 10:04 AM by TrevorS
Would I need to write a custom user-defined procedure to run a multinomial conditional logistic regression (MCLR) on Neo4j data? I'm trying to think through the best way to run an MCLR on data in my database. Right now I'm planning to use the ConditionalMNLogit model from statsmodels in Python: I would query the database from a Python script and fit the model on the results of my query. I'm guessing the main limitation of this approach would be the amount of data I query from Neo4j.
Would anyone have any suggestions? https://www.statsmodels.org/dev/generated/statsmodels.discrete.conditional_models.ConditionalMNLogit...
08-03-2022 01:38 PM
We have a native node classification pipeline that can handle multiple class labels, and logistic regression is one of its modeling options. That pipeline is closer to ordinary multinomial logistic regression than to a conditional model, and it is geared more towards prediction and machine learning use cases.
You can also pull the data into Python with the GDS Python client and do your modeling there. If it is helpful, here is a notebook with an example of doing just that: generating features in Neo4j GDS, reading them back into Python, and fitting a statsmodels logit model on the data.
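For the conditional model from the question, a minimal sketch of that workflow might look like the following. It assumes the `graphdatascience` client, `numpy`, and `statsmodels` are installed, and that your query returns one row per observation with an integer-coded outcome (0..k-1), a group identifier, and numeric features. The column names and the helper functions are hypothetical, not part of any Neo4j or statsmodels API.

```python
def to_model_arrays(rows, outcome_key, group_key, feature_keys):
    """Split row dicts into the three inputs the conditional model needs."""
    endog = [int(r[outcome_key]) for r in rows]                    # category codes 0..k-1
    exog = [[float(r[k]) for k in feature_keys] for r in rows]     # covariates, no intercept column
    groups = [r[group_key] for r in rows]                          # group label for each row
    return endog, exog, groups


def fit_conditional_mnlogit(uri, user, password, query,
                            outcome_key="outcome", group_key="group_id",
                            feature_keys=("x1", "x2")):
    """Run a Cypher query through the GDS client and fit ConditionalMNLogit.

    The default column names are placeholders; adapt them to your own query.
    """
    # Imported lazily so the helper above is usable without these packages.
    import numpy as np
    from graphdatascience import GraphDataScience
    from statsmodels.discrete.conditional_models import ConditionalMNLogit

    gds = GraphDataScience(uri, auth=(user, password))
    df = gds.run_cypher(query)  # returns a pandas DataFrame
    endog, exog, groups = to_model_arrays(
        df.to_dict("records"), outcome_key, group_key, list(feature_keys)
    )
    model = ConditionalMNLogit(
        np.asarray(endog), np.asarray(exog), groups=np.asarray(groups)
    )
    return model.fit()
```

Calling `fit_conditional_mnlogit(...)` with your connection details and a query that returns those columns should give you a results object whose `summary()` you can inspect. Covariates that are constant within every group cannot be estimated by a conditional model, so leave out intercept columns.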
There are multiple other ways to get your data from Neo4j into Python. So if you run into any performance bottlenecks please feel free to reach back out!
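One of those other routes is the official Neo4j Python driver, which streams query results. If data volume does become a concern, a sketch like the one below reads rows in fixed-size batches so you can convert each batch to a compact array before moving on. The function names and the batch size are assumptions for illustration. Note that this only bounds the size of the intermediate Python objects; statsmodels still needs the full design matrix in memory to fit the model.

```python
from itertools import islice


def chunked(records, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def stream_rows(uri, user, password, query, batch_size=10_000):
    """Yield batches of row dicts from a Cypher query (hypothetical helper)."""
    # Imported lazily so `chunked` works without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            result = session.run(query)  # records are fetched lazily as we iterate
            for batch in chunked(result, batch_size):
                yield [record.data() for record in batch]
```

Each yielded batch is a list of plain dictionaries keyed by the column names in your `RETURN` clause.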