02-05-2021 03:47 PM
Hi,
Can I use the official neo4j driver to load data from a pandas dataframe into Neo4j on a daily basis? If not, then can I use the py2neo connector to also efficiently execute cypher queries that create nodes and relationships, and/or delete nodes? According to the py2neo docs, it seems like the py2neo driver is the way to go for me when deciding between these two drivers.
I'm about to start loading data from a pandas dataframe into our neo4j database and py2neo seems to be the way to go based on these stackoverflow questions:
I was just curious to know the experiences of neo4j users who have implemented this python driver approach.
Thanks
03-17-2021 09:52 AM
Thank you for your insights! I've been learning more about the official Python Neo4j driver and I would say that this is surely the way to go. The documentation is pretty good, and here are two useful articles I found on this subject in case anyone is interested:
https://towardsdatascience.com/neo4j-cypher-python-7a919a372be7
https://towardsdatascience.com/create-a-graph-database-in-neo4j-using-python-4172d40f89c4
Thanks to this driver I have scheduled daily updates to my Neo4j database from multiple sources across my company. Here's to hoping that the Neo4j team continues to update this driver.
Thanks!
02-09-2021 11:12 AM
I can't edit my post above any more. I need to add another question here:
Which of these two python drivers is the better and faster approach to load data into Neo4j?
# ":auto" is a Neo4j Browser command, so it is left out when the query is sent through a driver.
query = """
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS
FROM 'file:///data.csv' AS row
MERGE (p:Person {id: toInteger(row.id)})
"""
02-09-2021 11:19 AM
I found this Medium article very interesting. It compares the different Python drivers for Neo4j.
That said, my experience with Neo4j's official Python driver has been good: it's easy to use and I haven't run into problems with slow data transfer.
02-09-2021 11:22 AM
Thank you! Great article, and this sums it up nicely:
My recommendation? Definitely py2neo is not an option. Although it is user-friendly in many respects, it is too slow for counting queries. Neo4jrestclient is not bad, but sometimes it returns nested list structure which we have to deal with using some trick (e.g. “sum(temp,)”), which I want to avoid. So I think I would go with the Neo4j Python driver. After all it is the only official release supported by Neo4j. What is your recommendation?
I'll follow up on this post once I've decided which driver to use.
Wouldn't it be cool if the official neo4j driver also supported pandas dataframes as a source of data?
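For what it's worth, a dataframe can already be fed to the official driver with a little glue code. The sketch below is one possible approach, not an official API: it assumes the rows are first converted to a list of plain dicts (for example with df.to_dict("records")) and then sent in batches as a $rows parameter to an UNWIND ... MERGE query. The Person label, the name property, the helper names, the URI and the credentials are all placeholders.

```python
# Hypothetical query: label and property names are placeholders.
MERGE_PEOPLE = """
UNWIND $rows AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name
"""


def batches(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_rows(session, rows, batch_size=10000):
    """Send `rows` (a list of dicts) to Neo4j in batches.

    `session` is assumed to be a neo4j.Session, or any object with a
    compatible run() method. Returns the number of rows sent.
    """
    sent = 0
    for chunk in batches(rows, batch_size):
        # Each run() call is its own auto-commit transaction.
        session.run(MERGE_PEOPLE, rows=chunk).consume()
        sent += len(chunk)
    return sent


def main():
    # Imported here so the helpers above work without these packages installed.
    import pandas as pd
    from neo4j import GraphDatabase

    df = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})

    # Placeholder URI and credentials.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
    with driver.session() as session:
        print(load_rows(session, df.to_dict("records")))
    driver.close()
```

Calling main() needs pandas, the neo4j package and a running database. Batching keeps each transaction small; the default of 10,000 simply mirrors the batch size in the LOAD CSV query earlier in this thread. Missing values (NaN) may need to be converted to None before sending.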
02-09-2021 02:52 PM
Pandas is a specialized library, and processing a pandas dataframe in another framework may not be as efficient as using the power that library gives you in Python.
So, my answer is: not really. The data types typically used by web services and apps are things like JSON (a standard that works everywhere). For non-developer use cases, spreadsheets are very common, and that is where CSV comes in: compact and easy to generate.
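To illustrate the CSV route, here is a minimal sketch. It assumes the dataframe is written with df.to_csv() into the server's import directory, and that a cleaned-up version of the LOAD CSV query posted earlier in this thread (without the Browser-only :auto prefix) is then run through the official driver. The import path, URI, credentials and helper names are assumptions for illustration only.

```python
# Hypothetical constant name; the query follows the one posted earlier in the thread.
LOAD_PEOPLE_CSV = """
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS
FROM 'file:///data.csv' AS row
MERGE (p:Person {id: toInteger(row.id)})
"""


def load_people_csv(session):
    """Run the LOAD CSV query and return the number of nodes created.

    `session` is assumed to be a neo4j.Session, or any object with a
    compatible run() method.
    """
    # session.run() uses an auto-commit transaction, which is what
    # USING PERIODIC COMMIT requires.
    summary = session.run(LOAD_PEOPLE_CSV).consume()
    return summary.counters.nodes_created


def main():
    # Imported here so the helper above works without these packages installed.
    import pandas as pd
    from neo4j import GraphDatabase

    df = pd.DataFrame({"id": [1, 2, 3]})
    # Assumption: this is the import directory of the Neo4j server.
    df.to_csv("/var/lib/neo4j/import/data.csv", index=False)

    # Placeholder URI and credentials.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
    with driver.session() as session:
        print(load_people_csv(session))
    driver.close()
```

Calling main() needs pandas, the neo4j package and a running database. This route only works when the Python process can write to a location the database server can read.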
06-02-2022 01:35 AM
Check out this wrapper around the official driver:
https://github.com/GSK-Biostatistics/neointerface#load_df