07-22-2022 10:30 AM
I am looking for the correct syntax to load data into Neo4j, specifically how to use periodic commits when loading from a Python/pandas DataFrame. In general, my functions and the code that calls them look like this:
def add_data(df1):
    query = """
    UNWIND $rows as row
    MERGE
    SET
    RETURN COUNT(*) as total
    """
    return conn.query(query, parameters={'rows': df1.to_dict('records')})
columns = []
df1 = pd.DataFrame(df[columns])
df1 = df1.explode(columns).drop_duplicates()
add_data(df1)
This works great for creating nodes and relationships when the total count is under 1000, but when there are 1M+ nodes/relationships, it tends to not finish.
I know there are server parameters in neo4j.conf that can be adjusted and may help with the load. I know I can save the DataFrame to CSV and load it from disk with LOAD CSV and USING PERIODIC COMMIT. I know I can split my DataFrame and process the pieces in a for loop from within Python. But I don't want to go those routes. I want to get apoc.periodic.commit to work within the add_data function.
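To make the goal concrete, here is a rough sketch of the shape a batched call could take. It uses apoc.periodic.iterate rather than apoc.periodic.commit, because iterate accepts a `params` entry in its config map, which lets the `$rows` parameter reach the inner statements. Everything specific here is an assumption for illustration: the label `Item`, the key property `id`, the helper name `build_batched_query`, and the batch size are all placeholders, and APOC must be installed on the server.

```python
def build_batched_query(rows, batch_size=10000):
    """Return (query, parameters) for a batched load via apoc.periodic.iterate.

    `rows` is a list of dicts, e.g. the result of df1.to_dict('records').
    `Item` and `id` are placeholder names, not taken from a real model.
    """
    query = """
    CALL apoc.periodic.iterate(
      'UNWIND $rows AS row RETURN row',
      'MERGE (n:Item {id: row.id}) SET n += row',
      {batchSize: $batch_size, params: {rows: $rows}}
    )
    YIELD batches, total, errorMessages
    RETURN batches, total, errorMessages
    """
    parameters = {"rows": rows, "batch_size": batch_size}
    return query, parameters


# Hypothetical usage inside add_data, with conn being the existing wrapper:
#   query, parameters = build_batched_query(df1.to_dict('records'))
#   return conn.query(query, parameters=parameters)
```

The first statement streams the rows and the second one is committed every `batch_size` rows, so the whole load no longer has to fit into a single transaction.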
I have tried several iterations in an attempt to get it to work, but to no avail. I am hoping the community can help.
Thanks in advance.
07-23-2022 01:08 PM
Hi @FourMoBro,
Quick question: what does your MERGE statement look like? Do you have an index on the properties it matches on?
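The reason for asking: without an index, each MERGE has to scan every node with that label, which gets very slow at 1M+ rows. As a minimal sketch, assuming Neo4j 4.1 or later (for `IF NOT EXISTS`) and a placeholder label and property, the index statement could be built like this. The helper name `index_statement` and the index naming scheme are made up for illustration.

```python
def index_statement(label, prop):
    """Build a Cypher statement that creates an index on label.prop if missing."""
    # Index name is an arbitrary convention chosen for this sketch.
    name = f"{label.lower()}_{prop.lower()}_idx"
    return f"CREATE INDEX {name} IF NOT EXISTS FOR (n:{label}) ON (n.{prop})"


# Hypothetical usage, run once before the load:
#   conn.query(index_statement("Item", "id"))
```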
Regards