06-09-2020 07:52 AM
I currently need to execute large queries of the form:
UNWIND $data AS row
MERGE (n:Node {id: row.id, name: row.name, ...})
I have millions of data rows to process this way, but I can't run it as a single UNWIND query because Neo4j crashes with a memory error. If I send batches so that $data contains around 20,000 rows at a time, it seems to be OK. Is there a way to increase that batch size? Are there any tricks for dealing with this kind of situation?
06-09-2020 09:35 AM
There are probably more efficient solutions, but this scenario is essentially a data migration, and you can think about it in a simple way:
Maintain an attribute migrated = false on the source records.
On each pass, select only the records where migrated is false, with a LIMIT of 20,000, MERGE those, and then mark them as migrated. Repeat until nothing is left (see the rough sketch below).
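Something like this, assuming the rows to be migrated already exist in the graph as :SourceRow nodes with a migrated flag (the label and property names here are placeholders, adjust them to your model), run repeatedly until it returns nothing:

MATCH (s:SourceRow)
WHERE s.migrated = false
WITH s LIMIT 20000
MERGE (n:Node {id: s.id})
SET n.name = s.name
SET s.migrated = true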
06-09-2020 12:09 PM
I don't really understand what you're suggesting. Are you telling me to try to avoid merging any nodes that are already present in the graph?
I wonder if there is some other underlying problem here, since it's taking 20 minutes to merge around 100,000 nodes. Each node has around 10 attributes, one of which is name_id, for which I have set:
CREATE CONSTRAINT ON (n:Node)
ASSERT n.name_id IS UNIQUE
I see posts about people merging millions of nodes in a few minutes, so I am wondering what I'm doing wrong.
06-09-2020 02:57 PM
Hey Rogie,
Since name_id has a constraint, you can merge on name_id alone and set the other properties with a SET clause:
UNWIND $data AS row
MERGE (n:Node {name_id: row.name_id})
SET n.id = row.id
and so on..
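If you want every key in row copied onto the node, a more compact variant (assuming the keys in your $data maps match the property names you want on the node) is:

UNWIND $data AS row
MERGE (n:Node {name_id: row.name_id})
SET n += row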
Can you try this approach?
06-09-2020 03:23 PM
I just tried that and it took 1 minute to merge 20,000 nodes. Is that normal?
06-09-2020 04:50 PM
Hmm... that can be made faster. What syntax are you using to create the batches? Maybe try playing with the batch size a bit, say 10,000.
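For reference, one common way to batch inside a single call is apoc.periodic.iterate, if you have the APOC library installed (it isn't mentioned in this thread, so treat this as a sketch to verify against your APOC version):

CALL apoc.periodic.iterate(
  "UNWIND $data AS row RETURN row",
  "MERGE (n:Node {name_id: row.name_id}) SET n.id = row.id",
  {batchSize: 10000, parallel: false, params: {data: $data}}
)

It commits after every batchSize rows, so the whole $data list doesn't have to be written in one transaction.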