01-07-2022 01:15 AM
Hello everyone,
I am working on a huge graph with over 200M nodes and 500M edges. I need to add a new node and connect approximately 100M existing nodes to it. I am using Neo4j Community Edition, and the operation takes too long. I cannot parallelize it because of deadlock exceptions. My idea is to create mirror nodes such as newnode1, newnode2, newnode3, newnode4, newnode5, ..., create the edges in parallel (batch1 -> newnode1, batch2 -> newnode2, batch3 -> newnode3, ...), and then use the apoc.refactor.mergeNodes procedure to merge the temporary nodes into the final new node. Is this reasonable? What are the pros and cons?
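To make the idea concrete, here is a rough Python sketch of the batching plan. It only builds the plan and the Cypher text; it does not connect to a database. The `Mirror` label, the `LINKED_TO` relationship type, the `name` property, and the helper names `chunks` and `plan_batches` are all placeholders I made up for illustration, and the `mergeNodes` config is just the commonly shown form, not something I have tuned.

```python
from typing import Dict, Iterator, List

# Cypher run by each worker. Label, relationship type and property name
# are hypothetical placeholders, not taken from a real schema.
MIRROR_CYPHER = (
    "UNWIND $ids AS id "
    "MATCH (n) WHERE id(n) = id "
    "MATCH (m:Mirror {name: $mirror}) "
    "CREATE (n)-[:LINKED_TO]->(m)"
)

# Final step: merge all mirror nodes into one node with APOC.
MERGE_CYPHER = (
    "MATCH (m:Mirror) WITH collect(m) AS mirrors "
    "CALL apoc.refactor.mergeNodes(mirrors, "
    "{properties: 'combine', mergeRels: true}) YIELD node "
    "RETURN node"
)


def chunks(ids: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of ids with at most `size` elements."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def plan_batches(
    node_ids: List[int], workers: int, batch_size: int
) -> Dict[str, List[List[int]]]:
    """Assign batches round-robin so each mirror node has exactly one writer.

    Every source node id appears in exactly one batch, and every mirror
    node is written by one worker only, so no two parallel transactions
    should try to lock the same node.
    """
    plan: Dict[str, List[List[int]]] = {
        f"newnode{i + 1}": [] for i in range(workers)
    }
    names = list(plan)
    for i, batch in enumerate(chunks(node_ids, batch_size)):
        plan[names[i % workers]].append(batch)
    return plan


if __name__ == "__main__":
    for mirror, batches in plan_batches(list(range(10)), 3, 2).items():
        print(mirror, batches)
```

Each worker would then run MIRROR_CYPHER once per batch, sequentially, with `$ids` set to the batch and `$mirror` set to its own mirror node name, and MERGE_CYPHER would run once at the end.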
Thanks.
01-07-2022 03:50 AM
Sorry to disappoint, but I would recommend you take a different approach. Yes, it's going to be hard to write a single node with 100M nodes connected to it. And if you succeed, you're going to have a different problem afterwards, because you will have created a supernode (a node with an extremely large number of relationships).
I would recommend not doing what you are trying to do. It is likely you need to choose a different data model that doesn't require a single node attached to 100M other things. In other words, I think the import problem you're running into and the query problems you would have afterwards are symptoms of a needed model change.
For much more information, see this article: