10-09-2022 07:28 AM
I am importing several TB of CSV data into Neo4j for a project I have been working on. I have enough fast storage for the estimated 6.6TiB, but the machine has only 32GB of memory, and the import tool is suggesting 203GB to complete the import.
When I run the import, I see the following output (I assume it exited because it ran out of memory). Is there any way to import this large dataset with the limited amount of memory I have? Or, failing that, with the maximum of ~128GB that this machine's motherboard can support?
Available resources:
Total machine memory: 30.73GiB
Free machine memory: 14.92GiB
Max heap memory : 6.828GiB
Processors: 16
Configured max memory: 21.51GiB
High-IO: true
WARNING: estimated number of nodes 37583174424 may exceed capacity 34359738367 of selected record format
WARNING: 14.62GiB memory may not be sufficient to complete this import. Suggested memory distribution is:
heap size: 5.026GiB
minimum free and available memory excluding heap size: 202.6GiB
Import starting 2022-10-08 19:01:43.942+0000
Estimated number of nodes: 15.14 G
Estimated number of node properties: 97.72 G
Estimated number of relationships: 37.58 G
Estimated number of relationship properties: 0.00
Estimated disk space usage: 6.598TiB
Estimated required memory usage: 202.6GiB
(1/4) Node import 2022-10-08 19:01:43.953+0000
Estimated number of nodes: 15.14 G
Estimated disk space usage: 5.436TiB
Estimated required memory usage: 202.6GiB
.......... .......... .......... .......... .......... 5% ∆1h 38m 2s 867ms
neo4j@79d2b0538617:~/import$
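One knob worth checking before resorting to swap: in Neo4j 4.x, neo4j-admin import accepts a --max-memory option that caps how much off-heap memory the tool tries to use (it may run slower, but it won't plan for the full suggested 203GiB). A hedged sketch of an invocation; the file names, sizes, and heap setting below are illustrative, not from the original post:

```shell
# Sketch only: cap the import tool's memory use (Neo4j 4.x flags; paths illustrative)
export HEAP_SIZE=5g                 # heap for the neo4j-admin process itself
neo4j-admin import \
    --nodes=import/nodes.csv \
    --relationships=import/rels.csv \
    --max-memory=20g \
    --high-io=true
```

--max-memory also accepts a percentage (e.g. 90%); check the documentation for your exact Neo4j version, as flag names changed between 4.x and 5.x.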
10-09-2022 01:24 PM
As a side note: if I have to reduce the amount of data I am importing to fit within memory constraints, what will have the biggest impact: removing nodes, edges, or attributes, or something else? Thanks 🙂
10-09-2022 10:53 PM
Can you just partition the data into smaller chunks and import them separately?
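For the chunking idea, a minimal sketch of splitting a large CSV into smaller files (each keeping the header) so they could be loaded in separate transactional batches, e.g. with LOAD CSV. The function name and chunk layout are assumptions for illustration, not part of any Neo4j tooling:

```python
import csv
import os

def split_csv(path, rows_per_chunk, out_dir):
    """Split a large CSV into smaller chunks, repeating the header in each chunk."""
    os.makedirs(out_dir, exist_ok=True)
    chunks = []
    with open(path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)           # keep the header for every chunk
        out, writer, count, idx = None, None, 0, 0
        for row in reader:
            if writer is None or count >= rows_per_chunk:
                if out:
                    out.close()
                idx += 1
                chunk_path = os.path.join(out_dir, f"chunk_{idx:04d}.csv")
                out = open(chunk_path, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
                chunks.append(chunk_path)
                count = 0
            writer.writerow(row)
            count += 1
        if out:
            out.close()
    return chunks
```

Note this only helps for transactional loading; as discussed below, neo4j-admin import builds the store in one pass, so feeding it chunks one run at a time does not work the same way.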
10-10-2022 06:57 AM
I thought I could only use admin import once, since it overwrites the graph? Should I look into using another import tool?
10-10-2022 12:08 PM
This is correct *today*
10-10-2022 05:26 PM
This may be a bad idea, but I have added 240GB of swap on an SSD (my boot drive, which is probably even more ill-advised, but the NVMe ZFS pool holding the database will see a great many writes, and I am cutting it a bit close on storage space as it is). I will check tomorrow to see where it's at.
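For anyone trying the same workaround, the swap-file setup above looks roughly like this on Linux (a sketch; the size and path are illustrative, and the commands require root):

```shell
# Create and enable a 240 GiB swap file (requires root; /swapfile is illustrative)
fallocate -l 240G /swapfile   # or: dd if=/dev/zero of=/swapfile bs=1G count=240
chmod 600 /swapfile           # swap files must not be readable by other users
mkswap /swapfile              # format the file as swap
swapon /swapfile              # enable it immediately
swapon --show                 # verify it is active
```

Note that heavy swapping will be far slower than real RAM, and putting swap on the boot SSD adds significant write wear.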