05-14-2020 02:35 PM
Greetings,
Does the import tool allow me to load different sections of the graph one at a time? I want to load a few nodes and relationships at a time to build up the entire graph. Is that possible?
Thanks
Satish
05-14-2020 09:16 PM
I don't think that is possible. Why do you need that? What are you trying to do?
Thanks
05-14-2020 09:26 PM
The import slows down after loading some of the nodes. I have split my data into thirty files of 1 million records each. It slows down about halfway through, whether I load 30 files, 20 files, or even 1 file.
How can I debug this?
Import starting 2020-05-14 23:15:09.255-0500
Estimated number of nodes: 6.00 M
Estimated number of node properties: 31.00 M
Estimated number of relationships: 5.00 M
Estimated number of relationship properties: 0.00
Estimated disk space usage: 794.5MiB
Estimated required memory usage: 1.070GiB
(1/4) Node import 2020-05-14 23:15:09.297-0500
Estimated number of nodes: 6.00 M
Estimated disk space usage: 632.4MiB
Estimated required memory usage: 1.070GiB
.......... .......... .......... .......... .......... 5% ∆1s 75ms
.......... .......... .......... .......... .......... 10% ∆204ms
.......... .......... .......... .......... .......... 15% ∆402ms
.......... .......... .......... .......... .......... 20% ∆201ms
.......... .......... .......... .......... .......... 25% ∆400ms
.......... .......... .......... .......... .......... 30% ∆1s 403ms
.......... .........- .......... .......... .......... 35% ∆200ms
.......... .......... .......... .......... .......... 40% ∆1ms
.......... .......... .......... .......... .......... 45% ∆1ms
.......... .......... .......... .......... .......... 50% ∆1s 201ms
.......... .......... .......... .......... .......... 55% ∆4s 804ms
.......... .......... .......... .......... .......... 60% ∆20s 416ms
.......... .......... .......... .......... .......
I have 100 GB of RAM, but the import is hardly using any of it, and the CPU is not fully used either.
Thanks
05-14-2020 09:29 PM
(1/4) Node import 2020-05-14 23:15:09.297-0500
Estimated number of nodes: 6.00 M
Estimated disk space usage: 632.4MiB
Estimated required memory usage: 1.070GiB
.......... .......... .......... .......... .......... 5% ∆1s 75ms
.......... .......... .......... .......... .......... 10% ∆204ms
.......... .......... .......... .......... .......... 15% ∆402ms
.......... .......... .......... .......... .......... 20% ∆201ms
.......... .......... .......... .......... .......... 25% ∆400ms
.......... .......... .......... .......... .......... 30% ∆1s 403ms
.......... .........- .......... .......... .......... 35% ∆200ms
.......... .......... .......... .......... .......... 40% ∆1ms
.......... .......... .......... .......... .......... 45% ∆1ms
.......... .......... .......... .......... .......... 50% ∆1s 201ms
.......... .......... .......... .......... .......... 55% ∆4s 804ms
.......... .......... .......... .......... .......... 60% ∆20s 416ms
.......... .......... .......... .......... .......... 65% ∆10m 16s 392ms
.......... .......... .......... .......... .......... 70% ∆202ms
.......... .......... .......... .......... .......... 75% ∆0ms
.......... .......... .......... .......... .......... 80% ∆0ms
.......... .......... .......... .......... .......... 85% ∆1s 3ms
.......... .......... .......... .......... .......... 90% ∆835ms
.......... .......... .......... .......... .......... 95% ∆801ms
.......... .......... .......... .......... .......... 100% ∆200ms
The step ending at 65% took about 10 minutes. Is there any particular reason why it sometimes takes so long?
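To see where the time goes in a longer run, the progress lines can be ranked by their delta. This is only a rough sketch in Python: the function names are made up for illustration, and it assumes the durations use just the `m`, `s` and `ms` units that appear in the output above.

```python
import re

# Matches progress lines such as ".......... 65% ∆10m 16s 392ms".
# Both U+2206 (as in the log above) and the Greek capital delta are accepted.
PROGRESS = re.compile(r"(\d+)%\s+[∆Δ](.+)$")

# Assumption: only the units seen in the log above (minutes, seconds, milliseconds).
UNIT_MS = {"m": 60_000, "s": 1_000, "ms": 1}


def parse_delta_ms(text):
    """Convert a duration like '10m 16s 392ms' to milliseconds."""
    total = 0
    # "ms" is listed first so that "392ms" is not read as "392m".
    for value, unit in re.findall(r"(\d+)(ms|m|s)", text):
        total += int(value) * UNIT_MS[unit]
    return total


def slowest_steps(log_text, top=3):
    """Return up to `top` (percent, milliseconds) pairs, slowest first."""
    steps = []
    for line in log_text.splitlines():
        match = PROGRESS.search(line.strip())
        if match:
            steps.append((int(match.group(1)), parse_delta_ms(match.group(2))))
    return sorted(steps, key=lambda step: step[1], reverse=True)[:top]
```

Feeding it the output above would put the 65% step first, at 616392 ms, with the 60% step (20416 ms) a distant second.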
05-14-2020 09:55 PM
I was confused. You can use "LOAD CSV FROM" for importing large amounts of data.
Is that useful for you?
Thanks
05-15-2020 06:57 AM
Thanks @jggomez.
I am loading the data for the first time. I tried LOAD CSV FROM with periodic commit, but it was still not fast enough.
Since I am loading into an empty database, I have the option of using the import tool, which can only be used on an empty database. It seems to bypass the transaction layer, which makes loading faster.
But what I am noticing with the import tool is that it runs fast up to 50 to 60 percent and then suddenly slows down.
I tried loading 30 files first, then reduced to 20, then 10, and then 1. It always stalls at 50 to 60 percent, regardless of how many files I am trying to load.
I have a million rows in each file. Is there a recommended number of records per file?
Thanks
05-15-2020 05:24 PM
Hi, I have never loaded millions of rows from a CSV file. Try using CREATE instead of MERGE. Can I see your code?
Thanks
05-16-2020 04:42 PM
Thanks @jggomez.
I was using one file with all the columns, and it contains many duplicates. I think that is why the import hangs in the middle.
Have you used the import tool?
I am also wondering if anyone can comment on why it sometimes prints dashes (-) instead of dots.
Is there any special meaning to those dashes?
Thanks
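If duplicates in the node file are the suspect, one way to test that is to drop repeated IDs before running the import. A minimal sketch in Python, assuming the file has a header row and that duplicates are identified by a single ID column; the function name and the column name `id:ID` are hypothetical, not taken from the actual data:

```python
import csv


def dedupe_csv(src_path, dst_path, key_column):
    """Copy src_path to dst_path, keeping only the first row per key value.

    Returns the number of rows dropped as duplicates.
    Assumes the source file starts with a header row containing key_column.
    """
    seen = set()  # all keys seen so far; held in memory
    dropped = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            key = row[key_column]
            if key in seen:
                dropped += 1
                continue
            seen.add(key)
            writer.writerow(row)
    return dropped


# Hypothetical usage:
# dropped = dedupe_csv("nodes.csv", "nodes_unique.csv", "id:ID")
```

The set of seen keys lives in memory, which should be fine for a few million IDs. If the import runs smoothly on the deduplicated file, that would point to the duplicates as the cause of the stall.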
06-14-2020 03:06 PM
I cannot really help with the main problem, but I have also seen these dashes at random places during imports. I never found any missing data after the load, though (I was importing the same data set into multiple databases and running the same queries on all of them).