
Neo4j-admin import can be run multiple times?

skmami
Node Clone

Greetings,

Does the import tool allow me to load different sections of the graph one at a time? I want to load a few nodes and relationships at a time and build up the entire graph. Is that possible?

Thanks
Satish

8 REPLIES

jggomez
Graph Voyager

I think that is not possible. Why do you need that? What do you want to do?

Thanks

It is slowing down after loading a few nodes. I have split my data into thirty files with 1 million records each. It slows down halfway through whether I have 30 files, 20 files, or even 1 file.

How do I debug this?

Import starting 2020-05-14 23:15:09.255-0500
  Estimated number of nodes: 6.00 M
  Estimated number of node properties: 31.00 M
  Estimated number of relationships: 5.00 M
  Estimated number of relationship properties: 0.00 
  Estimated disk space usage: 794.5MiB
  Estimated required memory usage: 1.070GiB

(1/4) Node import 2020-05-14 23:15:09.297-0500
  Estimated number of nodes: 6.00 M
  Estimated disk space usage: 632.4MiB
  Estimated required memory usage: 1.070GiB
.......... .......... .......... .......... ..........   5% ∆1s 75ms
.......... .......... .......... .......... ..........  10% ∆204ms
.......... .......... .......... .......... ..........  15% ∆402ms
.......... .......... .......... .......... ..........  20% ∆201ms
.......... .......... .......... .......... ..........  25% ∆400ms
.......... .......... .......... .......... ..........  30% ∆1s 403ms
.......... .........- .......... .......... ..........  35% ∆200ms
.......... .......... .......... .......... ..........  40% ∆1ms
.......... .......... .......... .......... ..........  45% ∆1ms
.......... .......... .......... .......... ..........  50% ∆1s 201ms
.......... .......... .......... .......... ..........  55% ∆4s 804ms
.......... .......... .......... .......... ..........  60% ∆20s 416ms
.......... .......... .......... .......... .......

I have 100 GB of RAM, but the import is hardly using any of it, and the CPU is not fully used either.

Thanks

skmami
Node Clone
(1/4) Node import 2020-05-14 23:15:09.297-0500
  Estimated number of nodes: 6.00 M
  Estimated disk space usage: 632.4MiB
  Estimated required memory usage: 1.070GiB
.......... .......... .......... .......... ..........   5% ∆1s 75ms
.......... .......... .......... .......... ..........  10% ∆204ms
.......... .......... .......... .......... ..........  15% ∆402ms
.......... .......... .......... .......... ..........  20% ∆201ms
.......... .......... .......... .......... ..........  25% ∆400ms
.......... .......... .......... .......... ..........  30% ∆1s 403ms
.......... .........- .......... .......... ..........  35% ∆200ms
.......... .......... .......... .......... ..........  40% ∆1ms
.......... .......... .......... .......... ..........  45% ∆1ms
.......... .......... .......... .......... ..........  50% ∆1s 201ms
.......... .......... .......... .......... ..........  55% ∆4s 804ms
.......... .......... .......... .......... ..........  60% ∆20s 416ms
.......... .......... .......... .......... ..........  65% ∆10m 16s 392ms
.......... .......... .......... .......... ..........  70% ∆202ms
.......... .......... .......... .......... ..........  75% ∆0ms
.......... .......... .......... .......... ..........  80% ∆0ms
.......... .......... .......... .......... ..........  85% ∆1s 3ms
.......... .......... .......... .......... ..........  90% ∆835ms
.......... .......... .......... .......... ..........  95% ∆801ms
.......... .......... .......... .......... .......... 100% ∆200ms

At 65% it took about 10 minutes. Is there any particular reason why it sometimes takes so long?

I was confused. You can use "LOAD CSV FROM", following the guidance on importing large amounts of data:

https://neo4j.com/docs/cypher-manual/current/clauses/load-csv/#load-csv-importing-large-amounts-of-d...

Is it useful for you?

Thanks
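
For reference, a rough sketch of LOAD CSV with periodic commits, as is typically recommended for larger files (the file URL, label, and property names below are placeholders, not taken from this thread):

  USING PERIODIC COMMIT 10000
  LOAD CSV WITH HEADERS FROM 'file:///persons.csv' AS row
  CREATE (:Person {id: row.id, name: row.name});

Committing every 10,000 rows keeps each transaction small, which is usually what makes LOAD CSV workable for files of this size.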

Thanks @jggomez.

I am loading the data for the first time, and I tried LOAD CSV FROM with periodic commit, but it still was not fast enough.

Since I am loading into an empty database, I have the option of the import tool, which can only be used on an empty database. It seems to bypass the transaction layer, which makes loading faster.

But what I am noticing with the import tool is that it goes fast for the first 50 to 60 percent and then suddenly slows down.

I tried loading 30 files first, then reduced to 20 files, then 10, then 1. It always stalls at 50 to 60 percent regardless of how many files I am trying to load.

I have a million rows in each file. Is there a suggested number of records per file?

Thanks
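
For reference, neo4j-admin import accepts several CSV files per node or relationship group in a single invocation, so split files can all be passed to one run rather than loaded incrementally. A rough sketch using the 4.x option syntax (labels, file names, and the memory value are placeholders):

  neo4j-admin import --database=neo4j \
    --nodes=Person=persons_header.csv,persons_part1.csv,persons_part2.csv \
    --relationships=KNOWS=knows_header.csv,knows_part1.csv,knows_part2.csv \
    --max-memory=90%

If your version supports --max-memory, it controls how much of the machine's RAM the importer may use, which may be worth checking given the underused 100 GB mentioned above.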

Hi, I have never loaded a million rows from a CSV file. Are you using CREATE instead of MERGE? Can I see your code?

Thanks

Thanks @jggomez.

I was using one file with all the columns, and there are too many duplicates in the files; I think that is why it is hanging in the middle.

Have you used the import tool?
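
On the duplicates point, one cheap experiment is deduplicating the node CSVs before the import, since the tool otherwise has to detect and handle the repeated node IDs itself. A sketch with standard command-line tools (the file name and the assumption that the node ID is the first column are hypothetical, and this simple approach breaks if fields contain quoted commas):

  # keep the header row, then keep only the first occurrence of each node ID (column 1)
  head -n 1 persons.csv > persons_dedup.csv
  tail -n +2 persons.csv | sort -t, -k1,1 -u >> persons_dedup.csv

Depending on the Neo4j version, neo4j-admin import also has an option to skip duplicate node IDs rather than abort (e.g. --skip-duplicate-nodes in 4.x; check neo4j-admin import --help for the exact name in your version).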

Wondering if anyone can comment on why it sometimes prints dashes ( - ) instead of all dots.
Is there any special meaning for those dashes?
Thanks

I cannot really help you with the main problem; however, I also saw these dashes in random places during importing. But I never found any missing data after the load (I was importing the same data set into multiple databases and running the same queries on all of them).