Load CSV data

AttieV

Hi, I previously loaded data from a CSV file using Cypher with :auto USING PERIODIC COMMIT 20000, with no memory issues. It seems that periodic commit is not available in the new version and has been replaced by CALL {...} IN TRANSACTIONS OF X ROWS. But now I get an out-of-memory error, even if I make X very small (10). The dataset is big, but it always loaded fine with periodic commit. It now loads about 500,000 nodes and then fails with the out-of-memory error. I probably need to switch to the admin bulk import tool going forward, but in the interim I would like advice on how to resolve this memory issue.

Thanks, appreciate the assistance.

Can you share your code? 

Hi, thanks for the response. This is the code: 

:auto LOAD CSV WITH HEADERS FROM "file:///nodes.csv" AS clients
CALL {
    WITH clients
    CREATE (c:Client
      {cust_no: toInteger(clients.cust_no),
       name: clients.name}
    )
} IN TRANSACTIONS OF 100 ROWS

 Thanks

Hmm, that is the same way I would have written the query. You could try the APOC equivalent to see if it does any better. I set 'parallel' to true; since you are only creating nodes, there should be no locking issues.

CALL apoc.periodic.iterate(
  'LOAD CSV WITH HEADERS FROM "file:///nodes.csv" AS clients RETURN clients',
  'CREATE (c:Client
      {cust_no: toInteger(clients.cust_no),
       name: clients.name}
    )',
  {batchSize: 10000, parallel: true}) YIELD total
RETURN total
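
If you want to confirm that every batch actually committed, a variation like the one below may help. It is only a sketch: it assumes the same nodes.csv file and Client label as above, sets 'parallel' to false as a conservative choice of mine (not something the load requires), and yields a few more of the columns that apoc.periodic.iterate reports, so failed batches do not go unnoticed.

```cypher
// Sketch only: same file and label as the query above.
// parallel:false is an assumption here, chosen to keep batches sequential.
CALL apoc.periodic.iterate(
  'LOAD CSV WITH HEADERS FROM "file:///nodes.csv" AS clients RETURN clients',
  'CREATE (c:Client {cust_no: toInteger(clients.cust_no), name: clients.name})',
  {batchSize: 10000, parallel: false}
)
// batches = number of batches run, total = rows processed,
// failedBatches / errorMessages = what went wrong, if anything
YIELD batches, total, failedBatches, errorMessages
RETURN batches, total, failedBatches, errorMessages
```

A non-zero failedBatches, or a non-empty errorMessages map, would mean some rows were not loaded even though the call itself returned.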

 

Thanks! The APOC procedure worked.