LOAD CSV efficiency

mdfrenchman
Graph Voyager

If I have a CSV file that has, say, 100,000 records and I run this query:

LOAD CSV WITH HEADERS FROM 'http://myUrlPath/file1.csv' AS row
RETURN * LIMIT 5

Under the hood, does LOAD CSV open the file and fetch ALL 100,000 rows of data, or does it open the file, read lines only until the LIMIT is reached, and then close the connection?

Does it behave the same with a file:/// source?

Thanks,
Mike French

1 ACCEPTED SOLUTION

Once it has the limited set of rows, the load ends and the connection should close. It should behave the same in both cases (an http URL or a file:/// source).

Just as a side note, make sure you use USING PERIODIC COMMIT for larger CSVs. That shouldn't affect file access or closure, but it ensures the load is committed in batches of rows rather than as one large transaction, which is easier on the heap.
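
For illustration, a batched load could look roughly like the sketch below. The `Person` label, the `name` column, and the batch size of 1000 are assumptions made up for this example, not details from the question; adjust them to your own file.

```cypher
// Commit after every 1000 rows instead of holding the whole load in one transaction.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'http://myUrlPath/file1.csv' AS row
// Hypothetical mapping: one Person node per row, keyed on an assumed "name" column.
MERGE (p:Person {name: row.name})
```

Note that this applies to write queries. On newer Neo4j versions (4.4 and later, as far as I know) USING PERIODIC COMMIT is deprecated in favour of CALL { ... } IN TRANSACTIONS, so check the documentation for the version you are running.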
