10-02-2021 12:35 PM
For example, this is my load query:
CALL apoc.load.csv('test.csv', {nullValues:['','na','NAN',false], sep:' '})
yield map as row
MERGE (m:Test {nid: row.nid})
ON CREATE SET m += row
ON MATCH SET m += row
RETURN count(m) as mcount
My data format could be like this:
nid tag uid date
001 c|python|java 1003252452 20210929
The 'tag' column holds three values separated by '|'. Can apoc.load.csv split them automatically and load three records, as if the file contained:
nid tag uid date
001 c 1003252452 20210929
002 python 1003252452 20210929
003 java 1003252452 20210929
If I expand each row into three lines in the CSV beforehand, the file becomes very large. Is it possible to do the split during loading instead?
10-03-2021 07:25 AM
Hi @lingvisa
This is my data.
nid tag uid date
001 c|python|java 1003252452 20210929
002 go|c++|javascript 1234567890 20211001
I wasn't sure whether "nid" is simply a sequence number, so each node keeps the nid of the row it came from.
The query below generates 6 nodes from the 2 records.
CALL apoc.load.csv('test.csv', {nullValues:['','na','NAN',false], sep:' '})
yield map as row
// Split the '|'-separated tag string into a list
WITH row, split(row.tag, '|') AS tags
// UNWIND emits one row per tag, so each tag gets its own node
UNWIND tags AS onetag
CREATE (m:Test {nid: row.nid})
SET m.tag = onetag
SET m.uid = row.uid
SET m.date = row.date
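A quick way to check the expansion before writing anything is to run the same split/UNWIND on inline data. This is only a minimal sketch: the two sample rows are hard-coded as maps to stand in for the file, so it needs neither the CSV nor APOC.

```cypher
// Inline stand-in for the two sample rows (assumption: all values are strings,
// as they would be when read from a CSV).
UNWIND [
  {nid: '001', tag: 'c|python|java', uid: '1003252452', date: '20210929'},
  {nid: '002', tag: 'go|c++|javascript', uid: '1234567890', date: '20211001'}
] AS row
// split() treats '|' as a literal delimiter and returns a list;
// UNWIND then emits one row per list element.
UNWIND split(row.tag, '|') AS onetag
RETURN row.nid AS nid, onetag AS tag, row.uid AS uid, row.date AS date
```

It should return one row per tag, six in total for these two records, without creating any nodes.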
10-05-2021 10:02 AM
My case is a little more complicated than the example I presented, but I will try your approach. I also need to compare the overall speed of two options: expanding the CSV beforehand so that each tag gets its own line, or keeping the compact file and splitting the tags during loading.
The 2nd approach can reduce the CSV file size a lot and may also speed up loading, but it makes the loading code more complicated.
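To get a feel for the size difference before benchmarking, the compact file can be scanned once without creating anything. This is a rough sketch that assumes the same test.csv and space separator as in the example; compactRows and expandedRows are just illustrative names.

```cypher
// Read-only scan: counts the rows in the compact file and the rows
// a pre-expanded file would need (one per tag).
LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS row
FIELDTERMINATOR ' '
RETURN count(*) AS compactRows,
       // Rows with a missing tag give null here, and sum() ignores nulls.
       sum(size(split(row.tag, '|'))) AS expandedRows
```

The ratio of the two numbers roughly indicates how many times more lines the expanded CSV would have.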
10-05-2021 04:48 PM
Hi @lingvisa
How about "LOAD CSV" instead of "apoc.load.csv"?
LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS row
FIELDTERMINATOR ' '
UNWIND split(row.tag, '|') AS onetag
CREATE (m:Test {nid: row.nid})
SET m.tag = onetag
SET m.uid = row.uid
SET m.date = row.date
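Since the original query used MERGE, here is a possible variant that can be re-run without duplicating nodes. It is only a sketch and assumes that the pair (nid, tag) identifies a node, which nothing in this thread confirms.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS row
FIELDTERMINATOR ' '
// One row per tag; a row whose tag is missing produces no output rows here.
UNWIND split(row.tag, '|') AS onetag
// Assumption: (nid, tag) is unique, so a second run matches existing nodes
// instead of creating new ones.
MERGE (m:Test {nid: row.nid, tag: onetag})
SET m.uid = row.uid,
    m.date = row.date
RETURN count(m) AS mcount
```

Note that MERGE fails on a null property value, so rows with a missing nid would need to be filtered out first. An index on the merged properties would probably matter for speed on a large file.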