Neo4j

armensanoyan · ‎07-26-2021

I am importing a CSV file from the import directory with Cypher, and I would like to create nodes whose labels come from the CSV file. It would look something like this:

LOAD CSV WITH HEADERS FROM "file:///B.csv" AS csv
CREATE (c:csv.Type {name:csv.Name})
return c

I know that this is wrong, but hope you can show me the right way to do it.

dana_canzano · ‎07-26-2021

If you have APOC installed, this should be possible via:

load csv with headers from 'file:///B.csv' as row 
call apoc.create.node([row.label],{name: row.name}) 
yield node return count(node);

For example, if B.csv has the following content:

label,id,name
P1,1,Dana
P2,2,Armen

then the result is

neo4j@neo4j> match (n:P1) return n;
+----------------------+
| n                    |
+----------------------+
| (:P1 {name: "Dana"}) |
+----------------------+

1 row available after 6 ms, consumed after another 2 ms
neo4j@dana> match (n:P2) return n;
+-----------------------+
| n                     |
+-----------------------+
| (:P2 {name: "Armen"}) |
+-----------------------+
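
The same call can be adapted to the column names used in the original question. This is only a sketch: it assumes the header row of B.csv uses exactly `Type` and `Name` (LOAD CSV header lookups are case-sensitive), and that every row has a non-empty `Type` value.

```cypher
// Sketch: assumes B.csv has a header row containing the columns Type and Name,
// as in the original question (csv.Type, csv.Name).
LOAD CSV WITH HEADERS FROM 'file:///B.csv' AS row
// apoc.create.node takes a list of labels and a map of properties.
CALL apoc.create.node([row.Type], {name: row.Name})
YIELD node
RETURN count(node);
```

Afterwards, `MATCH (n) RETURN labels(n) AS labels, count(*) AS nodes;` lists every label combination in the database with its node count, which checks all created labels in one query instead of one MATCH per label.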

armensanoyan · ‎07-26-2021

Thanks. Seems like APOC is more useful than the built-in functions.

dana_canzano · ‎07-26-2021

For simple CREATE statements as part of LOAD CSV, this is acceptable. Part of the reason labels cannot be parameterized is that the Neo4j query planner would struggle to find the correct plan for a more complex query. If your query was, for example:

match (n:$param1), ( n2:  $param2) where n.id=n2.id create (n)-[:FOLLOWS]->(n2);

then there is no way for the planner to know that for the first row in the CSV $param1 has the value ':Person' and $param2 has the value ':Actor', which makes plan X the most efficient, whereas for the second row $param1 is ':Person' but $param2 is ':SportsStar', so plan Y is now the most efficient. And what if there is an index on :Person(id) but not on :SportsStar(id)? For these reasons, Neo4j does not allow passing labels via parameters.
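
To make the contrast concrete, here is a sketch of the plain-Cypher alternative without APOC: one statement per label, with the label written literally so it is known when the query is planned, and the rows filtered to match. It assumes the B.csv layout shown earlier in the thread (columns label, id, name) and that the set of labels is small and known in advance.

```cypher
// Sketch: the label P1 is fixed in the query text; only the rows vary.
LOAD CSV WITH HEADERS FROM 'file:///B.csv' AS row
WITH row
WHERE row.label = 'P1'
CREATE (n:P1 {name: row.name})
RETURN count(n);

// Repeat with the next label; run as a separate statement.
LOAD CSV WITH HEADERS FROM 'file:///B.csv' AS row
WITH row
WHERE row.label = 'P2'
CREATE (n:P2 {name: row.name})
RETURN count(n);
```

This reads the file once per label, so it only suits a handful of labels; for labels that are not known ahead of time, the apoc.create.node approach above is the practical choice.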

armensanoyan · ‎07-26-2021

Thanks for the detailed explanation!

Get node label name from csv file