cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Can't LOAD CSV, error about undefined variable or whitespace

I'm trying to LOAD my first CSV file via the Desktop browser.
My first line, LOAD CSV WITH HEADERS FROM file:///T00_ITEM_35.csv AS LINE
(I have back ticks surrounding the file:/// reference but they are not showing in the post)
gives me an error stating that I have an undefined variable and points to the first back tick before the file:/// reference.

I tried single and double quotes (some examples use double quotes for the file) but I then get an error stating Invalid Input ' " ': expected whitespace, comment or an expression.

I have the file located in the Input folder and I have not changed the default Input file location in the settings so I should not need to specify any folders.

What am I doing wrong?

Jason

1 ACCEPTED SOLUTION

Sorry to say but your single quote are not the same .
Could please copy exact line i have mentioned above to use

View solution in original post

20 REPLIES 20

intouch_vivek
Graph Steward

Store your csv at the imports folder. Say your file name is File.csv and in the file you have one column name Employee ID and you want to create a node with Label as Person and its property as Employee ID Number.

LOAD CSV with headers FROM 'file:///File.csv' AS F1
Create (pe:Person {Employee ID Number : F1.Employee ID})

The file is in the Import folder.
Just to be sure, this is what I did.

In the Neo4j Desktop, I clicked "Open folder".
The folder opened.
I dragged the file into the Import folder.

Is there something I'm missing?
Jason

intouch_vivek
Graph Steward

No import is a folder in at the place you have installed your Neo4j Desktop For an example below

F:\Neo4j\ApplicationData\neo4jDatabases\database-*\installation-\import

Yes, that appears to be true.
The Import folder that I'm looking at is in the "installation-3.5.12" folder and is accompanied by folders; bin, certificates, confidential, data, lib, logs, metrics, plugins and run.

I'm using the Mac version of Neo4j Desktop.

Yes you are at the place.

Ok, any thoughts on why I'm getting errors?

Jason

  1. Have you copied the file at the location you have mentioned?
  2. Have you used the cypher query as I had mentioned above?

If yes for both the question then please send the error message

Here is the error (I simplified the file name);
Variable file:///3511.csv not defined (line 1, column 28 (offset: 27))
"LOAD CSV WITH HEADERS FROM file:///3511.csv AS LINE"

I am using back ticks in the code.

please use single quotes
use
LOAD CSV WITH HEADERS FROM 'file:///3511.csv' AS LINE
instead of
"LOAD CSV WITH HEADERS FROM file:///3511.csv AS LINE"

Ok tried single quotes and got an error;

Invalid input '‘': expected whitespace, comment or an expression (line 1, column 28 (offset: 27))
"LOAD CSV WITH HEADERS FROM ‘file:///3511.csv’ AS LINE"

Sorry to say but your single quote are not the same .
Could please copy exact line i have mentioned above to use

That WORKED !!

I know what's happening now. I'm copying code from Word.
I need to be editing this code in an ASCI editor.
I knew it had to be something ridiculously simple!!!

Jason

intouch_vivek
Graph Steward

Great!!

Welcome to Neo4j Community

I may have some more noob questions 🙂
Lets see if I can get my CSV data imported!
Jason

Sure..
Kindly vote it if you are feel good with the solutions

OK. Here's what I'm trying to do.
I have a log file that I have cleaned in Excel and exported to a csv.
I want to CREATE a Person node and fill-in a Name property and a T_ID property.
I also want to create a Relationship called "Connected_To" and create another node called "Course" and fill-in a Name property for it.
The CSV file has 4 columns with headers named; Time, Name, Event, IP_Addr.

Here is my error:
Variable line not defined (line 2, column 25 (offset: 83))
"CREATE (a:Person {Name: line.Name, T_ID:T00} )-[r:Connected_To]->(b:Course {Name: 3511})"

use LINE as the variable name with case sensitive

Data loaded !!!
Awesome Thanks !!!

Jason

I have some logic errors I'm hoping you can help me with.

  1. The "Name" property for ALL of the Person nodes contains "line.Name" instead of the data from the CSV column. It appears it made it a string.

  2. I wanted ONE Person node related to ONE Course node and ALL rows in CSV file (Time, Event and IP_ADDR nodes) related to the ONE Course node

Here is the complete code I'm using:

LOAD CSV WITH HEADERS FROM 'file:///3511.csv' AS LINE
CREATE (a:Person {Name: 'line.Name', T_ID: 'T025'} )-[r:Connected_To]->(b:Course {Name: '3511'})
CREATE (t:Time {DateTime: 'line.Time'})<-[m:Moodle_HIT]-(b)
CREATE (ip:IP_ADDR {IP: 'line.IP_ADDR'})<-[c:Connect_Via]-(t)
CREATE (e:Event {Type: 'line.Event'})<-[p:Performed]-(t)

Hey @BuilderGuy1,

You need to make some small changes (highlighted in bold)

LOAD CSV WITH HEADERS FROM 'file:///3511.csv' AS line
CREATE (a:Person {Name: line.Name, T_ID: 'T025'} )-[r:Connected_To]->(b:Course {Name: '3511'})
CREATE (t:Time {DateTime: line.Time})<-[m:Moodle_HIT]-(b)
CREATE (ip:IP_ADDR {IP: line.IP_ADDR})<-[c:Connect_Via]-(t)
CREATE (e:Event {Type: line.Event})<-[p:Performed]-(t)

Note: When you are loading a CSV file, each "line" in the file becomes the variable line.
So when you want to reference any column header you are now using line.. Oh, and the column headers are case sensitive.

When I load a CSV file, I load the nodes first, and then I go back and load the relationships. It makes it easier to manage for me, but your mileage may vary. You may also consider using MERGE instead of CREATE if there are updates that you are making instead of creating a node each time.
It would look something like this:

// CREATE or UPDATE NODES
// Upsert Person Nodes
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file///3511.csv' AS line
WHERE line.Name IS NOT NULL
MERGE (p:Person {Name: line.Name})
ON CREATE SET
T_ID = 'T025'
;

// Upsert Time Nodes
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file///3511.csv' AS line
WHERE line.Time IS NOT NULL
MERGE (t:Time {DateTime: line.Time})
;

// Upsert IP Address nodes
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file///3511.csv' AS line
WHERE line.IP_ADDR IS NOT NULL
MERGE (ip:IP_ADDR {IP: line.IP_ADDR})
;

// Upsert Event nodes
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file///3511.csv' AS line
WHERE line.Event IS NOT NULL
MERGE (e:Event {Type: line.Event})
;

// CREATE or UPDATE Relationships
// Upsert Person Nodes
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file///3511.csv' AS line
WHERE line.Name IS NOT NULL
AND line.Time IS NOT NULL
AND line.IP_ADDR IS NOT NULL
AND line.Event IS NOT NULL
MATCH (p:Person {Name: line.Name})
MATCH (c:Course {b:Course {Name: '3511'})
MATCH (t:Time {DateTime: line.Time})
MATCH (ip:IP_ADDR {IP: line.IP_ADDR})
MATCH (e:Event {Type: line.Event})
WITH p, c, t, ip, e
MERGE (p)-[:Connected_To]->(c)
MERGE (c)-[:Moodle_HIT]->(t)
MERGE (ip)-[:Connect_Via]->(t)
MERGE (e)-[:Performed]->(t)
;

Finally, do you have any constraints defined for your nodes to ensure uniqueness, as well as creating indices. It will make the loading much faster.

Cheers and welcome,

yyyguy