cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Problems Importing CSV

john6
Node Link

I am trying to import my first table into Neo4j.

A very simple CSV file (sample):

SpecialtyID,Specialty,SubSpecialty
7001,Accounting
7002,Administrative Services
7003,Adult Education
7025,Consulting,Building and Construction

// Create Specialties
LOAD CSV WITH HEADERS FROM 'https://github.com/johnmillstead/pmg_connect/blob/f90605e306d54ee85c7eceba70f4fc093c8edac4/specialties.csv' AS row
MERGE (specialty:Specialty {specialtyID: row.SpecialtyID})
    ON CREATE SET specialty.specialty = row.Specialty, specialty.subSpecialty = row.SubSpecialty;

results in: Cannot merge the following node because of null property value for 'specialtyID': (:Specialty {specialtyID: null})

1 ACCEPTED SOLUTION

@john6

something is amiss for if I run

LOAD CSV WITH HEADERS FROM 'https://github.com/johnmillstead/pmg_connect/blob/fbec96e04c8209cb922d5b3cd31c1fd5b792acb5/specialties.csv' AS row return row;

which should do nothing more than read the csv from github and return each row the results I get are

╒══════════════════════════════════════════════════════════════════════╕
│"row"                                                                 │
╞══════════════════════════════════════════════════════════════════════╡
│{"<!DOCTYPE html>":"<html lang="en" data-color-mode="auto" data-light-│
│theme="light" data-dark-theme="dark">"}                               │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <head>"}                                        │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"    <meta charset="utf-8">"}                      │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <link rel="dns-prefetch" href="https://github.gi│
│thubassets.com">"}                                                    │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <link rel="dns-prefetch" href="https://avatars.g│
│ithubusercontent.com">"}                                              │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <link rel="dns-prefetch" href="https://github-cl│
│oud.s3.amazonaws.com">"}                                              │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <link rel="dns-prefetch" href="https://user-imag│
│es.githubusercontent.com/">"}                                         │
....
......
.........

which doesnt appear at all like the csv at pmg_connect/specialties.csv at f90605e306d54ee85c7eceba70f4fc093c8edac4 · johnmillstead/pmg_connect ....

If you run the same LOAD CSV do yo get similar results?
I think the issue is you have the wrong URL for if you run

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/johnmillstead/pmg_connect/f90605e306d54ee85c7eceba70f4fc093c8edac4/specialties.csv' AS row return row.SpecialtyID, row.Specialty, row.SubSpecialty;

note the different URL, namely https://raw......... then now I get more expected results and as

╒═════════════════╤════════════════════════════════════════════╤═════════════════════════════════════╕
│"row.SpecialtyID"│"row.Specialty"                             │"row.SubSpecialty"                   │
╞═════════════════╪════════════════════════════════════════════╪═════════════════════════════════════╡
│"7001"           │"Accounting"                                │null                                 │
├─────────────────┼────────────────────────────────────────────┼─────────────────────────────────────┤
│"7002"           │"Administrative Services"                   │null                                 │
├─────────────────┼────────────────────────────────────────────┼─────────────────────────────────────┤
│"7003"           │"Adult Education"                           │null                                 │
├─────────────────┼────────────────────────────────────────────┼─────────────────────────────────────┤
│"7004"           │"Appropriate Technologies"                  │null                                 │
├─────────────────┼────────────────────────────────────────────┼─────────────────────────────────────┤
│"7005"           │"Art"                                       │null                           
....
.......
...........

and to which running

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/johnmillstead/pmg_connect/f90605e306d54ee85c7eceba70f4fc093c8edac4/specialties.csv' AS row MERGE (specialty:Specialty {specialtyID: row.SpecialtyID})
    ON CREATE SET specialty.specialty = row.Specialty, specialty.subSpecialty = row.SubSpecialty;

results in

Added 149 labels, created 149 nodes, set 324 properties, completed after 136 ms.

View solution in original post

5 REPLIES 5

ameyasoft
Graph Maven
Try this:

MERGE (specialty:Specialty {specialtyID: toInteger(row.SpecialtyID)})
    ON CREATE SET specialty.specialty = COALESCE(row.Specialty, 'NA'), specialty.subSpecialty = COALESCE(row.SubSpecialty, 'NA');

john6
Node Link

Thanks. But that didn't work for some reason. So I simplified the dataset to make things easier. Here is what I am working with:

SpecialtyID,Specialty
7001,Accounting
7005,Art
7012,Blogging
7018,Coaching
7041,Editing
7042,Encouragement
7043,Engineering
7044,ESL
7045,Evaluation

I am using this cyper:

// Create Specialties
LOAD CSV WITH HEADERS FROM 'https://github.com/johnmillstead/pmg_connect/blob/fbec96e04c8209cb922d5b3cd31c1fd5b792acb5/specialties.csv' AS row
MERGE (specialty:Specialty {specialtyID: toInteger(row.SpecialtyID)})
    ON CREATE SET specialty.specialty = row.Specialty;

I have tried it with the coalesce and without. I have tried it with the toInteger and without.

I so desperately want to win at this game and finish the very simple project I am tasked with. Thanks for any help you can provide.

John

Hi John

In case the csv format is likely to change over a period of time then it is better to encode them as procedures and include them as libraries on your own installation.

Many thanks
Mr Sameer Sudhir G

@john6

something is amiss for if I run

LOAD CSV WITH HEADERS FROM 'https://github.com/johnmillstead/pmg_connect/blob/fbec96e04c8209cb922d5b3cd31c1fd5b792acb5/specialties.csv' AS row return row;

which should do nothing more than read the csv from github and return each row the results I get are

╒══════════════════════════════════════════════════════════════════════╕
│"row"                                                                 │
╞══════════════════════════════════════════════════════════════════════╡
│{"<!DOCTYPE html>":"<html lang="en" data-color-mode="auto" data-light-│
│theme="light" data-dark-theme="dark">"}                               │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <head>"}                                        │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"    <meta charset="utf-8">"}                      │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <link rel="dns-prefetch" href="https://github.gi│
│thubassets.com">"}                                                    │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <link rel="dns-prefetch" href="https://avatars.g│
│ithubusercontent.com">"}                                              │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <link rel="dns-prefetch" href="https://github-cl│
│oud.s3.amazonaws.com">"}                                              │
├──────────────────────────────────────────────────────────────────────┤
│{"<!DOCTYPE html>":"  <link rel="dns-prefetch" href="https://user-imag│
│es.githubusercontent.com/">"}                                         │
....
......
.........

which doesnt appear at all like the csv at pmg_connect/specialties.csv at f90605e306d54ee85c7eceba70f4fc093c8edac4 · johnmillstead/pmg_connect ....

If you run the same LOAD CSV do yo get similar results?
I think the issue is you have the wrong URL for if you run

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/johnmillstead/pmg_connect/f90605e306d54ee85c7eceba70f4fc093c8edac4/specialties.csv' AS row return row.SpecialtyID, row.Specialty, row.SubSpecialty;

note the different URL, namely https://raw......... then now I get more expected results and as

╒═════════════════╤════════════════════════════════════════════╤═════════════════════════════════════╕
│"row.SpecialtyID"│"row.Specialty"                             │"row.SubSpecialty"                   │
╞═════════════════╪════════════════════════════════════════════╪═════════════════════════════════════╡
│"7001"           │"Accounting"                                │null                                 │
├─────────────────┼────────────────────────────────────────────┼─────────────────────────────────────┤
│"7002"           │"Administrative Services"                   │null                                 │
├─────────────────┼────────────────────────────────────────────┼─────────────────────────────────────┤
│"7003"           │"Adult Education"                           │null                                 │
├─────────────────┼────────────────────────────────────────────┼─────────────────────────────────────┤
│"7004"           │"Appropriate Technologies"                  │null                                 │
├─────────────────┼────────────────────────────────────────────┼─────────────────────────────────────┤
│"7005"           │"Art"                                       │null                           
....
.......
...........

and to which running

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/johnmillstead/pmg_connect/f90605e306d54ee85c7eceba70f4fc093c8edac4/specialties.csv' AS row MERGE (specialty:Specialty {specialtyID: row.SpecialtyID})
    ON CREATE SET specialty.specialty = row.Specialty, specialty.subSpecialty = row.SubSpecialty;

results in

Added 149 labels, created 149 nodes, set 324 properties, completed after 136 ms.

Thank you so much! I appreciate it. I learned something valuable.