Importing a csv file with a column that has a string with csv values

I have a CSV file that contains a userid, an email, names and finally a contacted field.

The contacted field lists the other users that the user has contacted. When there are several, it holds a list in which the contacted user ids are separated by ','.

For instance, I have the entry:
1,noaddress@fake.com,fakename,"4,5,6,7"
which means the user with userid 1 contacted the users with ids 4, 5, 6 and 7.

How can I import this into Neo4j while creating a 'contacted' relationship between 1 and each of 4, 5, 6 and 7?

I have very little experience with Neo4j and Cypher, so I have been struggling with this quite a bit and am not finding the documentation particularly helpful.

Any help is appreciated.

1 ACCEPTED SOLUTION

ameyasoft
Graph Maven
Assuming a csv file like this:

ID,email,type,contacted
1,noaddress@fake.com,fakename,"2,3,4"
2,noaddress2@fake.com,fakename,"3,4"
3,noaddress3@fake.com,fakename,"1,4"
4,noaddress4@fake.com,fakename,"1,2,3"

Step 1: Use LOAD CSV to create all users

MERGE (a:User {id: toInteger(row.ID), email: row.email, type: row.type, contacted: row.contacted})

Step 2: Run LOAD CSV to create the relationships

MATCH (a:User) where a.id = toInteger(row.ID)
with split(a.contacted, ",") as s1, a
unwind s1 as s11
match (b:User) where b.id in toInteger(s11)
merge (a)-[:CONTACTED]->(b)

Hi, I am getting a "Type mismatch: expected List but was Integer" error on

match (b:User) where b.id in toInteger(s11)

My input is exactly as you assumed: headers at the top with the field names, and a contacted field that is one long string with each contacted user separated by a ",".
Any ideas on how I can resolve this issue?

Try this:

1.
LOAD CSV WITH HEADERS FROM "file:/w1.csv" AS row
merge (a:User {id: toInteger(row.ID), email: row.email, type: row.type, contacted: row.contacted})

2.
match (a:User) where a.id = 1
with a.contacted as s1, a
with split(s1, ",") as s11, a
unwind s11 as s12
with collect(toInteger(s12)) as s13, a
match (b:User) where b.id in s13
merge (a)-[:CONTACTED]->(b) 
return a, b
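
For context on the type mismatch reported above: IN expects a list on its right-hand side, while toInteger(s11) returns a single integer. Collecting the converted ids into a list, as the query above does, is one way around it. A shorter variant compares each id with = instead. This is only a sketch, assuming the User nodes and their contacted property have already been loaded by the first query; the variable name contactedId is just illustrative.

```cypher
// Sketch only: assumes User nodes with an integer `id` and a
// comma-separated `contacted` string property already exist.
MATCH (a:User) WHERE a.id = 1
// split() returns a list of strings; UNWIND emits one row per element
UNWIND split(a.contacted, ",") AS contactedId
// `=` compares against a single value; `IN` would require a list
MATCH (b:User) WHERE b.id = toInteger(contactedId)
MERGE (a)-[:CONTACTED]->(b)
RETURN a, b
```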

Hi, the new answer works perfectly, but it only imports the first user's contacted list (because of the "where a.id = 1" part). How would I go about adding the relationships for every user? Would just removing that part be sufficient? I tried it but stopped it after about 10 minutes of running. I might have been too impatient, since it is a massive list with around 5k users and on average 150 contacted users per user.

Sorry, the code I sent was only meant to show how to solve your problem.
Try this:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/w1.csv" AS row

match (a:User) where a.id = toInteger(row.ID)
with a.contacted as s1, a
with split(s1, ",") as s11, a
unwind s11 as s12
with collect(toInteger(s12)) as s13, a
match (b:User) where b.id in s13
merge (a)-[:CONTACTED]->(b) 
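
On the running time mentioned earlier: with roughly 5k users and about 150 contacts each, the import performs several hundred thousand lookups on User.id. Without an index on that property, each lookup probably has to scan all User nodes, so creating an index before running the import is likely to help. A minimal sketch follows, assuming a Neo4j version that still accepts USING PERIODIC COMMIT. The exact index syntax depends on the version, and the index name user_id in the comment is made up for illustration.

```cypher
// Older syntax (Neo4j 3.x; deprecated in 4.x). Run as its own statement
// before the import so the MATCH on a.id / b.id can use the index.
CREATE INDEX ON :User(id)

// Newer syntax (Neo4j 4.x and later); `user_id` is a hypothetical name:
// CREATE INDEX user_id FOR (u:User) ON (u.id)
```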
