11-29-2022 09:41 AM - edited 11-29-2022 09:42 AM
This is about importing a CSV file into Neo4j using LOAD CSV. Suppose my CSV file has the following format:
Id, OID, name, address, Parents , Children
1, mid1, ratta, hello@aa, ["mid250","mid251","mid253"], ["mid60","mid65"]
2, mid2, butta, ado@bb, ["mid350","mid365","mid320", "mid450","mid700"], ["mid20","mid25","mid30"]
3, mid3, natta, hkk@aa, ["mid50","mid311","mid543"], []
So the Parents and Children columns basically consist of mids. While importing the CSV into Neo4j using LOAD CSV, I want to create the following nodes and relationships:
Nodes for each row (one for each Id value in the CSV).
[:PARENT] relationships, by matching the OID property of each row against the OIDs inside the Parents column. As an example, when processing the first row there should be four nodes (mid1, mid250, mid251 and mid253) and three PARENT relationships between mid1 and the other three nodes.
[:CHILD] relationships, by matching the OID property of each row against the OIDs inside the Children column.
Please help!!
I tried doing it with FOREACH, but the results didn't come out correctly. I'm doing it through a Python script and just need to edit the Cypher query. The problem is that when creating nodes the OID property comes back as mid1, whereas when creating relationships the OID property comes out like this: ['mid250']. So creating the relationships creates another, duplicate node.
def create_AAA(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (e:AAA {id: row._id,OID: row.OID,address: row.address,name: row.name})"
    )

def create_parent(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (a:AAA {OID: row.OID}) FOREACH (t in row.parents | MERGE (e:AAA {OID:t}) MERGE (a)-[:PARENT]->(e) )"
    )

def create_child(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (a:AAA {OID: row.OID}) FOREACH (t in row.children | MERGE (e:AAA {OID:t}) MERGE (a)-[:CHILD]->(e) )"
    )

with driver.session() as session:
    session.write_transaction(create_AAA)
    session.write_transaction(create_parent)
    session.write_transaction(create_child)
11-29-2022 11:26 AM
You basically want to convert a string like '["mid250","mid251"]' into individual strings without quotes. Have a look at this to get an idea:
with '["mid250","mid251","mid253"]' as str
with substring(str, 1, size(str)-2 ) as strings
unwind split (strings, ",") as parent
return substring(parent, 1, size(parent)-2);
line 2 removes the brackets
line 3 splits into individual quoted strings
line 4 removes the quotes
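For comparison, the same three steps can be mirrored in plain Python. This is only an illustrative sketch, not part of the Cypher solution: the helper name `parse_id_list` is made up here, and the whitespace stripping and empty-list handling are assumptions added because the sample CSV has a space after one of the commas and one empty `[]` cell.

```python
def parse_id_list(raw: str) -> list:
    """Turn a string such as '["mid250","mid251"]' into ['mid250', 'mid251'].

    Hypothetical helper, for illustration only.
    """
    # Step 1: remove the surrounding brackets
    # (Cypher: substring(str, 1, size(str)-2)).
    inner = raw.strip()[1:-1]
    # An empty list ('[]') leaves nothing to split (added assumption).
    if not inner.strip():
        return []
    # Step 2: split into individual quoted strings (Cypher: split(strings, ",")).
    quoted = inner.split(",")
    # Step 3: remove the quotes; stripping whitespace first is an added assumption.
    return [item.strip().strip('"') for item in quoted]


print(parse_id_list('["mid250","mid251","mid253"]'))
# prints ['mid250', 'mid251', 'mid253']
```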
11-29-2022 11:55 AM
Thank you very much for the quick reply. I was trying to implement your code inside the create_parent function, but was not successful. Can you please share how your answer can be implemented within the create_parent function?
11-29-2022 12:26 PM
There are quite a few issues with your code and data (e.g. spaces in the CSV file, _id vs Id, etc.).
With this data:
Id,OID,name,address,Parents,Children
1,mid1,ratta,hello@aa,"[mid2,mid251,mid253]","[mid60,mid65]"
2,mid2,butta,ado@bb,"[mid350,mid365,mid320]","[mid20,mid25,mid30]"
This code seems to do what you want:
LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row
with row, substring(row.Parents, 1, size(row.Parents)-2 ) as parents
unwind split(parents, ",") as parent
merge (t:AAA {OID:parent})-[:PARENT]->(:AAA {OID: row.OID});
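A minimal sketch of how a query along these lines might be placed inside the `create_parent` function from the original question. Several things here are assumptions not confirmed in the thread: the CSV has been cleaned as in the sample data in this reply, the relationship points from the row's node to its parent (as in the question's code), blank entries from an empty `[]` cell should be skipped, and the three separate `MERGE` clauses are a swapped-in variant meant to avoid creating duplicate nodes. It has not been run against a live database.

```python
# The Cypher lives in one triple-quoted string so that tx.run() receives
# a single query argument.
CREATE_PARENT_QUERY = """
LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row
WITH row, substring(row.Parents, 1, size(row.Parents) - 2) AS parents
UNWIND split(parents, ',') AS parent
WITH row, trim(parent) AS parent_oid
WHERE parent_oid <> ''
MERGE (a:AAA {OID: row.OID})
MERGE (p:AAA {OID: parent_oid})
MERGE (a)-[:PARENT]->(p)
"""


def create_parent(tx):
    # tx is the transaction object the driver passes in,
    # exactly as in the question's code.
    tx.run(CREATE_PARENT_QUERY)
```

It would be called the same way as in the question, i.e. `session.write_transaction(create_parent)`. Merging each node on its OID alone before merging the relationship is the design choice here: a `MERGE` on a whole pattern creates the entire pattern, nodes included, whenever the full pattern is not already present.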
11-29-2022 01:01 PM
Thank you very much for your time. There is an issue: it shows the following error. How can we solve this? Thank you again.
ValueError: dictionary update sequence element #0 has length 1; 2 is required
11-29-2022 01:05 PM
That error is coming from Python. Have you tried looking at the stack trace to see what's going on? When I'm doing something like this, I always start small - e.g. develop/test the code in cypher-shell or Browser. Once I have my cypher code, then move on to getting it working in Python.
11-30-2022 10:45 AM
Thank you very much. Yeah, it's a Python error. I'll try to figure that out.
11-30-2022 12:08 PM
I don't see anything in your Python code above that would obviously generate this kind of error.
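For reference, one way this exact message can arise in plain Python is when something tries to build a `dict` from a string. Whether that is what happened here cannot be told from the thread. A plausible but unconfirmed scenario is a query split into several comma-separated string literals, so that the second piece ends up where a parameter mapping is expected. The sketch below only reproduces the message itself; `query_fragment` is a made-up value.

```python
# Hypothetical value: a piece of query text that ends up where a mapping
# is expected.
query_fragment = "with row, substring(row.Parents, 1, size(row.Parents)-2) as parents"

try:
    # dict() iterates over the string and expects every element to be a
    # (key, value) pair, but each element is a single character.
    dict(query_fragment)
except ValueError as exc:
    message = str(exc)
    print(message)
    # prints: dictionary update sequence element #0 has length 1; 2 is required
```

If the stack trace ends in a `dict(...)` call like this, checking what is being passed as the second argument of `tx.run(...)` would be a reasonable first step.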