07-09-2020 08:53 AM
Hello All,
I am using Neo4j Desktop to access a browser-based Neo4j instance.
While loading data into the database with "LOAD CSV", I hit an error on a row where one column has the value "l cells, eosinophils and macrophages in C57BL/6 mice, TCR-δ−"
I am getting the error below:
@ position 244043317 - there's a field starting with a quote and whereas it ends that quote there seems to be characters in that field after that ending quote. That isn't supported. This is what I read: 'l cells, eosinophils and macrophages in C57BL/6 mice, TCR-δ−",2,97,20050008627-3316T
1503204,20050008627,CELLTYP,CL0000235,macrophage,out3289.xml,18395,17142,macrophages,"ll'
This is the Cypher query I used:
:auto USING PERIODIC COMMIT 100000
LOAD CSV WITH HEADERS FROM 'file:///hits_termite.csv' AS line
CREATE (d:Hit {
  id: line.id,
  hitTerm: line.hitTerm,
  loc: line.loc,
  frag: line.frag,
  hitID: line.hitID,
  docID: line.docID,
  entityType: line.entityType,
  patent_no: line.doc_number,
  sentenceNbr: line.sentenceNbr,
  loc_custom: line.loc_custom,
  name: line.name,
  nonambigsyns: line.nonambigsyns
})
I do not understand what the issue is or how to overcome it.
Could you please help?
07-09-2020 10:02 AM
Hello @i.varikuti and welcome to the Community!
From what you are describing, your data is not "clean".
You should inspect it to see where the quotes are being misunderstood. For example, you may need to escape some characters in the field.
Rather than creating nodes from the data initially, I would recommend that you simply return the suspect field values so you can see them. That is the easiest way to see if the data is clean.
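Because the error message reports a position in the file, another way to look at the suspect data is to print the raw text around that position before involving LOAD CSV at all. Here is a minimal Python sketch. The function name show_context, the radius parameter, and the file path in the usage line are made up for illustration, and it assumes the reported position is at least roughly a byte offset into the file.

```python
def show_context(path, position, radius=200):
    """Return the text in a window of 2 * radius bytes around a byte offset."""
    # Assumption: `position` is a byte offset, as in "@ position 244043317".
    start = max(0, position - radius)
    with open(path, "rb") as f:
        f.seek(start)
        chunk = f.read(2 * radius)
    # errors="replace" avoids a crash if the window cuts a multi-byte
    # character (such as δ) in half.
    return chunk.decode("utf-8", errors="replace")


# Hypothetical usage; adjust the path to where the CSV file actually lives:
# print(show_context("hits_termite.csv", 244043317))
```

Looking at the raw text this way shows exactly which characters sit next to the closing quote of the field.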
Best regards,
Elaine Rosenberg
07-09-2020 10:26 AM
Hello @elaine.rosenberg,
Thanks for your warm welcome and for your insights on the thread.
As you may already know, I am new to Neo4j. After analyzing the data, I found that the issue is caused by a backslash character at the end of the string.
Could you please let me know how to escape the backslash, and whether it causes a problem when a sentence ends with this character?
In addition, it would be really useful if you could point me to some information (such as blogs or documentation) on how to clean data before loading it into the database.
Regards,
Indrakaran Varikuti
07-09-2020 12:10 PM
Use single quotes around strings, and use a backslash (\) to escape the backslash character.
For example: dlfkgjsdlfgjsddlfg\\
I recommend that you look at the lesson we have on using LOAD CSV in our online course, Introduction to Neo4j 4.0 at https://neo4j.com/graphacademy/online-training
This lesson methodically goes through some steps you can take to examine and clean up/transform data.
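If the file is too large to fix by hand, the backslash escaping described above can be applied to a copy of the file before loading it. What follows is only a sketch: the function names and the output file name are hypothetical, it assumes the file is UTF-8, and it assumes that no backslash in the file is already an intentional escape (doubling those would change their meaning).

```python
def escape_backslashes(text):
    """Double every backslash so that it is read as a literal backslash."""
    return text.replace("\\", "\\\\")


def clean_file(src_path, dst_path):
    """Write a copy of src_path to dst_path with every backslash doubled."""
    # newline="" keeps the original line endings unchanged on read and write.
    with open(src_path, encoding="utf-8", newline="") as src:
        with open(dst_path, "w", encoding="utf-8", newline="") as dst:
            for line in src:
                dst.write(escape_backslashes(line))


# Hypothetical usage:
# clean_file("hits_termite.csv", "hits_termite_clean.csv")
```

Writing to a new file rather than overwriting the original makes it easy to compare the two and to start over if the assumption about existing escapes turns out to be wrong.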
Elaine
07-09-2020 01:13 PM
07-09-2020 04:13 PM
The first column's data seems to be missing its opening double quote, and the last column is also in error. I created a .csv file from your data line after making the necessary changes and ran it on my local database (version 4.1.0). Here is the result:
csv file:
c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13
"l cells, eosinophils and macrophages in C57BL/6 mice, TCR-δ−",2,97,20050008627-3316T1503204,20050008627,CELLTYP,CL0000235,macrophage,out3289.xml,18395,17142,macrophages,ll
Cypher query:
LOAD CSV WITH HEADERS FROM 'file:///iv.csv' as line
return line
Result:
{
"c11": "17142",
"c10": "18395",
"c13": "ll",
"c12": "macrophages",
"c1": "l cells, eosinophils and macrophages in C57BL/6 mice, TCR-δ−",
"c2": "2",
"c3": "97",
"c4": "20050008627-3316T1503204",
"c5": "20050008627",
"c6": "CELLTYP",
"c7": "CL0000235",
"c8": "macrophage",
"c9": "out3289.xml"
}
Hope this works for you.
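As a cross-check outside Neo4j, the corrected data line can also be parsed with Python's standard csv module to confirm that it splits into 13 fields, matching the 13 headers. This is only a sketch: Python's csv reader does not treat the backslash as an escape character by default, so it confirms the quoting structure of the line, not how LOAD CSV handles backslashes.

```python
import csv
import io

# The corrected data line from the CSV file above.
row = (
    '"l cells, eosinophils and macrophages in C57BL/6 mice, TCR-δ−",'
    "2,97,20050008627-3316T1503204,20050008627,CELLTYP,CL0000235,"
    "macrophage,out3289.xml,18395,17142,macrophages,ll\n"
)

# csv.reader accepts any iterable of lines; StringIO stands in for a file.
fields = next(csv.reader(io.StringIO(row)))

print(len(fields))  # number of columns parsed from the line
print(fields[0])    # the quoted first column, commas included
```

The commas inside the quoted first column stay within one field, which is why the opening and closing double quotes matter.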