02-06-2021 11:14 AM
Hello, I am using Neo4j version 4.2.1. I am trying to load some data from a csv file into the target Neo4j database, and I am wondering how to aggregate / massage the data below before I load it. Below is a sample of the csv file format and the requirements for how it needs to be done:
csv.file
A B C D E F
'M10344' 24 14 2 'Date' 12
Below is the Cypher I was using for the non-aggregation columns. Can you customize it for the above requirements?
CALL apoc.periodic.iterate(
  "CALL apoc.load.csv('csv.file.csv') YIELD map WITH map WHERE map.A <> '' RETURN map",
  "MERGE (Node1:Node1 {A: map.A}) ON CREATE SET Node1.B = map.B, .......",
  {batchSize: 100, parallel: false})
Thank you
02-06-2021 12:12 PM
USING PERIODIC COMMIT 100
LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS row
CREATE (n:Node)
SET n.x = substring(row.A,1)
SET n.y = toInteger(row.B) + toInteger(row.C) - toInteger(row.F)
SET n.z = toInteger(row.B) + toInteger(row.C) * toInteger(row.D)
SET n.timeStamp = timeStamp()
Here is a simpler option. APOC is really only the way to go if you have a huge amount of data or more complex operations to perform. A lot more options exist for importing data, but this is a good start.
NOTE: file.csv must be in the import folder of your database, or reachable via an http link, unless you have configured otherwise.
Limiting the commit size to 100 might make the creation process slower for no benefit; 1000 is the default recommended value.
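If you would rather keep the apoc.periodic.iterate form from your first post, the same expressions should drop straight into the update statement. A rough sketch, assuming the same file, column names and formulas as above:
CALL apoc.periodic.iterate(
  "CALL apoc.load.csv('csv.file.csv') YIELD map WITH map WHERE map.A <> '' RETURN map",
  "// same aggregation as the LOAD CSV version, batched through APOC
   MERGE (n:Node1 {A: map.A})
   ON CREATE SET n.x = substring(map.A, 1),
                 n.y = toInteger(map.B) + toInteger(map.C) - toInteger(map.F),
                 n.z = toInteger(map.B) + toInteger(map.C) * toInteger(map.D),
                 n.timeStamp = timestamp()",
  {batchSize: 1000, parallel: false})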
02-07-2021 09:53 AM
Thank you Gabriel. Below is what I am doing, but it does not load for some reason. Can you please tell me what I may be doing wrong?
CALL apoc.periodic.iterate(
  "CALL apoc.load.csv('WORK_ORDER_MODATA.csv') YIELD map WITH map WHERE map.ORDER_NUMBER <> '' RETURN map",
  "MERGE (WOLine:WOLine {WOLineOriginationID: map.ORDER_NO})
   ON CREATE SET
     WOLine.DateTimeImport = datetime(),
     WOLine.WOLineCreatedate = map.RECORD_CREATED_DT,
     WOLine.WOLineUpdatedate = map.RECORD_CHANGED_DT,
     WOLine.WOLineID = map.SCHEDULED_ITEM_NO,
     WOLine.WOLineTypeID = NULL,
     WOLine.WOLineOriginationTyID = substring(map.ORDER_NO, 1),
     WOLine.WOLineDemandTypeID = NULL,
     WOLine.WOLineDemandID = map.JOB_NO,
     WOLine.WOLineOriginationConfigID = map.APC_CONFIG,
     WOLine.WOLineStatusCdID = NULL,
     WOLine.WOLineStatusCode = NULL,
     WOLine.WOLineRevReasonCd = NULL,
     WOLine.WOLineRevReasonCdID = NULL,
     WOLine.WOLineRevReasonDate = NULL,
     WOLine.WOLineOperationNumber = map.OPERATION_WHERE_USED,
     WOLine.WOLineCreatedate = map.SRC_UPDATED_DT,
     WOLine.WOLineDueDate = map.REQUIRED_DT,
     WOLine.WOLineIssuedate = map.LAST_ISSUE_DT,
     WOLine.WOLineClosedate = map.CLOSEOUT_DT,
     WOLine.WOLineRevDate = map.SRC_UPDATED_DT,
     WOLine.WOLineStatusUptdate = NULL,
     WOLine.WOLineDesc = NULL,
     WOLine.WOLineCurrencyID = NULL,
     WOLine.WOLineCostMaterialIssued = toInteger(map.UNIT_COST) * (toInteger(map.ISSUE_QTY_TOTAL_TO_DATE) + toInteger(map.COMPONENT_SCRAP)),
     WOLine.WOLineReqQty = map.TOTAL_QTY_REQUIRED,
     WOLine.WOLineIssuedQty = map.ISSUE_QTY_TOTAL_TO_DATE,
     WOLine.WOLineOutstdngQty = toInteger(map.TOTAL_QTY_REQUIRED) - toInteger(map.ISSUE_QTY_TOTAL_TO_DATE) + toInteger(map.COMPONENT_SCRAP),
     WOLine.WOLineScrpQty = map.COMPONENT_SCRAP",
  {batchSize: 100, parallel: false})
02-07-2021 02:14 PM
No worries, I figured it out. There was some corruption in the csv, so I switched to reading it via the list output and picking fields by their index position (list[0]). Thank you
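In case it helps anyone else, the position-based read looks roughly like this; the indexes shown are only an assumption about my column order:
CALL apoc.periodic.iterate(
  "CALL apoc.load.csv('WORK_ORDER_MODATA.csv') YIELD list WITH list WHERE list[0] <> '' RETURN list",
  "// list[0] = ORDER_NO, list[1] = JOB_NO in this sketch -- adjust to the real column order
   MERGE (WOLine:WOLine {WOLineOriginationID: list[0]})
   ON CREATE SET WOLine.DateTimeImport = datetime(),
                 WOLine.WOLineDemandID = list[1]",
  {batchSize: 100, parallel: false})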
02-08-2021 08:30 AM
Here's a CSV lint program that can check the validity of your data.
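If you want to eyeball what Neo4j actually parses out of the file before loading it, something along these lines should also work as a quick sanity check:
CALL apoc.load.csv('WORK_ORDER_MODATA.csv') YIELD lineNo, list, map
RETURN lineNo, list, map
LIMIT 5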
02-08-2021 06:35 PM
Thank you guys. Another quick question: if I want to map a non-existing column in my csv file to NULL, I do something like Node.TargetAttribute = NULL. In the same way, if I have to default all the values to '0', how do you recommend doing it? Thanks much in advance
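For example, inside the same update statement as before, I mean something along these lines (WOLineScrpQty / COMPONENT_SCRAP are only reused here as an illustration):
MERGE (WOLine:WOLine {WOLineOriginationID: map.ORDER_NO})
ON CREATE SET WOLine.WOLineTypeID = NULL,
              // default to 0 when the column is missing or empty -- is coalesce() the recommended way?
              WOLine.WOLineScrpQty = coalesce(toInteger(map.COMPONENT_SCRAP), 0)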
~
Nithin