Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
10-28-2020 02:41 PM
LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row
MERGE (order:Order {orderID: row.OrderID})
ON CREATE SET order.shipName = row.ShipName;
If the Order has a lot of columns, the CREATE SET statement would be very long. Importantly, each property name and value have to be specified explicitly. If there a way to do the SET all at once without explicitly specifying each one? I can do this by bolt driver without using LOAD CSV. I read the data into a dictionary and then know all the column names and do a loop with the generic form of:
ON CREATE SET $label.propertyName=$property_value
The driver way is slow, but very flexible. For the LOAD CSV, is there a way to load all columns in a similar way? If the only way to access the column names is by something like 'row.OrderID', probably this is not possible.
Solved! Go to Solution.
11-03-2020 05:44 AM
This would work, potentially creating Companies without an email address.
LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END }
If you want to avoid loading any part of the row if the email is blank. Then you would need to move the case above the merge.
LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
WITH row, CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END AS emailAddress
WHERE emailAddress IS NOT NULL
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: emailAddress }
10-28-2020 03:07 PM
I've asked myself this question in the past as well, but your description about treating the data as a dictionary made me realise that the row is really the same thing already (or perhaps technically a map). Either way, this should do what you want:
...
ON CREATE SET order += row
10-28-2020 03:40 PM
This looks cool, and I will try.
10-29-2020 06:09 AM
ON CREATE SET order += row
Is what we've always used and works well.
BONUS info so I'm not just repeating @terryfranklin82
If you want to alter or change some data you can override the values in an object map.
For example if you want all but one of the fields.
ON CREATE SET order += row { .*, columnToIgnore: null }
10-29-2020 12:37 PM
Thanks, it's a feature that I didn't know about, but could you provide a working example?
If I run something like:
WITH { a:1, b:2, c:3 } as row
MATCH (d:Destination{id:123})
SET d += row { .*, columnToIgnore:row.a}
RETURN d.a, d.b, d.c
all columns are still set:
d.a d.b d.c
1 2 3
10-30-2020 07:23 AM
You would need to do.
SET d+= row { .*, a:null }
a|b|c are your object keys for row, by setting it to null, it overrides the a from .* in the map. Neo4j doesn't write null value properties to the graph, so the a
property is ignored.
Guess I should add this to the list of blog post ideas.
10-30-2020 02:09 PM
Now I understand - I thought columnToIgnore
was a control word, but you were using it as a placeholder for "the columns you want to ignore".
Please share a link to your blog, I'd be interested to check it out
11-02-2020 05:37 AM
@terryfranklin82 yep, I get pretty short-hand sometimes and it's a known problem that it's not always clear to anyone else what I mean It's something I work on daily lol.
I don't actually have a blog yet, just a ton of post topics/ideas sitting in my notebook! I'll share it once I make one
11-02-2020 03:44 PM
@mdfrenchman, @terryfranklin82 A follow-up question:
If I want to change the empty string to avoid load, how to do it? I want to do it roughly like this example:
LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
MERGE (c:Company {companyId: row.Id})
SET c.emailAddress = CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END
So if I want to change empty string from any field, is that possible for the generic version of the:
SET d+= row { .*, a:null }
11-03-2020 05:44 AM
This would work, potentially creating Companies without an email address.
LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END }
If you want to avoid loading any part of the row if the email is blank. Then you would need to move the case above the merge.
LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
WITH row, CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END AS emailAddress
WHERE emailAddress IS NOT NULL
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: emailAddress }
11-05-2020 09:00 AM
@mdfrenchman, For this last solution, is there a generic solution to omit all node properties that are empty strings? Each node type has different properties, and it would be hard to explicitly check the null/empty of each property. Not only email, any property except ID value could be absent. Currently, my code create empty string property value like this when loading from csv:
"foreign_name": ""
This is my code:
CALL apoc.load.csv('coach.csv')
yield map as row
MERGE (m:Coach {ID: row.ID}) ON CREATE SET m += row ON MATCH SET m += row
RETURN count(m) as mcount
How to omit empty string property before this MERGE statement?
Given the example above:
SET c += row { .*, emailAddress: CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END }
I need a way to somehow create a new row which sets empty strings as null, not only for an explicit property, i.e. row.Email. I tried adding a WITH clause before the MERGE, but it reports an error:
WITH = "\nWITH [(key, value) in row WHERE value = '' | row.key=NULL] as row"
11-05-2020 10:05 AM
I will create a separate question for this follow-up related question.
All the sessions of the conference are now available online