Neo4j

lingvisa · ‎10-28-2020

LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row
MERGE (order:Order {orderID: row.OrderID})
  ON CREATE SET order.shipName = row.ShipName;

If the Order has a lot of columns, the CREATE SET statement would be very long. Importantly, each property name and value have to be specified explicitly. If there a way to do the SET all at once without explicitly specifying each one? I can do this by bolt driver without using LOAD CSV. I read the data into a dictionary and then know all the column names and do a loop with the generic form of:
ON CREATE SET $label.propertyName=$property_value

The driver way is slow, but very flexible. For the LOAD CSV, is there a way to load all columns in a similar way? If the only way to access the column names is by something like 'row.OrderID', probably this is not possible.

mdfrenchman · ‎11-03-2020

This would work, potentially creating Companies without an email address.

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END }

If you want to avoid loading any part of the row if the email is blank. Then you would need to move the case above the merge.

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
WITH row, CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END AS emailAddress
WHERE emailAddress IS NOT NULL
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: emailAddress }

View solution in original post

terryfranklin82 · ‎10-28-2020

I've asked myself this question in the past as well, but your description about treating the data as a dictionary made me realise that the row is really the same thing already (or perhaps technically a map). Either way, this should do what you want:

...
ON CREATE SET order += row

lingvisa · ‎10-28-2020

This looks cool, and I will try.

mdfrenchman · ‎10-29-2020

ON CREATE SET order += row

Is what we've always used and works well.

BONUS info so I'm not just repeating @terryfranklin82

If you want to alter or change some data you can override the values in an object map.
For example if you want all but one of the fields.

ON CREATE SET order += row { .*, columnToIgnore: null }

terryfranklin82 · ‎10-29-2020

Thanks, it's a feature that I didn't know about, but could you provide a working example?

If I run something like:

WITH { a:1, b:2, c:3 } as row
MATCH (d:Destination{id:123})
SET d += row { .*, columnToIgnore:row.a}
RETURN d.a, d.b, d.c

all columns are still set:

d.a	d.b	d.c
1	2	3

mdfrenchman · ‎10-30-2020

You would need to do.

SET d+= row { .*, a:null }

a|b|c are your object keys for row, by setting it to null, it overrides the a from .* in the map. Neo4j doesn't write null value properties to the graph, so the a property is ignored.

Guess I should add this to the list of blog post ideas.

terryfranklin82 · ‎10-30-2020

Now I understand - I thought columnToIgnore was a control word, but you were using it as a placeholder for "the columns you want to ignore".

Please share a link to your blog, I'd be interested to check it out

mdfrenchman · ‎11-02-2020

@terryfranklin82 yep, I get pretty short-hand sometimes and it's a known problem that it's not always clear to anyone else what I mean It's something I work on daily lol.

I don't actually have a blog yet, just a ton of post topics/ideas sitting in my notebook! I'll share it once I make one

lingvisa · ‎11-02-2020

@mdfrenchman, @terryfranklin82 A follow-up question:

If I want to change the empty string to avoid load, how to do it? I want to do it roughly like this example:

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
MERGE (c:Company {companyId: row.Id})
SET c.emailAddress = CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END

So if I want to change empty string from any field, is that possible for the generic version of the:
SET d+= row { .*, a:null }

mdfrenchman · ‎11-03-2020

This would work, potentially creating Companies without an email address.

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END }

If you want to avoid loading any part of the row if the email is blank. Then you would need to move the case above the merge.

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
WITH row, CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END AS emailAddress
WHERE emailAddress IS NOT NULL
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: emailAddress }

lingvisa · ‎11-05-2020

@mdfrenchman, For this last solution, is there a generic solution to omit all node properties that are empty strings? Each node type has different properties, and it would be hard to explicitly check the null/empty of each property. Not only email, any property except ID value could be absent. Currently, my code create empty string property value like this when loading from csv:
"foreign_name": ""

This is my code:

CALL apoc.load.csv('coach.csv') 
yield map as row 
MERGE (m:Coach {ID: row.ID}) ON CREATE SET m += row ON MATCH SET m += row
RETURN count(m) as mcount

How to omit empty string property before this MERGE statement?

Given the example above:
SET c += row { .*, emailAddress: CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END }
I need a way to somehow create a new row which sets empty strings as null, not only for an explicit property, i.e. row.Email. I tried adding a WITH clause before the MERGE, but it reports an error:
WITH = "\nWITH [(key, value) in row WHERE value = '' | row.key=NULL] as row"

lingvisa · ‎11-05-2020

I will create a separate question for this follow-up related question.

Neo4j

Is there a way to load all columns without explicit specification?