Neo4j

matt8mb · ‎02-07-2022

Hello Neo4J community,

I am very new to Neo4j and am struggling with an error.
I am trying to import a CSV file which has NULL values.
When I enter the Cypher, I get this error: Cannot merge the following node because of null property value for 'Name':

How do I fix this?

Here is the Cypher:

LOAD CSV WITH HEADERS from 'file:///tree.csv' as row with row where row.company is not null
MERGE (b:country {Name:row.country})
MERGE (a:company {Name:row.company})
MERGE (c:plot_id {Name:row.plot_id})
MERGE (d:creation_date {Name:row.creation_date})
MERGE (e:tree_id {Name:row.tree_id})
MERGE (f:tree_height {Name:row.tree_height})
MERGE (g:crown_depth {Name:row.crown_depth})
MERGE (h:trunk_structure {Name:row.trunk_structure})
MERGE (i:single_trunk_circumference {Name:row.single_trunk_circumference})
MERGE (j:branch_number {Name:row.branch_number})
MERGE (k:multiple_trunk_diameter {Name:row.multiple_trunk_diameter})
MERGE (l:multiple_trunk_circumference {Name:row.multiple_trunk_circumference})
MERGE (a) -[:TO]-> (b)
MERGE (b) -[:TO]-> (c)
MERGE (c) -[:TO]-> (d)
MERGE (d) -[:TO]-> (e)
MERGE (e) -[:TO]-> (f)
MERGE (f) -[:TO]-> (g)
MERGE (g) -[:TO]-> (h)
MERGE (h) -[:TO]-> (i)
MERGE (j) -[:TO]-> (k)

Thank you!

matt8mb · ‎02-07-2022

Alternatively, I have tried doing this, which hasn't worked either.

LOAD CSV WITH HEADERS from 'file:///tree.csv' as row with row where row.single_trunk_circumference is not null AND where row.branch_number is not null AND where row.multiple_trunk_diameter is not null AND where row.multiple_trunk_circumference is not null

MERGE (b:country {Name:row.country})

MERGE (a:company {Name:row.company})

MERGE (c:plot_id {Name:row.plot_id})

MERGE (d:creation_date {Name:row.creation_date})

MERGE (e:tree_id {Name:row.tree_id})

MERGE (f:tree_height {Name:row.tree_height})

MERGE (g:crown_depth {Name:row.crown_depth})

MERGE (h:trunk_structure {Name:row.trunk_structure})

MERGE (i:single_trunk_circumference {Name:row.single_trunk_circumference})

MERGE (j:branch_number {Name:row.branch_number})

MERGE (k:multiple_trunk_diameter {Name:row.multiple_trunk_diameter})

MERGE (l:multiple_trunk_circumference {Name:row.multiple_trunk_circumference})

MERGE (a) -[:TO]-> (b)

MERGE (b) -[:TO]-> (c)

MERGE (c) -[:TO]-> (d)

MERGE (d) -[:TO]-> (e)

MERGE (e) -[:TO]-> (f)

MERGE (f) -[:TO]-> (g)

MERGE (g) -[:TO]-> (h)

MERGE (h) -[:TO]-> (i)

MERGE (j) -[:TO]-> (k)

glilienfield · ‎02-07-2022

Try removing the WHERE clause before the AND clauses so you have WHERE row.a is not null AND row.b is not null AND row.c is not null, etcs

Did your statement run without error with that syntax?

matt8mb · ‎02-08-2022

Thank you for your answer.
I have removed the ANDs, but it does not run anything... Do you have any ideas why this is?

Uploading: Screen Shot 2022-02-08 at 3.36.45 PM.png...

glilienfield · ‎02-08-2022

The screenshot did not render. Also, can you provide the test data, or is it proprietary?

matt8mb · ‎02-08-2022

This is part of the data, (I can't show more than this):

Thank you for your help

matt8mb · ‎02-08-2022

glilienfield · ‎02-08-2022

The MERGE clause will not create new nodes after you have run the script once, so no changes. Try deleting everything and running it.

Also, what is the warning the browser is giving you?

matt8mb · ‎02-08-2022

I've tried it but it is not working again.
Additionally, I check if the csv was properly copied in the imported file, which it is

glilienfield · ‎02-08-2022

I just noticed in the records you are showing, that no row has all four attributes as not null, so the WHERE clause will be false for the rows shown (1-26).

Perhaps the data model is not correct. Are each column supposed to be a separate entity, or are some attributes of an aggregate entity, as "Tree?" The attributes in columns F-L would be attributes of a Tree node that has id = tree_id. No?

matt8mb · ‎02-08-2022

Yes that is correct. No row has all four attributes as not null.
You are also correct. Each columns are attributes of an entity Tree (of individual tree_id).

Does that mean that I am supposed to use OR ?
Just as a reminder, I am required to represent everything from the basis of the company (which company owns which tree and all their attributes).

glilienfield · ‎02-08-2022

Seems like you want something like the following:

If true, you don't want the merges for f-l, as those are attributes you sent on node e.

Also, node d is not need, as creation_date is a property of another node, plot or tree?

matt8mb · ‎02-08-2022

This is exactly what I need.
Once I have this, how do I import all the attributes (from f to l) of the tree in the node? As in, how to I set multiple labels on the tree?

I would want Company -> Country -> Tree_ID (with all the labels: creation date, height etc.).

Thank you very much for your time.

glilienfield · ‎02-08-2022

Is the plot_id an attribute of a tree, or is plot an entity itself that has multiple trees associated to it?

matt8mb · ‎02-08-2022

I am not sure what it is and decided I will not use it in the graph.

matt8mb · ‎02-08-2022

I have been looking up how to put multiple label from a csv document, but cannot find anything. Do you have any ideas?

ameyasoft · ‎02-08-2022

Try this. Use COALESCE to handle null values.
LOAD CSV WITH HEADERS from 'file:///tree.csv' as row 

MERGE (b:Country {Name:row.country})

MERGE (a:Company {Name:row.company})

MERGE (c:Plot {Name:row.plot_id, creationDate:row.creation_date}})

MERGE (e:Tree{Name:row.tree_id, height:row.tree_height, crownDepth:row.crown_depth, trunkStructure: COALESCE(row.trunk_structure, 'NA'), singleTrunkCircumference:COALESCE(row.single_trunk_circumference, 'NA'), branchNumber: COALESCE(row.branch_number, 'NA'), multipleTrunkDiameter: COALESCE(ow.multiple_trunk_diameter, 'NA'), multipleTrunkCircumference: COALESCE(row.multiple_trunk_circumference, 'NA')})

MERGE (b)-[:COMPANY]->(a)
MERGE (a)-[:PLOT]->(c)
MERGE (c)-[:TREE]->(e)

glilienfield · ‎02-08-2022

@ameyasoft provide a solution. I would just recommend using HAS_COMPANY, HAS_PLOT, and HAS_TREE for the relationship types. Also, eliminate the Plot node if you don't want it; merging Company to Tree directly.

matt8mb · ‎02-08-2022

Thank you both really much for your help.

Here is the Cypher used:
LOAD CSV WITH HEADERS from 'file:///tree.csv' as row with row where row.tree_id is not null

MERGE (b:Country {Name:row.country})

MERGE (a:Company {Name:row.company})

MERGE (c:TREE {Name:row.tree_id, date:row.creation_date, height:row.tree_height, crownDepth:row.crown_depth, trunkStructure: COALESCE(row.trunk_structure, 'NA'), singleTrunkCircumference:COALESCE(row.single_trunk_circumference, 'NA'), branchNumber: COALESCE(row.branch_number, 'NA'), multipleTrunkDiameter: COALESCE(row.multiple_trunk_diameter, 'NA'), multipleTrunkCircumference: COALESCE(row.multiple_trunk_circumference, 'NA')})

MERGE (b)-[:HAS_COMPANY]->(a)
MERGE (c)-[:HAS_COUNTRY]->(b)

It worked to some extent. However, it only shows the results for one company (there are two in total).
Any suggestions?

ameyasoft · ‎02-08-2022

Same query works for any number Countries/Companies. Think of MERGE as attempting a MATCH on the pattern, and if no match is found, a CREATE of the pattern. Each Country node will have it's own child nodes. Run the same code for two or more sets of data. If your .csv file has data for two (or more) countries you will see that many number of subgraphs.

Please change this:
MERGE (c)-[:HAS_COUNTRY]->(b)
to
MERGE (a)-[:HAS_TREE]->(c)
 
Final relationships:
MERGE (b)-[:HAS_COMPANY]->(a)
MERGE (a)-[:HAS_TREE]->(c)

(b)-[]->(a)-[]->(c)

glilienfield · ‎02-08-2022

The relationships seem odd to me. I read it as stating a 'tree has a country and a country has a company.' Is it not more like 'a country has a company and a company has a tree?'

The query looks correct. Does your data have a second country in the csv file? If so, is the country on all rows that have tree_id that is null, so it is getting filtered out?

Neo4j

Importing a CSV doc "Cannot merge the following node because of null property value for 'Name': " error