Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
02-15-2021 11:39 AM
Hi all,
I am new to Neo4j or even graph databases for that matter. I am struggling with building my first graph. I would be very if the community would like to assist me with this basic question. I can not get it to work when following the instructions: How-To: Import CSV Data with Neo4j Desktop - Developer Guides
My case also involves three csv-files (two entities and one relationship) The csv's have the following layout:
CSV1: entity1_id, entity1_name
CSV2: entity2_id, entity2_name
CSV3: id, entity1_id, entity2_id (many:many relationship)
I would create a relationship using CSV3, however the result when running the following cypher code:
LOAD CSV WITH HEADERS FROM 'file:///CSV3.csv' AS row
WITH toInteger(row.id) AS relation_id, toInteger(row.entity1_id) AS entity1_id, toInteger(row.entity2_id) AS entity2_id
MATCH (e1:CSV1 {entity1_id:entity1_id})
MATCH (e2:CSV2 {entity2_id:entity2_id})
MERGE (e1)-[rel:CONTAINS]->(e2)
RETURN count(rel);
This yield 0 relationships. Could someone help me out on setting the correct relationships between the entities?
Thanks in advance!
02-15-2021 04:29 PM
Hi,
Based on your query you are trying to MATCH
on the nodes you want to connect. The question I have is do they exist yet? If not, try switching to use MERGE
instead of match. The relationships are likely not being created if the nodes don't exist.
02-16-2021 12:32 AM
Hi Michael,
Thank you for your prompt response. I used something along the following Cypher to construct the nodes (according to the mentioned link).
LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row
WITH toInteger(row.orderID) AS orderId, datetime(replace(row.orderDate,' ','T')) AS orderDate, row.shipCountry AS country
RETURN orderId, orderDate, country
LIMIT 5;
I assume this constructs a node? Or am I being ignorant?
02-16-2021 04:56 AM
Unfortunately it looks like you never actually created the nodes; you only returned the values from the CSV to the browser. You need to use CREATE
or MERGE
to create the nodes. Something like:
LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row
WITH toIntegeer(row.orderID) AS orderId,
datetime(replace(row.orderDate(, ' ', 'T')) AS orderDate
row.shipCountry AS country
CREATE (order:Order {orderId: orderId}) // MERGE would also work here assuming each order number is unique
SET order.orderDate = orderDate,
order.country = country
One things to consider with converting the order ID to an integer is that it will remove leading zeros. That may or many not be an issue depending on how you want to use that property. Just something to think about.
02-16-2021 11:54 AM
Thank you! Can I ask you a final question? My script:
// Set constraints on unique keys of files.
CREATE CONSTRAINT UniqueInformatiedomeinGroep ON (idg:Informatiedomeingroep) ASSERT idg.id IS UNIQUE;
CREATE CONSTRAINT UniqueInformatiedomein ON (id:Informatiedomein) ASSERT id.id IS UNIQUE;
CREATE CONSTRAINT UniqueBedrijfsactiviteit ON (ba:Bedrijfsactiviteit) ASSERT ba.id IS UNIQUE;
CREATE CONSTRAINT UniqueInformatieObject ON (io:Informatieobject) ASSERT io.id IS UNIQUE;
// Load informatiedomein_groepen nodes.
LOAD CSV WITH HEADERS FROM 'file:///informatiedomein_groepen.csv' AS row
WITH toInteger(row.id) AS groepId
, row.groep AS groepNaam
MERGE (idg:Informatiedomeingroep {groepId: groepId})
SET idg.groepNaam = groepNaam;
// Load informatiedomeinen nodes.
LOAD CSV WITH HEADERS FROM 'file:///informatiedomeinen.csv' AS row
WITH toInteger(row.id) AS informatiedomeinId
, row.informatiedomein AS informatiedomeinNaam
, row.informatiedomein_groep_id AS informatiedomeinGroepId
MERGE (id:Informatiedomein {informatiedomeinId: informatiedomeinId})
SET id.informatiedomeinNaam = informatiedomeinNaam
, id.informatiedomeinGroepId = informatiedomeinGroepId;
// Load bedrijfsactiviteiten nodes.
LOAD CSV WITH HEADERS FROM 'file:///bedrijfsactiviteiten.csv' AS row
WITH toInteger(row.id) AS bedrijfsactiviteitId
, row.bedrijfsactiviteit AS bedrijfsactiviteitNaam
, toInteger(row.informatiedomein_id) AS informatiedomeinId
MERGE (ba:Bedrijfsactiviteit {bedrijfsactiviteitId: bedrijfsactiviteitId})
SET ba.bedrijfsactiviteitNaam = bedrijfsactiviteitNaam
, ba.informatiedomeinId = informatiedomeinId;
// Load informatieobjecten nodes.
LOAD CSV WITH HEADERS FROM 'file:///informatieobjecten.csv' AS row
WITH toInteger(row.id) AS informatieobjectId
, row.informatieobject AS informatieobjectNaam
, toInteger(row.informatiedomeinId) AS informatiedomeinId
MERGE (io:Informatieobject {informatieobjectId: informatieobjectId})
SET io.informatieobjectNaam = informatieobjectNaam
, io.informatiedomeinId = informatiedomeinId;
// Create relationships.
MATCH (io:Informatieobject)
, (ba:Bedrijfsactiviteit)
, (idg:Informatiedomeingroep)
, (id:Informatiedomein)
WHERE idg.groepId = id.informatiedomeinGroepId
AND id.informatiedomeinId = ba.informatiedomeinId
AND id.informatiedomeinId = io.informatiedomeinId
CREATE (idg)-[cid:CONTAINS_ID]->(id)
, (id)-[cba:CONTAINS_BA]->(ba)
, (id)-[cio:CONTAINS_IO]->(io)
RETURN idg, id, ba, io;
However, the relationships returns 0 changes, what am I doing wrong?
02-16-2021 02:36 PM
It's a bit tough to follow with how you have labeled everything. But I would say try creating the relationships in 3 separate queries and see if that helps.
Have you queried the nodes after each node creating step to see if the properties are as you intended?
02-16-2021 06:20 PM
Run this query and check the results:
MATCH (io:Informatieobject)
, (ba:Bedrijfsactiviteit)
, (idg:Informatiedomeingroep)
, (id:Informatiedomein)
WHERE idg.groepId = id.informatiedomeinGroepId
AND id.informatiedomeinId = ba.informatiedomeinId
AND id.informatiedomeinId = io.informatiedomeinId
RETURN io, ba, idg, id
02-17-2021 12:09 AM
Hi,
I got the query to work. After mckenzma's advice on breaking the relationships up in three separate queries, I was able to debug the script. After inspecting the properties of the nodes that I trying to match, I saw that one of the properties was a different datatype. Which I overlooked. I am grateful for your assistance!
02-17-2021 05:55 AM
I am glad you were able to figure it out!
I tend to try and break up node and relationship creation into separate queries for that reason. It becomes easier to debug errors and issues. Especially for larger data sets, I have learned that running a query to create a set of nodes and then running a query to connect those nodes can be more performant than trying to do it at the same time. And it becomes easier to potentially run only parts of the queries you need to when/if something changes.
Good luck!
05-04-2021 12:38 PM
My recommendation, especially for Neo4J newbies is take it step by step and try on very small files that are samples of your data.
All the sessions of the conference are now available online