Load CSV with multiple relations on the same two nodes

oli
Graph Buddy

I would like to load a CSV file so that it creates a limited number of nodes but multiple relations between them. For example, I have only 1 subject and, say, 2 objects, with several types of relations between the subject and the objects, so my file might look like this:

s     p     o
-------------
a rel_type1 b
a rel_type2 b
a rel_type3 b
a rel_type4 b
a rel_type2 c
a rel_type3 c
a rel_type6 c

In this case, how can I create a graph with only one subject node, two object nodes, and multiple relations? I tried a toy case where I created the nodes first and then merged the relations one by one, and it works, but that does not scale to large CSV files, so I would like to automate the process.

Thanks
O

Hello @oli

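First, you must create indexes on the properties you will match on (uniqueness constraints on the same properties would also work, since they are backed by an index):

CREATE INDEX index_subject_name IF NOT EXISTS FOR (n:Subject) ON (n.name);
CREATE INDEX index_object_name IF NOT EXISTS FOR (n:Object) ON (n.name);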
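
If each name should identify exactly one node, uniqueness constraints are an alternative to plain indexes. The following is only a sketch: it assumes Neo4j 4.4 or later (older 4.x releases use a different `ON ... ASSERT` form), and the constraint names are made up for illustration. It is meant to be used instead of the index statements, not in addition to them, because a constraint cannot be created on a property that already has a plain index.

```cypher
// Assumption: Neo4j 4.4+ syntax; constraint names are hypothetical.
// Each constraint also creates a backing index, so lookups by name stay fast.
CREATE CONSTRAINT constraint_subject_name IF NOT EXISTS
FOR (n:Subject) REQUIRE n.name IS UNIQUE;

CREATE CONSTRAINT constraint_object_name IF NOT EXISTS
FOR (n:Object) REQUIRE n.name IS UNIQUE;
```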

Then, you load Subject nodes:

LOAD CSV WITH HEADERS from "file:///C:/test.csv" AS line
WITH DISTINCT line.subject AS subject
MERGE (:Subject {name: subject})

Then, you load Object nodes:

LOAD CSV WITH HEADERS from "file:///C:/test.csv" AS line
WITH DISTINCT line.object AS object
MERGE (:Object {name: object})
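
A quick way to check the two node-loading steps is to count the nodes per label. This sketch assumes the CSV headers are named `subject`, `relation_type` and `object`, as the queries here do; with the sample rows from the question one would then expect a single Subject node and two Object nodes.

```cypher
// Count how many nodes exist for each label combination.
MATCH (n)
WHERE n:Subject OR n:Object
RETURN labels(n) AS labels, count(*) AS nodes;
```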

To finish, you load relations (you will need the APOC plugin):

LOAD CSV WITH HEADERS from "file:///C:/test.csv" AS line
WITH line.subject AS subject, line.object AS object, line.relation_type AS relation_type
MATCH (s:Subject {name: subject})
MATCH (o:Object {name: object})
CALL apoc.merge.relationship(s, relation_type, {}, {}, o, {})
YIELD rel
RETURN rel
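
To unpack the procedure call: APOC is used here because the relationship type comes from a CSV column, whereas a plain Cypher MERGE traditionally needs the type written literally in the query. Below is the same step with each positional argument commented; it is a sketch of the call above rather than a different technique, and it returns a count per type instead of every relationship.

```cypher
LOAD CSV WITH HEADERS FROM "file:///C:/test.csv" AS line
MATCH (s:Subject {name: line.subject})
MATCH (o:Object {name: line.object})
CALL apoc.merge.relationship(
  s,                   // start node
  line.relation_type,  // relationship type, read from the CSV as a string
  {},                  // identifying properties: part of the MERGE pattern
  {},                  // properties set only when the relationship is created
  o,                   // end node
  {}                   // properties set only when the relationship already exists
)
YIELD rel
RETURN type(rel) AS type, count(*) AS relationships;
```

With empty identifying properties, at most one relationship of a given type is kept between the same two nodes, so re-running the import does not create duplicates.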

Splitting the import into 3 LOAD CSV requests (subjects, objects, relations) keeps each one optimized. It is possible to do everything in a single request, but it will be slower for large files.
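
For completeness, a single-request version might look roughly like this. It is a sketch under the same assumptions as above (headers `subject`, `relation_type`, `object`, and the APOC plugin installed), not a tuned import.

```cypher
LOAD CSV WITH HEADERS FROM "file:///C:/test.csv" AS line
// Create each node only if it does not exist yet.
MERGE (s:Subject {name: line.subject})
MERGE (o:Object {name: line.object})
WITH s, o, line
// Create the typed relationship only if it does not exist yet.
CALL apoc.merge.relationship(s, line.relation_type, {}, {}, o, {})
YIELD rel
RETURN count(rel) AS relationships;
```

Here the two MERGE clauses run once per CSV row rather than once per distinct name, which is part of why this form tends to be slower on large files.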

Regards,
Cobra

Hi Maxime, thanks for your answer. I understand the first 3 steps, but I don't quite understand the 4th step, especially this call:

CALL apoc.merge.relationship(s, relation_type, {}, {}, o, {})

And what if my relation is a bit more complex? For example, suppose my relation is defined as follows:

[r:takePlace {start_time:line.start_time, end_time:line.end_time}]

In this case there can be several takePlace relations at different times between the same subject and object. How could I handle this?

Do you want to have the same relation between two nodes but with different properties?

Yeah, the relation is for example :takePlace and the difference is the time (start_time, end_time)

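OK, so you must use create instead of merge, i.e. apoc.create.relationship instead of apoc.merge.relationship (this reply was marked as the accepted solution):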

LOAD CSV WITH HEADERS from "file:///C:/test.csv" AS line
WITH line.subject AS subject, line.object AS object, line.relation_type AS relation_type, line.start_time AS start_time, line.end_time AS end_time
MATCH (s:Subject {name: subject})
MATCH (o:Object {name: object})
CALL apoc.create.relationship(s, relation_type, {start_time: start_time, end_time: end_time}, o)
YIELD rel
RETURN rel;
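
One thing to keep in mind with the create version is that running the import twice creates every relationship twice. If that matters, a possible alternative (a sketch, not part of the accepted answer) is to keep apoc.merge.relationship and pass the time properties as identifying properties, so that relations with different times stay separate while identical rows are merged. Rows with a missing time are skipped here on the assumption that merging on a null property value would fail.

```cypher
LOAD CSV WITH HEADERS FROM "file:///C:/test.csv" AS line
WITH line
// Assumption: skip rows without both times, since they cannot identify a relation.
WHERE line.start_time IS NOT NULL AND line.end_time IS NOT NULL
MATCH (s:Subject {name: line.subject})
MATCH (o:Object {name: line.object})
CALL apoc.merge.relationship(
  s,
  line.relation_type,
  {start_time: line.start_time, end_time: line.end_time},  // identifying properties
  {},
  o,
  {}
)
YIELD rel
RETURN count(rel) AS relationships;
```

Note that LOAD CSV reads every field as a string, so start_time and end_time are stored as strings in both versions unless they are converted explicitly.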