Connecting one node to multiple

Hello,

My problem is that whenever I import data from a large CSV and connect it to a central node, the import seems to create many copies of the central node (see photo). The brown nodes should all be connected to a single "Earth Justice" node. I realize I can merge the duplicate nodes afterwards, but I would like to get it right at load time.

My code is as follows:

:auto Using periodic commit
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' as row
CREATE (l:labid) SET l+=row
CREATE (o:Origin {name:"Earth Justice"})
MERGE (l)<-[:Created]-(o)

Thanks in advance.

1 ACCEPTED SOLUTION

Benoit_d
Graph Buddy

Hi,

To ensure that MERGE recognizes the node you are addressing, you should create a uniqueness constraint on one property of the node label before loading the data.

CREATE CONSTRAINT ON (orig:Origin) ASSERT orig.name IS UNIQUE

Then you will be able to MERGE the "Origin"-labelled node without recreating it: the first occurrence of "Earth Justice" creates the node, and every later occurrence matches it.

:auto Using periodic commit
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' as row
CREATE (l:labid) SET l+=row
MERGE (o:Origin {name:"Earth Justice"})
MERGE (l)<-[:Created]-(o)
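
For readers on a newer server, here is a rough sketch of the same constraint-plus-MERGE approach. It assumes Neo4j 4.4 or later, where constraints can be written with FOR ... REQUIRE and batching with CALL { ... } IN TRANSACTIONS (Neo4j 5 no longer accepts the ON ... ASSERT form or USING PERIODIC COMMIT). The constraint name origin_name_unique and the batch size of 1000 are made up for illustration.

```cypher
// Statement 1: uniqueness constraint (the constraint name is hypothetical).
CREATE CONSTRAINT origin_name_unique IF NOT EXISTS
FOR (o:Origin) REQUIRE o.name IS UNIQUE;

// Statement 2: run after the constraint exists.
// In Neo4j Browser the :auto prefix is needed for IN TRANSACTIONS.
:auto LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
CALL {
  WITH row
  // Matches the single Origin node, or creates it on the first row only.
  MERGE (o:Origin {name: "Earth Justice"})
  // One new labid node per CSV row, carrying all columns as properties.
  CREATE (l:labid)
  SET l += row
  MERGE (o)-[:Created]->(l)
} IN TRANSACTIONS OF 1000 ROWS
```

The point stays the same as above: the central node is merged on its own, so only the first row can create it.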

Note that this pattern is meant for a node whose properties come from the file, which is not the case in your Cypher: (o:Origin {name:"Earth Justice"}) references no column.
If the origin node already exists, just match it:

:auto Using periodic commit
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' as row
CREATE (l:labid) SET l+=row
MATCH (o:Origin {name:"Earth Justice"})
MERGE (o)-[:Created]->(l)

In some cases, for instance if you are delivering a new origin to which every node from the source has to be connected, but a node with that name already exists, you will have to:

  • create a temporary label, e.g. tempOrigin,
  • create a constraint on this label,
  • load the data,
  • drop the constraint (change "Create Constraint on ..." into "Drop Constraint on ..."),
  • add the label Origin to all nodes with the label tempOrigin (there should be only one),
  • remove the label tempOrigin from those nodes.

A piece of cake 😉
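
The steps above could look roughly like this. It is only a sketch, using the same constraint syntax as earlier in this answer and the tempOrigin label suggested in the list. It assumes there is no uniqueness constraint on Origin.name at the time of relabelling, since adding the Origin label to a second node with the same name would violate one.

```cypher
// Steps 1-2: constraint on the temporary label.
CREATE CONSTRAINT ON (t:tempOrigin) ASSERT t.name IS UNIQUE;

// Step 3: load the data, merging the new origin under the temporary label.
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
MERGE (t:tempOrigin {name: "Earth Justice"})
CREATE (l:labid)
SET l += row
MERGE (t)-[:Created]->(l);

// Step 4: drop the constraint again.
DROP CONSTRAINT ON (t:tempOrigin) ASSERT t.name IS UNIQUE;

// Steps 5-6: add the Origin label and remove the temporary one.
MATCH (t:tempOrigin)
SET t:Origin
REMOVE t:tempOrigin;
```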

7 REPLIES

MERGE can be a bit confusing to use, I agree; there are nuances that even experienced users run into again (and again). At a glance, I think you may be explicitly creating the nodes, and so creating duplicates after the first row. The MERGE statement itself is perhaps OK.

ameyasoft
Graph Maven
Try this:
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' as row

MATCH (o:Origin {name:"Earth Justice"})
with o
CREATE (l:labid) SET l+=row
MERGE (l)<-[:Created]-(o)

After copying and pasting your example, I get this error message.

Variable `row` not defined (line 5, column 25 (offset: 130))
"CREATE (l:labid) SET l+=row"

It doesn't quite make sense to me why that doesn't work.
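
The likely cause is the `with o` line: a WITH clause carries forward only the variables it lists, so `row` is out of scope from that point on. A minimal sketch of a corrected version, assuming the Origin node already exists (otherwise the MATCH returns nothing and no rows are loaded):

```cypher
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
MATCH (o:Origin {name: "Earth Justice"})
// List every variable that is still needed; "WITH o" alone drops row.
WITH o, row
CREATE (l:labid)
SET l += row
MERGE (l)<-[:Created]-(o)
```

The WITH is not strictly needed here, since a CREATE may follow a MATCH directly; deleting the line altogether should work as well.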

Thanks, appreciate the help. The constraint was a good tip.

One more thing: for the third block of code, I get this error message:

WITH is required between SET and MATCH

Adding a "WITH" doesn't make the code run, however. Thanks again for the help!

Benoit_d
Graph Buddy

Move the MATCH before the CREATE and the SET to the end, so that no MATCH follows a writing clause:

:auto Using periodic commit
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' as row
MATCH (o:Origin {name:"Earth Justice"})
CREATE (l:labid)
MERGE (o)-[:Created]->(l)
SET l+=row

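or merge the origin node first and the relationship second (a single MERGE over the whole pattern would create a new Origin node for every row, because the freshly created l never matches an existing pattern):

:auto Using periodic commit
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' as row
CREATE (l:labid)
MERGE (o:Origin {name:"Earth Justice"})
MERGE (o)-[:Created]->(l)
SET l+=row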
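
If the original clause order is preferred, another option is to put a WITH between the writing clauses and the MATCH, which is what the "WITH is required between SET and MATCH" message asks for. A sketch, assuming the Origin node already exists:

```cypher
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
CREATE (l:labid)
SET l += row
// WITH separates the write (CREATE/SET) from the read (MATCH).
// Only l is needed afterwards, because row has already been applied.
WITH l
MATCH (o:Origin {name: "Earth Justice"})
MERGE (o)-[:Created]->(l)
```

One caveat with this ordering: the labid nodes are created before the MATCH runs, so if no Origin node is found they stay in the graph unconnected.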