Creating the simplest possible graph

I'm scoping out Neo4j on the possibility that we might eventually deploy graph databases at work. Though I've not had much trouble following the tutorials, somehow I can't seem to make a simple graph with three barebones csv files.

Here's my region data.


My division data looks almost exactly the same.

Here's my relationship data.

The first two .csv files I loaded with:

LOAD CSV WITH HEADERS FROM 'file:///location_dim.csv' AS row
WITH row.REGION_NM AS REGION_NM
CREATE (n {reg: REGION_NM})
return n

and:

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
CREATE (n {div: DIVISION_NM})
return n

The relationship I loaded in with:

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from {id: rels.REG}), (to {id: rels.DIV})
create (from)-[:REL {type: rels.`RELATIONSHIP`}]->(to)
return from, to

For the most part this is all taken from tutorials, but isn't working for me. There are two problems.

First, the data loads just fine but the nodes aren't labeled.

And second, when I load in the relationships nothing actually happens.

What I want is a graph with region nodes, and region nodes with division nodes hanging off of them where indicated.

Unfortunately I haven't kept a record of every experiment I've tried, but I've altered the syntax in various small ways (not returning anything, using create (from)-[:hasDiv]->(to) instead of create (from)-[:REL {type: rels.RELATIONSHIP}]->(to), etc.).

I've also read a non-trivial amount of the documentation and searched the forums for threads.

Any advice?

9 REPLIES

First, the data loads just fine but the nodes aren't labeled.

Typically labels are applied via

create (n:<label> {<properties>})

and thus

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
CREATE (n:Division {div: DIVISION_NM})
return n

to load nodes with a label named Division.

And second, when I load in the relationships nothing actually happens.

you might change

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from {id: rels.REG}), (to {id: rels.DIV})
create (from)-[:REL {type: rels.`RELATIONSHIP`}]->(to)
return from, to

to

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

This presumes that, when you run LOAD CSV against location_dim.csv, the

CREATE (n {reg: REGION_NM})

is changed to

CREATE (n:Region {reg: REGION_NM})

Dana,

Thanks so much for taking the time to answer. Your amended code did create labeled nodes, but unfortunately the relationship script didn't accomplish anything. I got a (no changes, no records) message.

I didn't forget to use CREATE (n:Region {reg: REGION_NM}) for the region script.

OK, I think I see the problem.
To create the relationships, your code is

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

But :Region nodes do not have an id property, and neither do :Division nodes. When you created the nodes, you gave :Region nodes a property named reg and :Division nodes a property named div, so the match on id finds nothing and no relationships are created.

This can be confirmed for example by running

match (n:Region) return n limit 3;
match (n:Division) return n limit 3;

which will return up to 3 :Region nodes and up to 3 :Division nodes, along with their properties.
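If only the property names matter, a slightly narrower check is to list the keys on a few nodes of each label. This is just a sketch; it assumes the :Region and :Division labels from the load scripts above, and the alias property_keys is an arbitrary name.

```cypher
// List the property names present on a few :Region nodes.
// If the result shows ["reg"] and no "id", a match on {id: ...} can never succeed.
MATCH (n:Region) RETURN keys(n) AS property_keys LIMIT 3;

// Same check for :Division nodes (expected to show "div" rather than "id").
MATCH (n:Division) RETURN keys(n) AS property_keys LIMIT 3;
```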
And if I am correct then

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

should be rewritten as

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to
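If the rewritten statement still reports no changes, one possible way to see which side fails to match is a read-only pass over the relationship file using OPTIONAL MATCH. This is a sketch, assuming the REG and DIV column headers and the reg and div properties used above; the aliases region_found and division_found are made up for illustration.

```cypher
// Read-only diagnostic: for each CSV row, report whether a matching node exists.
// Nothing is created or changed by this query.
LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
OPTIONAL MATCH (from:Region {reg: rels.REG})
OPTIONAL MATCH (to:Division {div: rels.DIV})
RETURN rels.REG AS reg,
       from IS NOT NULL AS region_found,
       rels.DIV AS div,
       to IS NOT NULL AS division_found
LIMIT 10;
```

Rows where either flag is false point at values in the relationship file that have no counterpart among the loaded nodes (for example because of differing spelling or stray whitespace).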

One last thing. If you continue, and if you have larger datasets, then to help the

match (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})

ideally you should create an index on :Region and :Division and on the properties reg and div respectively. See Indexes for search performance - Neo4j Cypher Manual for more details/syntax.
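As a rough sketch of what that could look like, assuming Neo4j 4.x or later (the index names region_reg and division_div are arbitrary choices, not anything from the original data):

```cypher
// One index per label/property pair used in the MATCH above.
// IF NOT EXISTS makes the statements safe to rerun.
CREATE INDEX region_reg IF NOT EXISTS FOR (r:Region) ON (r.reg);
CREATE INDEX division_div IF NOT EXISTS FOR (d:Division) ON (d.div);

// On older 3.x versions the equivalent form was:
//   CREATE INDEX ON :Region(reg)
//   CREATE INDEX ON :Division(div)
```

Which form applies depends on the server version, so the manual page for the installed version is the thing to check.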

Note that the relationship creation really is a create: if you run the Cypher to create the relationships 5 times, you may end up with 5 identical relationships between the same 2 nodes. However, if you change

create (from)-[:hasDiv]->(to)

to

merge (from)-[:hasDiv]->(to)

then merge acts as a match or create: it creates the relationship only if an identical one does not already exist. So if you run the script to create the relationships 5 times, you would expect to see no more than 1 :hasDiv relationship between any 2 nodes.
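To check whether repeated runs have already left duplicate relationships behind, something along these lines should work. It is a sketch that assumes the :hasDiv relationship type and the reg and div properties from this thread; the alias rel_count is arbitrary.

```cypher
// Find pairs of nodes connected by more than one :hasDiv relationship.
MATCH (r:Region)-[rel:hasDiv]->(d:Division)
WITH r, d, count(rel) AS rel_count
WHERE rel_count > 1
RETURN r.reg AS region, d.div AS division, rel_count
ORDER BY rel_count DESC;
```

An empty result means every region/division pair is linked at most once.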

Regarding your image/screenshot: maybe start all over by

match (n:Division) detach delete n;
match (n:Region) detach delete n;

The first statement finds all :Division nodes, removes any relationships attached to them, and then deletes the nodes themselves. The second statement does the same for :Region nodes.

Upon running the 2 lines above all :Region and :Division nodes should be removed.
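A quick way to confirm the cleanup, sketched here with an arbitrary alias named remaining:

```cypher
// Count any nodes that still carry either label; 0 is expected after the deletes.
MATCH (n)
WHERE n:Region OR n:Division
RETURN count(n) AS remaining;
```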

Then rerun the LOAD CSV statements, and when creating :Division and :Region nodes change the create to a merge.

Sorry, our posts crossed. But yes, @trent.fowler, your last post with all the merges is what should be used.

You're awesome, thank you for the help!

Hey, that worked!

Now I just have to figure out what to do with all these duplicate divisions.

But it's progress
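For anyone landing here with the same symptom, a query roughly like the following lists which division names ended up on more than one node. It is only a sketch, assuming the :Division label and div property from the load script; the alias copies is arbitrary.

```cypher
// Group :Division nodes by name and keep only names that occur more than once.
MATCH (d:Division)
WITH d.div AS division, count(*) AS copies
WHERE copies > 1
RETURN division, copies
ORDER BY copies DESC;
```

Duplicates like these are what you would expect when the CSV repeats a name and each row is loaded with CREATE, since CREATE makes a new node per row.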


Good point! Replacing the CREATEs with MERGEs cleared that right up.

Here's everything, producing a simple, labeled graph with no ridiculous duplicates:

LOAD CSV WITH HEADERS FROM 'file:///location_dim.csv' AS row
WITH row.REGION_NM AS REGION_NM
MERGE (n:Region {reg: REGION_NM})
RETURN n

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
MERGE (n:Division {div: DIVISION_NM})
RETURN n

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
MATCH (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})
MERGE (from)-[:hasDiv]->(to)
RETURN from, to
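
As a final sanity check on the result, a query along these lines lists each region with the divisions hanging off it. This is a sketch that assumes the labels, properties, and :hasDiv type from the three statements above; the aliases region and divisions are arbitrary.

```cypher
// One row per region, with its divisions collected into a list.
// OPTIONAL MATCH keeps regions that have no divisions (they get an empty list).
MATCH (r:Region)
OPTIONAL MATCH (r)-[:hasDiv]->(d:Division)
RETURN r.reg AS region, collect(d.div) AS divisions
ORDER BY region;
```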