Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
02-17-2021 09:11 PM
I need some help in importing a CSV file and matching the contents of one column to existing nodes in my graph database. The CSV currently looks like this:
ContentName, URL, Creator, Tags, Comments
All of these columns contain strings. However, the "Tags" column contains individual words separated by commas, like so: Word1, Word2, Word3, Word4.
These words all have a corresponding node that is already in the database. I want to separate those Words from one another. Then I need to match, for example, Word1 in the CSV to Word1 as a node in the database and create a relationship. Is there a way to do this?
Solved! Go to Solution.
02-18-2021 09:05 PM
Try this:
//in .csv change this to:
//with row.Tags as tags.........
with "Basics, Sketching, Perspective" as tags
with split(tags, ",") as t1
// Change the appropriate node label.....
match (t:Tag)
where t.tag in t1
// creating a Contentname node (row 2).....I didn't add all the properties for this test......
merge (c:ContentName {name: "Drawing Fundamentals", creator: "Will Kemp", type: "Course"})
//code to create relationships between (c) and 'Tag' nodes......
merge (c)-[:hasTag]->(t)
return c, t
Result:
02-17-2021 10:49 PM
Assuming this:
(a:Tags {tags: "Word1, Word2, Word3, Word4"}) and
(b:Word {word:"Word2"})
match (a:Tags)
with split(a.tags, ",") as t1
//test
return t1[0], t1[1], t1[2], t1[3]
Result: Word1, Word2, Word3, Word4
match (b:Word)
where b.word = t1[1]
From here you continue further.
02-18-2021 11:04 AM
Thanks for responding. So I'm seeing that you're extracting the whole CSV string (Word1, Word2, etc.) into a node first, splitting its contents into an array, and then matching them to existing nodes with
match (b:Word)
where b.word = t1[1]
But this part of the code doesn't work, namely because I'm trying to transform each CSV row into its own node (with ContentName, URL, Creator, Comments) with labels. Then I want to connect this node to existing Word1/Word2/Word3 nodes in the database. This chart should make it clearer:
02-18-2021 05:37 PM
Post couple of data lines from your .csv file including the headers.
02-18-2021 06:28 PM
Is it fair to assume that you are using .csv file to create 'Content' nodes.? You show 'Tags', but is not clear how you are using 'Tags'.
02-18-2021 07:22 PM
Sorry for getting back late! Here's a few example data lines from the CSV file:
02-18-2021 07:36 PM
Thanks for the info. It's very helpful for me to design the code.
02-18-2021 09:05 PM
Try this:
//in .csv change this to:
//with row.Tags as tags.........
with "Basics, Sketching, Perspective" as tags
with split(tags, ",") as t1
// Change the appropriate node label.....
match (t:Tag)
where t.tag in t1
// creating a Contentname node (row 2).....I didn't add all the properties for this test......
merge (c:ContentName {name: "Drawing Fundamentals", creator: "Will Kemp", type: "Course"})
//code to create relationships between (c) and 'Tag' nodes......
merge (c)-[:hasTag]->(t)
return c, t
Result:
02-18-2021 09:32 PM
Thank you for taking the time to figure this out
But if I have thousands of rows in the CSV, is there a way to reference the contents of the CSV "Tags" column without having to type out what's inside? The same goes for the other columns.
02-18-2021 09:38 PM
I already mentioned in the comment to replace my WITH statements:
Replace my first WITH statement:
my sample:
with "Basics, Sketching, Perspective" as tags
//in .csv change this to:
with row.tags as tags
That's all the change!
02-18-2021 10:42 PM
It works now! Thank you so much for your expertise.
02-18-2021 10:51 PM
Glad to be of some help!
02-26-2021 06:19 PM
Got a new problem. For some reason my code is matching tags inconsistently with my existing nodes. For example, Drawing Fundamentals has the tags "Basics", "Sketching" and "Perspective". But as you can see from the chart, those tags have not been matched. Here is the code I adapted:
load csv with headers from "https://docs.google.com/spreadsheets/d/15Xvz8JgG_AztY6Ww18EuO5weoVDK4mD8bqGR0UlmGRM/export?format=csv" as csv
with csv.Tags as tags
with split(tags, ", ") as t1
match (n:skos__Concept)
where n.skos__prefLabel in t1
load csv with headers from "https://docs.google.com/spreadsheets/d/15Xvz8JgG_AztY6Ww18EuO5weoVDK4mD8bqGR0UlmGRM/export?format=csv" as csv
merge (c:ContentName {name: csv.ContentName, creator: csv.Creator, URL: csv.URL, Type: csv.Type})
merge (c)-[:hasTag]->(n)
return c, n
It produces this in Neo4j. As you can see, Drawing Fundamentals has not been matched to "Basics", "Sketching" and "Perspective".
All the sessions of the conference are now available online