cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

How to load CSV matrix and iterate through columns to import relationships?

I want to import a CSV matrix with a header line and NxN cells to import including the specific relationship type which looks like the following:

id, name, element1, element2,..., elementX
1, element1,,1;2,2;5,,1;2,...
2, element2,3,,4;5,...
...
N, elementX

Using the cypher code below only imports one column, how do I iterate through each column?

LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS line WITH line
WHERE line.element1 IS NOT NULL
Match (c:element{id: toInteger(line.id),name: line.name}) 
Match (d:element{name:'element1'})
Merge (d)-[:REL{type: line.element1}]-(c)
return c,d
1 ACCEPTED SOLUTION

Run a PROFILE of your query (with some LIMIT for your row) and see where the number of rows during processing goes to 0. That should be an indicator of what's failing (likely a MATCH to existing elements, especially if the existing elements had extra spaces you didn't clean up).

You can check the parsing of the keys at least with this:

LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS line
WITH line, [key in keys(line) WHERE key STARTS WITH 'element' AND line[key] IS NOT NULL] as keys
RETURN keys 
LIMIT 10

This should show you the list of columns/keys per row with a corresponding non-null value for the row. If you see these, then more than likely the MATCHes are the problem, and you'll want to reexamine your assumptions about them and do your own investigation into what's going on with the data for those nodes, to make sure it's what you expect.

View solution in original post

7 REPLIES 7

In the future, you probably should avoid a variable number of header columns and instead stick to a more rigid format. If you have to do the same things on all of them, just use two elements per row, so each pairing would have its own row in the CSV.

That said, you CAN process this, though it is a bit ugly. Something like this:

LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS line
WITH line, [key in keys(line) WHERE key STARTS WITH 'element' AND line[key] IS NOT NULL] as keys
Match (c:element{id: toInteger(line.id),name: line.name}) 
Match (d:element) // d will address all elements associated with element columns for which there is an entry per row
WHERE d.name in keys
Merge (d)-[:REL{type: line[d.name]}]-(c)

Thanks for your reply, somehow I end up with no changes / records. I get your point with the header.

More than likely this is because your headers aren't just separated with commas, they're separated with commas followed by a space, so that space gets included in the header name, so you have "id", " name", " element1" etc. That would cause failures on some matches, but especially on the STARTS WITH part, since it doesn't include the leading space.

Fix up your headers and your data, those spaces are going to be problematic, not just in your headers but in your row data too.

There are no more spaces, but still no changes & records. I tried several approaches and only the line-by-line manual load CSV works, and not the iteration through all columns. Any idea?

id,name,element1,element2,element3,element4,element5,...
1,element1,,M,,,,,,,,...
2,element2,M,,M,,M,,,,...
3,element3,,M,,,M,,,,,,,...
4,element4,,,,,M,,,,,,...
5,element5,,M,M,M,,...
...

Thanks

Run a PROFILE of your query (with some LIMIT for your row) and see where the number of rows during processing goes to 0. That should be an indicator of what's failing (likely a MATCH to existing elements, especially if the existing elements had extra spaces you didn't clean up).

You can check the parsing of the keys at least with this:

LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS line
WITH line, [key in keys(line) WHERE key STARTS WITH 'element' AND line[key] IS NOT NULL] as keys
RETURN keys 
LIMIT 10

This should show you the list of columns/keys per row with a corresponding non-null value for the row. If you see these, then more than likely the MATCHes are the problem, and you'll want to reexamine your assumptions about them and do your own investigation into what's going on with the data for those nodes, to make sure it's what you expect.

Thanks for your input, I finally managed to import all relations. There was a mismatch in the match clause, I had to create the elements first respectively.

Hey there, any chance you could post the code you finally landed on? Attempting to do the same type of work (matching elements on both axis and relationships down rows and columns.)

I'd appreciate it!