Neo4j Cypher to Yield groups of nodes based on CSV column value

yamini_n

I am new to Neo4j and, by extension, to Cypher. I am working with Neo4j Enterprise 4.4.5. I am trying to import a CSV into Neo4j and generate nodes and relationships from it. The CSV has the column headers Class, Level, Title and Relationship. Each row describes one node. The Relationship column takes the values, say, A, B, C and D. I would like to create groups of nodes based on the Relationship column value, so that I end up with four groups: set_A, set_B, set_C and set_D. Additionally, my node labels are dynamic and each node has multiple labels, e.g. [Class, Level]; I used apoc.merge.node() for this purpose.

The following is the query I wrote:

LOAD CSV WITH HEADERS FROM "file:///ABCD.csv" AS IsKg
WITH IsKg

WITH * WHERE (IsKg.Class <> "" OR IsKg.Level <> "") AND IsKg.Relationship = "A"
CALL apoc.merge.node(
    [IsKg.Class, IsKg.Level],
    {title:coalesce(IsKg.Title,"Unknown")}
)
YIELD node as set_A

WITH * WHERE (IsKg.Class <> "" OR IsKg.Level <> "") AND IsKg.Relationship = "B"
CALL apoc.merge.node(
    [IsKg.Class, IsKg.Level],
    {title:coalesce(IsKg.Title,"Unknown")}
)
YIELD node as set_B
REMOVE set_B.noOp

The above query only created the set_A nodes, not the set_B nodes.

The query works fine without the IsKg.Relationship conditions. From this I deduce that once I add a condition on Relationship, the rows that fail it are dropped, so the full set of IsKg rows is no longer available to yield the next set of nodes. Hence set_B is not created. I need to create four such groups. The purpose of the groups is to create relationships between them based on further criteria from another column called Hierarchy.

Any guidance on how I could make the above query work is much appreciated.

Thank you

1 ACCEPTED SOLUTION

I have found another way of implementing what I require.

LOAD CSV WITH HEADERS FROM "file:///ABCD.csv" AS IsKg
WITH IsKg

WITH * WHERE (IsKg.Class <> "" OR IsKg.Level <> "") 
CALL apoc.merge.node(
    [IsKg.Class, IsKg.Level],
    {title:coalesce(IsKg.Title,"Unknown")}
)
YIELD node 
RETURN IsKg.Relationship AS IsKgRel, COLLECT(node) AS nodes

This returns one collection of nodes for each value of IsKg.Relationship.
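
Since the goal is to relate the groups to each other afterwards, the per-value collections can also be gathered into a single map so that all groups sit on one row. This is only a sketch: it assumes the Relationship column is never empty (map keys cannot be null) and uses apoc.map.fromPairs from the same APOC library. The names rel, groups and the set_* aliases are illustrative.

```cypher
LOAD CSV WITH HEADERS FROM "file:///ABCD.csv" AS IsKg
WITH IsKg WHERE IsKg.Class <> "" OR IsKg.Level <> ""
CALL apoc.merge.node(
    [IsKg.Class, IsKg.Level],
    {title: coalesce(IsKg.Title, "Unknown")}
)
YIELD node
// Implicit grouping: one row per distinct Relationship value
WITH IsKg.Relationship AS rel, collect(node) AS nodes
// Fold the [key, value] pairs into one map, e.g. {A: [...], B: [...]}
WITH apoc.map.fromPairs(collect([rel, nodes])) AS groups
RETURN groups["A"] AS set_A, groups["B"] AS set_B,
       groups["C"] AS set_C, groups["D"] AS set_D
```

A key that never occurs in the CSV simply comes back as null, so the final RETURN can be replaced by whatever MATCH or MERGE logic builds the relationships from the Hierarchy column.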

REPLIES

You are correct: the IsKg.Relationship = 'A' condition filters out all the 'B' records, so those rows are not available later in the query. You could fix this by wrapping each code block in a CALL { ... } subquery, but you will not be able to return anything from each subquery, as you would run into a similar issue.
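
A rough sketch of the subquery idea, assuming Neo4j 4.4 behaviour where a subquery without a RETURN (a unit subquery) leaves the outer rows untouched. The filter then only applies inside each subquery, so both the 'A' and the 'B' rows reach their own merge step. The REMOVE ... noOp line is carried over from the original query as a no-op updating clause, since a subquery cannot end on CALL ... YIELD.

```cypher
LOAD CSV WITH HEADERS FROM "file:///ABCD.csv" AS IsKg
WITH IsKg WHERE IsKg.Class <> "" OR IsKg.Level <> ""

CALL {
    WITH IsKg                                   // import the outer row
    WITH IsKg WHERE IsKg.Relationship = "A"     // filter only inside this subquery
    CALL apoc.merge.node(
        [IsKg.Class, IsKg.Level],
        {title: coalesce(IsKg.Title, "Unknown")}
    )
    YIELD node AS set_A
    REMOVE set_A.noOp                           // no-op update so the subquery can end here
}

CALL {
    WITH IsKg
    WITH IsKg WHERE IsKg.Relationship = "B"
    CALL apoc.merge.node(
        [IsKg.Class, IsKg.Level],
        {title: coalesce(IsKg.Title, "Unknown")}
    )
    YIELD node AS set_B
    REMOVE set_B.noOp
}
```

The nodes get created, but set_A and set_B are not visible outside their subqueries. Adding a RETURN to export them would drop every outer row for which the subquery produces nothing, which is the same filtering problem again.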

Another approach is to use the apoc.do procedures to conditionally create the nodes.

https://neo4j.com/labs/apoc/4.1/overview/apoc.do/
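
One way this might look with apoc.do.when, shown for groups A and B only. This is a sketch under a few assumptions: the inner query is passed as a string and reads the CSV row through a parameter, here given the hypothetical name row; an empty else-query is assumed to yield an empty value map, so the node lookup is null for non-matching rows; and coalesce guards against a missing Relationship value.

```cypher
LOAD CSV WITH HEADERS FROM "file:///ABCD.csv" AS IsKg
WITH IsKg WHERE IsKg.Class <> "" OR IsKg.Level <> ""

// Merge the node only when the row belongs to group A
CALL apoc.do.when(
    coalesce(IsKg.Relationship = "A", false),
    'CALL apoc.merge.node([$row.Class, $row.Level], {title: coalesce($row.Title, "Unknown")}) YIELD node RETURN node',
    '',
    {row: IsKg}
) YIELD value AS a

// Same for group B; every row still reaches this call
CALL apoc.do.when(
    coalesce(IsKg.Relationship = "B", false),
    'CALL apoc.merge.node([$row.Class, $row.Level], {title: coalesce($row.Title, "Unknown")}) YIELD node RETURN node',
    '',
    {row: IsKg}
) YIELD value AS b

// collect() skips nulls, so each list holds only its own group's nodes
RETURN collect(a.node) AS set_A, collect(b.node) AS set_B
```

Because no row is filtered out between the two calls, both groups are available in the final clause. The cost is one procedure call per group per row and query text kept in strings.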

Thank you for your reply. I will try this and get back to you.
