Neo4j

lipchean · ‎02-07-2022

Hi,

The CSV shown above is the result of a manual manipulation from the original csv.

I then proceed to enter the following cypher.
LOAD CSV WITH HEADERS FROM 'file:///Data_3.csv' AS row
MERGE (m:Module {ID: row.ID, RefBy1:COALESCE(row.RefBy1,"NA") , RefBy2: COALESCE(row.RefBy2,"NA")})
return m

Followed by
MATCH (child: Module)
MATCH (parent: Module) where parent.ID = child.RefBy1
MERGE (child)-[:RefBy]->(parent)
RETURN child, parent

Followed by
MATCH (child: Module)
MATCH (parent: Module) where parent.ID = child.RefBy2
MERGE (child)-[:RefBy]->(parent)
RETURN child, parent

The original CSV file is

Do you have any suggestion how i can create a single script to create the same model i showed above. I have tried using UNWIND, as shown below but nothing gets created
LOAD CSV WITH HEADERS FROM 'file:///Data_1.csv' AS row
WITH row.ID AS impID, SPLIT(row.RefBy, ",") AS multiRef
UNWIND multiRef as impRef
MATCH (LI:LineItem {LIName: impID})
MATCH (RB:ReferenceBy {RBName:TRIM(impRef)})
MERGE (LI)-[rel:RefBy]->(RB)
return LI, RB

Any suggestions? Thanks in advance.

Cheers,
LC

glilienfield · ‎02-08-2022

Try this. It creates the nodes first, then processes the file again to create the relationships.

load CSV with headers FROM 'file:///Data_1.csv' AS row
merge(n:Module{ID:row.ID})
UNION ALL
load CSV with headers FROM 'file:///Data_1.csv' AS row
with row
where row.RefBy is not null
match(n:Module{ID:row.ID})
with n, split(row.RefBy, ",") as references
unwind references as reference
match(m:Module{ID:reference})
merge(n)-[:RefBy]->(m)

Test Data:

Result:

View solution in original post

lipchean · ‎02-08-2022

The model should look like this:

glilienfield · ‎02-08-2022

Try this. It creates the nodes first, then processes the file again to create the relationships.

load CSV with headers FROM 'file:///Data_1.csv' AS row
merge(n:Module{ID:row.ID})
UNION ALL
load CSV with headers FROM 'file:///Data_1.csv' AS row
with row
where row.RefBy is not null
match(n:Module{ID:row.ID})
with n, split(row.RefBy, ",") as references
unwind references as reference
match(m:Module{ID:reference})
merge(n)-[:RefBy]->(m)

Test Data:

Result:

lipchean · ‎02-08-2022

Hi @glilienfield

Your solution works. I do have a follow up question, i.e. why is UNION ALL required? Is there a way where both row and reference are created from a single LOAD?

Thanks

lipchean · ‎03-01-2022

hi @glilienfield

To make things even more complicated, how would you utilize the 'Value' column, such that the script will return the path with the largest aggregated value (should be A > C > D, because A = 2, C = 10, D = 3 for a total of 15)

Thanks,
LC

glilienfield · ‎02-09-2022

This seems to work too.

load CSV with headers FROM 'file:///Data_1.csv' AS row
merge(n:Module{ID:row.ID})
with n, row
call {
with n, row
with n, split(row.RefBy, ",") as references
unwind references as reference
merge(m:Module{ID:reference})
merge(n)-[:RefBy]->(m)
}

lipchean · ‎02-10-2022

Thanks @glilienfield
I'll try this too

glilienfield · ‎03-01-2022

First update the query to consume the value attribute. Store it as an attribute of the node.

load CSV with headers FROM 'file:///Data_1.csv' AS row
merge(n:Module{ID:row.ID, value:toInteger(row.value)})
with n, row
call {
with n, row
with n, split(row.RefBy, ",") as references
unwind references as reference
merge(m:Module{ID:reference})
merge(n)-[:RefBy]->(m)
}

You can then sum up the values along each path using the 'reduce' operation on the collection of nodes along each path, calculating the path's rank metric.

match p=(n:Module)-[:RefBy*]->(m:Module)
with nodes(p) as nodes
return [i in nodes | i.ID] as pathNodes, reduce(s=0, i in nodes | s + i.value) as rank
order by rank desc
limit 1

Note, the query does find intermediate paths, i.e. A->C. You can add a constraint to remove these if they are not needed and you only want full paths. I did not do so, since they would never have the largest rank since your values are all positive. Let me know if you may have negative values and want to eliminate these as a potential result.

You can run the queries separately or concatenate them to achieve the result in one operation.

lipchean · ‎03-01-2022

Thanks for your very fast response @glilienfield , i'll try it out.

Neo4j

Multiple Nodes in 1 column as target