01-03-2020 06:37 PM
I have a bunch of connected activities (nodes) in CSV format, one file containing the activities, and another the connections. I have no problem creating the nodes, but I just cannot get the links created without duplicates.
Activities/Nodes
ResourceName             Min  Mode  Max
Case_Start               0    0.3   9.8
Create_Delivery          0    0     0
Create_Quotation         0    0     0
Create_Sales_Order_Item  0    0     0
Cypher to create the nodes:
LOAD CSV WITH HEADERS FROM "file:///ACTIVITIES_O2C.csv" AS row
CREATE (a:Activity {Name:row.ResourceName, Min: toFloat(row.Min), Mode: toFloat(row.Mode), Max: toFloat(row.Max), Cost: toFloat(row.CostRate)})
Connections/Links
ConnectorName                        StartingActivity  EndingActivity           LinkProbability  Min   Mode  Max
Case_Start::Create_Sales_Order_Item  Case_Start        Create_Sales_Order_Item  70.38            0.00  0.00  0.00
Case_Start::Create_Delivery          Case_Start        Create_Delivery          24.77            0.00  0.00  0.00
Case_Start::Create_Quotation         Case_Start        Create_Quotation         4.84             0.00  0.00  0.00
Cypher to create the links:
LOAD CSV WITH HEADERS FROM "file:///CONNECTIONS_O2C.csv" AS row
MATCH (lft { Name: row.StartingActivity })
MATCH (rgt { Name: row.EndingActivity })
MERGE (lft)-[:FEEDS { Likelihood: toFloat(row.LinkProbability), Min: toFloat(row.Min), Mode: toFloat(row.Mode), Max: toFloat(row.Max) }]->(rgt)
The issue is that I get links created between all the nodes and even some circular links.
I know I'm doing something incorrectly. Just need help with the correct Cypher.
01-03-2020 07:33 PM
Try this for creating the links:
LOAD CSV WITH HEADERS FROM "file:///CONNECTIONS_O2C.csv" AS row
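// Look up both endpoints by name, restricted to Activity nodes
MATCH (lft:Activity { Name: row.StartingActivity })
MATCH (rgt:Activity { Name: row.EndingActivity })
// Merge on the endpoints and relationship type only, then set the properties
MERGE (lft)-[r:FEEDS]->(rgt)
SET r.Likelihood = toFloat(row.LinkProbability),
    r.Min = toFloat(row.Min),
    r.Mode = toFloat(row.Mode),
    r.Max = toFloat(row.Max)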
The MERGE clause matches on the entire pattern you give it, including every property on the relationship. If no existing relationship has exactly those property values, it creates a new one. I suspect that your connection file contains multiple entries that map the same start node to the same end node with different values, so each of them adds another relationship. The Cypher I wrote merges on the two nodes and the relationship type only: it looks up whether there's already a FEEDS relationship between the two nodes and, if there is, updates that relationship instead of creating a second one.
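To check whether any pair of activities actually ended up with more than one FEEDS relationship, a query along these lines may help. It is only a sketch: it assumes the :Activity label and FEEDS type used in the statements above, and the aliases (startActivity, endActivity, relCount) are illustrative names, not anything from your data.

```cypher
// List every pair of activities joined by more than one FEEDS relationship
MATCH (a:Activity)-[r:FEEDS]->(b:Activity)
WITH a, b, count(r) AS relCount
WHERE relCount > 1
RETURN a.Name AS startActivity, b.Name AS endActivity, relCount
ORDER BY relCount DESC
```

If this returns no rows, there are no duplicate links in the graph, and what looks like extra links may simply be the visualization showing relationships between all the nodes currently displayed.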
As for the circular paths, I would validate the source data again. Given that you're seeing both multiple relationships between nodes and circular paths, I would double-check how the CSVs are being generated.
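One way to validate the connections file without touching the graph is to aggregate its rows directly. The following is a rough sketch, assuming the file name and column headers shown in the question; rowCount is just an illustrative alias.

```cypher
// Report start/end pairs that appear more than once in the file,
// plus any row where an activity feeds itself
LOAD CSV WITH HEADERS FROM "file:///CONNECTIONS_O2C.csv" AS row
WITH row.StartingActivity AS startActivity,
     row.EndingActivity AS endActivity,
     count(*) AS rowCount
WHERE rowCount > 1 OR startActivity = endActivity
RETURN startActivity, endActivity, rowCount
ORDER BY rowCount DESC
```

An empty result would suggest the file contains neither repeated pairs nor self-referencing rows.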
01-03-2020 08:19 PM
Thanks Mike, but that doesn't solve the problem. And, no, I do not have duplicates in the data file. If after loading the data I run the Cypher statement
MATCH p=(:Activity {Name: 'Case_Start'})-[r:FEEDS]->() RETURN p
I get the following result
01-03-2020 08:32 PM
Actually, turns out I was wrong all this time. The graph created is perfectly correct. Thanks for the challenge Mike.