Neo4j

bill_dickenson · ‎11-07-2020

This problem is defeating me, but it seems so simple that I must be looking at it backwards.

Graph Now: Drivers start at the top and drive down a directed graph [x:DRIVE] from node to node. Along that route they have two types of stops, "Delivery" and "Rest". Routes can be bidirectional (not really important here), so when I look at the route (Graph Now), I get a fairly simple graph. All good so far.

But one of the analytics needs a graph like End State, which shows only the network of nodes that are "Rest" stops [x:REST]. Rather than compute this on the fly (the actual app is big), I want to build a separate set of REST relationships.

But I cannot figure out how to build the one in the middle.

I can't simply connect all the rest stops to each other, because I would lose the flow and the connections would no longer follow the route. I have to follow the path of DRIVE. So in essence, I want a simplified version of DRIVE that contains only Rest stops, with the relationships between them created as an overlay.

bill_dickenson · ‎11-09-2020

merge (a:Delivery {inode:0,type:'Start',load:'Truck',compileunit:'route'});
merge (a:Delivery {inode:1, type:'Delivery', load:'pallet', compileunit:'route'});
merge (a:Delivery {inode:2, type:'Delivery' , load:'pallet', compileunit:'route'});
merge (a:Delivery {inode:3, type:'Delivery' , load:'pallet', compileunit:'route'});
match (a:Delivery {inode:0}), (b:Delivery {inode:1}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:1}), (b:Delivery {inode:2}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:1}), (b:Delivery {inode:3}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:4, type:'Rest' , load:'coffee', compileunit:'route'});
match (a:Delivery {inode:2}), (b:Delivery {inode:4}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:5, type:'Delivery', load:'pallet', compileunit:'route'});
merge (a:Delivery {inode:6, type:'Delivery' , load:'pallet', compileunit:'route'});
merge (a:Delivery {inode:7, type:'Rest', load:'coffee', compileunit:'route'});
merge (a:Delivery {inode:8, type:'Rest' , load:'lunch', compileunit:'route'});
match (a:Delivery {inode:4}), (b:Delivery {inode:5}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:4}), (b:Delivery {inode:6}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:6}), (b:Delivery {inode:7}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:6}), (b:Delivery {inode:8}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:9, type:'Rest' , load:'pallet', compileunit:'route'});
match (a:Delivery {inode:3}), (b:Delivery {inode:9}) merge (a)-[x:drives]-(b);

ameyasoft · ‎11-10-2020

I used your Cypher script and added the [:rest] relationships, with the aim of getting to End State.

merge (a1:Delivery {inode:0,type:'Start',load:'Truck',compileunit:'route'})
merge (a2:Delivery {inode:1, type:'Delivery', load:'pallet', compileunit:'route'})
merge (a3:Delivery {inode:2, type:'Delivery' , load:'pallet', compileunit:'route'})
merge (a4:Delivery {inode:3, type:'Delivery' , load:'pallet', compileunit:'route'})

merge (a1)-[:drives]->(a2)
merge (a2)-[:drives]->(a3)
merge (a2)-[:drives]->(a4)

merge (a5:Delivery {inode:4, type:'Rest' , load:'coffee', compileunit:'route'})
merge (a3)-[:drives]->(a5)
merge (a1)-[:rest]->(a5)

merge (a6:Delivery {inode:5, type:'Delivery', load:'pallet', compileunit:'route'})
merge (a7:Delivery {inode:6, type:'Delivery' , load:'pallet', compileunit:'route'})
merge (a8:Delivery {inode:7, type:'Rest', load:'coffee', compileunit:'route'})
merge (a9:Delivery {inode:8, type:'Rest' , load:'lunch', compileunit:'route'})

merge (a5)-[:drives]-(a6)
merge (a5)-[:drives]-(a7)
merge (a7)-[:drives]-(a8)
merge (a8)-[:drives]-(a9)
merge (a5)-[:rest]->(a8)
merge (a5)-[:rest]->(a9)

merge (a10:Delivery {inode:9, type:'Rest' , load:'pallet', compileunit:'route'})
merge (a4)-[:drives]-(a10)
merge (a1)-[:rest]->(a10)

Added 10 labels, created 10 nodes, set 40 properties, created 13 relationships, completed after 175 ms

Result:

match (a:Delivery) where a.type = 'Start'
match (b:Delivery) where b.type = 'Rest'
match (a)-[:rest]->(b)
optional match (b)-[:rest]->(c) where c.type = 'Rest'
return a, b, c

Result:

The result above includes the connection you marked 'not valid' (shown in red). This is a consequence of Neo4j's architecture, and we cannot avoid it!

One solution I can think of is using virtual nodes. Here is the Cypher script:

match (a:Delivery) where a.type = 'Start'
match (b:Delivery) where b.type = 'Rest'
match (a)-[:rest]->(b)
with a, b, collect(b) as b1
optional match (b)-[:rest]->(c)
with a, b1, collect(c) + b1 as c1
unwind c1 as c2
with apoc.create.virtual.fromNode(a, ['type']) as d1, c2
WITH d1, c2.type as t2, head(labels(c2)) AS l2, 'rest' AS rel_type
CALL apoc.create.vNode([l2],{name:l2, type:t2}) yield node as g
CALL apoc.create.vRelationship(d1,rel_type,{},g) yield rel
RETURN *;

Result:

bill_dickenson · ‎11-11-2020

Interesting. Well at least I wasn't completely stupid. This was very helpful. One step may still be an issue.

Talking to them a few minutes ago, we agreed that while the one at the bottom is close, this one is enough. That extra drives relationship between 8 and 7 is annoying, but the representation is much closer to what we needed. So thank you, that's close enough. As long as 0 doesn't connect directly to 7 or 8, that's perfect.

But you also did something I can't do. I cannot define those relationships from the data as it's coming in. I have to build the REST connections after all the data is in.

So this line: merge (a1)-[:rest]->(a5)

I can't do that manually. That has to be constructed from the data. So picture this from the standpoint of not having these lines.

merge (a1)-[:rest]->(a5)
merge (a5)-[:rest]->(a8)
merge (a5)-[:rest]->(a9)
merge (a1)-[:rest]->(a10)

So step 1 is to get those built; step 2 is what you have nicely laid out.

Any suggestions on how to build those 4 statements/connections from the data I sent?

ameyasoft · ‎11-11-2020

Using your Cypher scripts, I recreated the scenario without adding the :rest relationships.

Here is my shot at creating the :rest relationships with this small set of data. I did this in two steps.

Step: 1
match (c:Delivery), (d:Delivery)
match (c)-[*..3]-(d)
where c.type = 'Start' and d.type = 'Rest'
with c, d
merge (c)-[:rest]->(d)
return c, d

Step: 2

match (c:Delivery) where c.type = 'Start'
match(c)-[:rest]-(d)-[]-(e)-[]-(f)-[]-(g)
where d.type = 'Rest' and e.type = 'Delivery' and f.type = 'Rest' and g.type = 'Rest'
with c, d, e, f, g
merge (d)-[:rest]->(f)
merge (d)-[:rest]->(g)
return c, d, e, f, g

Finally you get your result.
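
A quick way to confirm the outcome is to list the :rest pairs by inode and compare them with the four connections requested earlier in the thread (0 to 4, 4 to 7, 4 to 8, and 0 to 9). The query below is a minimal sketch; the column aliases fromStop and toStop are made up for illustration.

```cypher
// List every :rest relationship as a pair of inode values.
MATCH (a:Delivery)-[:rest]->(b:Delivery)
RETURN a.inode AS fromStop, b.inode AS toStop
ORDER BY fromStop, toStop;
```

Note that the hop limit in Step 1 and the fixed-length pattern in Step 2 are tied to the shape of this small sample, so they would likely need adjusting for other routes.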

bill_dickenson · ‎11-11-2020

Thank you! This should be enough. Another great solution. Thanks again.

andrew_bowman · ‎11-11-2020

APOC Procedures can help here, but you'll need to first alter the graph so that nodes of type 'Rest' get an additional :Rest label:

CALL apoc.periodic.iterate("MATCH (d:Delivery) WHERE d.type = 'Rest' RETURN d", "SET d:Rest", {}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages

Now that we have that, we can make use of the APOC path finding procedures, which let us specify a traversal that stops expanding at the first node of a given label. From each :Rest node (as well as the starting node, which has no incoming :drives relationships), we traverse outgoing :drives relationships and return only the first :Rest node encountered on each path, since traversal does not continue past :Rest nodes. Then we create our :rest relationship between them:

MATCH (start:Delivery)
WHERE start:Rest OR NOT ()-[:drives]->(start)
CALL apoc.path.subgraphNodes(start, {relationshipFilter:'drives>', labelFilter:'/Rest'}) YIELD node as restStop
CREATE (start)-[:rest]->(restStop)

In the labelFilter portion, prefixing a label name with / means it's a termination filter, so expansion will stop at :Rest nodes (expanding no further) and return those nodes. This prevents us from accidentally creating :rest relationships that bypass :Rest nodes.
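
For comparison, roughly the same idea can be sketched in plain Cypher without APOC. This is an illustration under assumptions, not a drop-in replacement: it assumes every :drives relationship points in the direction of travel, it identifies stops by the type property instead of the :Rest label, and it uses MERGE so that several paths to the same stop do not create duplicate relationships.

```cypher
// Sketch only: plain-Cypher variant of the "first Rest stop per path" idea.
MATCH (start:Delivery)
WHERE start.type IN ['Start', 'Rest']
MATCH path = (start)-[:drives*]->(stop:Delivery)
WHERE stop.type = 'Rest'
  // Reject paths that pass through another rest stop before reaching `stop`.
  AND none(n IN nodes(path)[1..-1] WHERE n.type = 'Rest')
MERGE (start)-[:rest]->(stop);
```

Unlike the termination filter, this pattern describes the full variable-length path and then filters it, so on a large graph it may well be slower than the APOC traversal.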

bill_dickenson · ‎11-11-2020

Ahhhhhhhh that makes sense. Thank you both for some excellent answers.

bill_dickenson · ‎11-12-2020

Both solutions work, but if you will forgive a follow-up question:

We have no problem changing the "start" node to have a :Rest label. Does that simplify the query at all?

andrew_bowman · ‎11-12-2020

Yes, that means we can just use WHERE start:Rest as the filter for the MATCH.
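
As a sketch of what that looks like, assuming the root is the node with type 'Start' as in the sample data, the root gets the extra label first and the earlier query then needs only the label check:

```cypher
// Assumption: the root node is identified by type = 'Start'.
MATCH (root:Delivery {type: 'Start'})
SET root:Rest;

// With the root labelled :Rest, the start filter reduces to a label check.
MATCH (start:Delivery)
WHERE start:Rest
CALL apoc.path.subgraphNodes(start, {relationshipFilter:'drives>', labelFilter:'/Rest'}) YIELD node AS restStop
CREATE (start)-[:rest]->(restStop);
```

Swapping CREATE for MERGE would make the second statement safe to re-run without creating duplicate :rest relationships.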

bill_dickenson · ‎11-12-2020

Makes sense. It's all working now. It's not quick, but it's fast enough; 5m nodes is not trivial. I made the change to the root node because I always know it's in the chain. Thank you both!
