Neo4j

hagen_ivar · ‎01-31-2019

Hi!

I'm working on a rule solving engine. The problem is that when I run it first the results needs to be fed into the engine again and potentially again and again. Is it possible to solve this with just one run?

This is my data.

CREATE (a:XTEST {id:["A"], type:"start"})
CREATE (b:XTEST {id:["B"], type:"end"})
CREATE (a)-[:XGIVE]->(b)
CREATE (c:XTEST {id:["C"], type:"start"})
CREATE (d:XTEST {id:["D"], type:"start"})
CREATE (e:XTEST {id:["E"], type:"end"})
CREATE (c)-[:XGIVE]->(d)-[:XGIVE]->(e)

CREATE (be:XTEST {id:["B","E"], type:"start"})
CREATE (f:XTEST {id:["F"], type:"end"})
CREATE (be)-[:XGIVE]->(f)

Looks like this

So when I run with A and C as input I will find nodes B, D and E.

WITH ["A","C"] AS input
MATCH p=(start:XTEST)-[r:XGIVE*]->(end:XTEST) 
WHERE ALL(p IN start.id WHERE p IN input)
RETURN p

But I also would like to get F as well since I now have found both B and E.
I can solve it by do the query again by now using A,B,C,D,E as input and do this until no new nodes are found. But it would be better to get all A,B,C,D,E,F with A,C as input with one query.

Anyone got any ideas on how to solve this?

Regards Ivar

paul_thomas · ‎01-31-2019

Note I changed some of the property and query aliases to make it less confusing for me.
Something like this is doing what you want maybe ...

CREATE (a:XTEST {pklist:["A"], nodetype:"start"})
CREATE (b:XTEST {pklist:["B"], nodetype:"end"})
CREATE (a)-[:XGIVE]->(b)

CREATE (c:XTEST {pklist:["C"], nodetype:"start"})
CREATE (d:XTEST {pklist:["D"], nodetype:"start"})
CREATE (e:XTEST {pklist:["E"], nodetype:"end"})
CREATE (c)-[:XGIVE]->(d)-[:XGIVE]->(e)

CREATE (be:XTEST {pklist:["B","E"], nodetype:"start"})
CREATE (f:XTEST {pklist:["F"], nodetype:"end"})
CREATE (be)-[:XGIVE]->(f)

WITH ["A","C"] AS input
MATCH pathx = (p1Start:XTEST)-[r:XGIVE*]->(p1End:XTEST)
WHERE ALL(n1 IN p1Start.pklist WHERE n1 IN input)
with collect(distinct p1End.pklist) as pEnds

unwind pEnds as p1
unwind pEnds as p2
with p1 + p2 as pNew
where p1 <> p2
MATCH pathy = (p2Start:XTEST)-[r:XGIVE*]->(p2End:XTEST)
WHERE p2Start.pklist = pNew
return p2End

hagen_ivar · ‎01-31-2019

Thanks for fast reply. This community is great.

Your solution works for the scenario I listed. But if I have further rule chain it will not solve that.

If I add this

CREATE (fg:XTEST {pklist:["F","G"], nodetype:"start"})
CREATE (h:XTEST {pklist:["H"], nodetype:"end"})
CREATE (fg)-[:XGIVE]->(h)

So with this as input

WITH ["A","C","G"] AS input

I would like H to be returned as well. The problem is that I will need to run the query as many times as the depth of dependencies. But this maybe is the only way to solve it?
I'm also thinking of creating a user defined function that can do this in a loop until all paths have been explored with the new results (endnodes) are collected while traversing.
Perfomance is also an important issue since I'll be going through tens of tousands of nodes.

paul_thomas · ‎01-31-2019

Hmm, maybe to avoid having the re-run queries you could change how your data is modeled and have the rules better connected via eg an extra relationship type which would link a pair of end nodes to the follow on start node?

Graphs can sometimes be a bit of a chicken v egg scenario where something you need to model the data while carefully thinking about the types of queries you want to run, rather than just thinking about the raw data ...

hagen_ivar · ‎01-31-2019

Thanks.

Yes I tried to think of this as a solution:

So having connections from B and E to [B,E] node and counting the number of incoming paths when running query. So if run with input A and C. The paths C-D-E-[B,E]-F and A-B-[B,E]-F with a filter that two paths needs to be present for returning F. But if only A as input then [B,E] node will only have one path through it which will not return F. Complicated and hard to implement maybe?

I will continue to redo the model maybe with other relationship types as you suggested.

paul_thomas · ‎02-01-2019

Keep in mind, matches like this could be very useful for you ...

WITH ["A","C"] AS input
MATCH
(p1Start:XTEST)-[r:XGIVE*]->(p1End:XTEST)-->(pFork:XTEST)<--(p2End:XTEST)<-[r:XGIVE*]-(p2Start:XTEST),
(pFork)-[r:XGIVE*]->(p3End:XTEST)
WHERE
...

Neo4j

Path conundrum for rule solving engine