Neo4j

dvoneschwege · ‎03-05-2022

I need to iteratively create relationships between nodes based on the nodes' properties, which change during the loop. I am trying to go about it like this:

MATCH (A), (B)
    WHERE NOT (A)-[:BOND]-(B)
    AND A<>B
FOREACH (a in A |
    FOREACH (b in B |
        **IF a.slots > 0 AND b.slots > 0 AND a.x + b.x = y** <<< dont know how to do this
            CREATE (a)-[:BOND]->(b)
            SET a.slots = a.slots - 1
            SET b.slots = b.slots - 1

In other words, create a relationship if both nodes have 'slots' available, and some other condition is met (a.x + b.x = y).

The part I am struggling with is that relationships created earlier in the loop influence which relationships can be created later in the loop, which is why I am trying to use FOREACH; otherwise I would just match and create the relationships in one pass.

Thanks in advance for the help!

glilienfield · ‎03-05-2022

Try putting a WITH clause after the second FOREACH clause, passing a and b. Then follow it with a WHERE clause containing your condition.

What is ‘y’ in your expression?

glilienfield · ‎03-05-2022

As a note, you should not need the FOREACH clauses. The result of your MATCH is one row for each combination of a and b. The following should work. What is the value of 'y' in your expression? Also, do you want relationships created in both directions, i.e. a->b and b->a? If so, replace the condition I added, id(a)<id(b), with your original condition of a<>b, or change mine to id(a)<>id(b). The id(a)<id(b) condition ensures the result set will not contain both (a,b) and (b,a) for specific nodes a and b.

Maybe my lack of understanding of your 'y' variable's role has changed your original intent.

MATCH (a)
MATCH (b)
WHERE NOT EXISTS ( (a)-[:BOND]-(b) )
AND id(a) < id(b)
AND a.slots > 0 AND b.slots > 0 AND a.x + b.x = y
CREATE (a)-[:BOND]->(b)
SET a.slots = a.slots - 1
SET b.slots = b.slots - 1

dvoneschwege · ‎03-05-2022

Hi Gary,

thank you so much for the quick reply!

Firstly, I realized that the condition a.x + b.x = y is actually unnecessary for explaining my problem, so please ignore it.

Secondly, the problem remains - I need to create relationships between nodes a and b as long as they have 'slots' available for connecting a relationship. These slots get used up as relationships are added, and once a node has zero slots available, no additional relationships may be added.

However, running the code as you suggested results in slots being decremented far beyond 0, the result being that every node gets connected to every other node.

I hope my explanation makes sense, and thanks once again for the help.

glilienfield · ‎03-05-2022

I assume the a and b nodes exist since you are using MATCH up front. As such, you should use MERGE instead of CREATE for making the relationship. The CREATE will keep creating new relationships. I missed that too.

glilienfield · ‎03-05-2022

That wasn't it. I am also getting nodes with negative slots values after executing the query.

dvoneschwege · ‎03-06-2022

I agree. The MATCH returns a fixed set of rows, which naturally isn't updated as the properties change, so it might be necessary to match again every time a property changes. However, this seems extremely inefficient, hence my question.

glilienfield · ‎03-06-2022

I looked into the behavior more. What looks to be happening is that all the matches are done up front, and then each matched pair of nodes is processed, so the slots > 0 conditions are no longer evaluated past the initial match. I think this is demonstrated by the query plan, which shows an Eager operator after the match and before the processing.
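To make the effect concrete, here is a minimal in-memory sketch in Python of what seems to be going on. It is not Neo4j code: the function name `bond_from_snapshot` and the node ids are made up for illustration, and a plain dict stands in for the `slots` property. The pairs are filtered once against the initial slot counts and then every pair is processed without rechecking.

```python
from itertools import combinations


def bond_from_snapshot(slots):
    """Filter pairs once up front, then process all of them blindly.

    `slots` maps a (hypothetical) node id to its free slot count.
    """
    working = dict(slots)
    # "MATCH" phase: every unordered pair whose endpoints have free
    # slots at match time. This list is never revisited afterwards.
    matched = [
        (a, b)
        for a, b in combinations(sorted(working), 2)
        if working[a] > 0 and working[b] > 0
    ]
    bonds = []
    # "CREATE / SET" phase: runs once per matched row, with no recheck.
    for a, b in matched:
        bonds.append((a, b))
        working[a] -= 1
        working[b] -= 1
    return bonds, working


bonds, remaining = bond_from_snapshot({"n1": 1, "n2": 1, "n3": 1, "n4": 1})
print(len(bonds), remaining)
```

With four nodes of one slot each, all six pairs pass the up-front filter, so every node is bonded three times and ends at -2 slots, which matches the "every node gets connected to every other node" symptom described earlier in the thread.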

I tried your idea of a FOREACH loop, but Cypher didn't allow a MATCH inside a FOREACH loop. I then tried a CALL instead of a FOREACH loop. The results were better, as only the first few nodes got 'over slotted', but that still doesn't work.

I am thinking this requirement is outside of Cypher's capabilities. This is an iterative algorithm where you need to check your conditions on each iteration to determine whether you are done. This would be a good application for your own procedure.

I would be interested in knowing if you solve this.

dvoneschwege · ‎03-06-2022

Hi Gary, thanks so much for all your effort - that makes sense. I think I might use the Python API here, handling all of the iterative logic externally in Python and using Cypher only to query.
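As a rough idea of what the external logic could look like, here is a minimal sketch that plans the bonds purely in memory. It assumes the slot counts and existing bonds have already been fetched into plain Python structures; `plan_bonds` and its arguments are hypothetical names, and writing the planned bonds back to the database is left out.

```python
def plan_bonds(slots, existing=()):
    """Greedily pick new bonds, rechecking free slots before each one.

    `slots` maps node id -> free slot count; `existing` is an iterable
    of already-bonded (a, b) pairs. Both are assumed to have been
    fetched beforehand.
    """
    remaining = dict(slots)
    bonded = {frozenset(pair) for pair in existing}
    new_bonds = []
    ids = sorted(remaining)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if remaining[a] <= 0:
                break  # a is full, move on to the next a
            if remaining[b] > 0 and frozenset((a, b)) not in bonded:
                new_bonds.append((a, b))
                bonded.add(frozenset((a, b)))
                remaining[a] -= 1
                remaining[b] -= 1
    return new_bonds, remaining


new_bonds, remaining = plan_bonds({1: 2, 2: 1, 3: 1, 4: 0}, existing=[(1, 2)])
print(new_bonds, remaining)
```

Because the slot counts are checked right before each bond is added, no node can go below zero. The pairing order here is simply sorted id order, which is an arbitrary choice for the sketch.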

Thanks again, I learned quite a bit in this process.

Regards,
Daniel

bennu_neo · ‎03-07-2022

Hi @dvoneschwege !

As @glilienfield said, this kind of logic can be better handled in a different layer (the application), where you have much more flexibility. Anyway, can you try something like:

CALL apoc.periodic.commit(
  "MATCH (a), (b)
    WHERE NOT EXISTS ( (a)-[:BOND]-(b) )
    AND a <> b
    AND a.slots > 0 AND b.slots > 0 AND a.x + b.x = y
    WITH a, b limit $limit
    CREATE (a)-[:BOND]->(b)
    SET a.slots = a.slots - 1
    SET b.slots = b.slots - 1
    RETURN count(*)",
  {limit:1});
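The reason this works is that apoc.periodic.commit reruns the statement in separate transactions until it returns 0, so the slot conditions are re-evaluated against fresh data on every run. A minimal in-memory sketch of that control flow in Python follows; `commit_until_zero` and `make_bond_one_pair` are hypothetical names, a dict stands in for the graph, and the `max_iterations` guard is an addition of this sketch.

```python
def commit_until_zero(statement, max_iterations=10_000):
    """Call `statement()` repeatedly until it reports 0 updates."""
    total = 0
    for _ in range(max_iterations):
        updated = statement()
        if updated == 0:
            return total
        total += updated
    raise RuntimeError("statement did not converge to 0")


def make_bond_one_pair(slots, bonds):
    """Build a statement that bonds at most one eligible pair per call."""
    def statement():
        for a in sorted(slots):
            for b in sorted(slots):
                pair = frozenset((a, b))
                if a != b and slots[a] > 0 and slots[b] > 0 and pair not in bonds:
                    bonds.add(pair)
                    slots[a] -= 1
                    slots[b] -= 1
                    return 1  # like RETURN count(*) with LIMIT 1
        return 0  # nothing eligible left, so the loop stops
    return statement


slots = {"a": 2, "b": 1, "c": 1}
bonds = set()
total = commit_until_zero(make_bond_one_pair(slots, bonds))
print(total, slots)
```

Each call scans for a pair again from scratch, which mirrors the cost of recomputing the match on every iteration.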

Bennu

Oh, y’all wanted a twist, ey?

dvoneschwege · ‎03-07-2022

Hi Bennu, thanks so much, you guys are great.

Regards,
Daniel

glilienfield · ‎03-07-2022

@bennu.neo this is a super cool idea. We can use this to implement iterative algorithms, as long as the conditions cause the result set to converge to zero.

I think you can refactor the query based on this to be more efficient. In the above, the Cartesian product is computed and then all but one ordered pair is eliminated, and this product is recomputed on each iteration.

In the below, for each iteration, we find one node 'a' that has available slots. We then find all candidate 'b' nodes and filter out all but 'a.slots' of them, since this is the number of available slots for relating a 'b' node to the given 'a' node. I tried using the LIMIT clause instead of the collect/unwind approach, but Cypher will not allow a LIMIT based on a variable, such as a.slots.

This runs noticeably faster.

CALL apoc.periodic.commit(
" MATCH (a:Test where a.slots > 0)
WITH a
LIMIT 1
MATCH (b:Test where b.slots > 0)
WHERE a<>b
and NOT EXISTS ( (a)-[:BOND]-(b) )
WITH a, collect(b)[0..a.slots] as b_nodes
UNWIND b_nodes as b
CREATE (a)-[:BOND]->(b)
SET a.slots = a.slots - 1
SET b.slots = b.slots - 1
RETURN count(*)"
);
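For anyone who wants to reason about this version outside the database, here is a minimal in-memory sketch in Python of one iteration plus the surrounding loop. The names `bond_step` and `run` are hypothetical, a dict stands in for the nodes, and "first node in sorted order" is just this sketch's stand-in for whatever node LIMIT 1 happens to return.

```python
def bond_step(slots, bonds):
    """One iteration: fill one node's free slots from its candidates.

    Returns the number of bonds created, playing the role of count(*).
    """
    # Pick a single node that still has free slots (the LIMIT 1 step).
    a = next((n for n in sorted(slots) if slots[n] > 0), None)
    if a is None:
        return 0
    candidates = [
        b for b in sorted(slots)
        if b != a and slots[b] > 0 and frozenset((a, b)) not in bonds
    ]
    # Keep at most slots[a] candidates, like collect(b)[0..a.slots].
    chosen = candidates[:slots[a]]
    for b in chosen:
        bonds.add(frozenset((a, b)))
        slots[a] -= 1
        slots[b] -= 1
    return len(chosen)


def run(initial_slots):
    """Repeat bond_step until an iteration creates nothing."""
    slots = dict(initial_slots)
    bonds = set()
    while bond_step(slots, bonds) > 0:
        pass
    return bonds, slots


bonds, slots = run({"a": 2, "b": 1, "c": 1, "d": 1})
print(sorted(tuple(sorted(p)) for p in bonds), slots)
```

One thing this sketch makes visible: the loop stops as soon as the chosen node has no candidates, even if some other pair of nodes could still be bonded. Whether that matters depends on the data.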

dvoneschwege · ‎03-07-2022

These work great. For my specific use case, I therefore summarise:

CALL apoc.periodic.commit(
    "
    MATCH (a)
        WHERE a.slots > 0
    WITH a LIMIT 1
        MATCH (b)
            WHERE b.slots > 0
            AND NOT EXISTS ( (a)-[:BOND]-(b) )
            AND id(a) <> id(b)
    WITH a, collect(b)[0..a.slots] as b_nodes
        UNWIND b_nodes as b
            CREATE (a)-[:BOND]->(b)
            SET a.slots = a.slots - 1
            SET b.slots = b.slots - 1
    RETURN count(*)
    "
);

Conditional iterative creation of relationships