Neo4j

saminahbab0 · ‎08-10-2020

I have this as part of a query:

 OPTIONAL MATCH (chunk: CHUNK) 
WHERE (chunk)-[:CONTAINS_TOKEN]->(token1) AND 
(chunk)-[:CONTAINS_TOKEN]->(token2)

The above specifies two CONTAINS_TOKEN relationships from the chunk.

Often the chunk node returned has more than two relationships: it is related to token1, token2, and also other tokens.

How can I specify that the chunk only has CONTAINS_TOKEN relations to the nodes specified and no other nodes?

tony_chiboucas · ‎08-11-2020

Let's simplify the Cypher a bit first...

MATCH (token1)<-[:CONTAINS_TOKEN]-(chunk: CHUNK)-[:CONTAINS_TOKEN]->(token2)

... that'll do the same thing, though it may be less efficient if you have MANY chunks to one token, but it's not quite what we need. The easiest thing to do is count the rels. We want 2 and only 2.

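// token1 and token2 are assumed to be bound earlier in the query
MATCH (chunk: CHUNK)-[r:CONTAINS_TOKEN]->()
// aggregations such as count() are not allowed in WHERE, so count in a WITH first
WITH chunk, token1, token2, count(r) AS rels
WHERE rels = 2
   AND (chunk)-[:CONTAINS_TOKEN]->(token1)
   AND (chunk)-[:CONTAINS_TOKEN]->(token2)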

...that'll do it, depending on the rest of your script.

Why are you doing an optional match?

andrew_bowman · ‎08-24-2020

Provided that you have the tokens in a list, you can use pattern comprehensions to ensure that a chunk only has relationships to the given tokens, and no others.

... // assume we've already matched to and collected tokens into a `tokens` variable
WITH tokens, head(tokens) as first
MATCH (chunk:CHUNK)-[:CONTAINS_TOKEN]-(first)
WITH chunk, tokens, [(chunk)-[:CONTAINS_TOKEN]-(token) | token] as otherTokens
WHERE size(tokens) = size(otherTokens) AND all(token IN tokens WHERE token IN otherTokens)
...

So with this, we find :CHUNK nodes not by doing a label scan, but a traversal from one of the tokens we already have (so we know at least this one is connected). Then we use a pattern comprehension to gather all the contained tokens for the chunk (we're assuming there is at most one relationship to a given token from a chunk). Then we need to make sure the lists contain the same elements, so we ensure the sizes are the same, and that the tokens in one collection are all in the other.
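To make the steps above concrete, here is a small self-contained sketch. The sample data is made up for illustration (the `text` and `name` values, and the two chunks); only the `CHUNK` label and the `CONTAINS_TOKEN` relationship type come from this thread. It assumes an otherwise empty database and a client that accepts multiple statements separated by `;`.

```cypher
// Hypothetical sample data: chunk c1 contains exactly tokens a and b,
// chunk c2 contains a, b and one extra token c.
CREATE (a:TOKEN {text: 'a'}), (b:TOKEN {text: 'b'}), (c:TOKEN {text: 'c'}),
       (c1:CHUNK {name: 'c1'}), (c2:CHUNK {name: 'c2'}),
       (c1)-[:CONTAINS_TOKEN]->(a), (c1)-[:CONTAINS_TOKEN]->(b),
       (c2)-[:CONTAINS_TOKEN]->(a), (c2)-[:CONTAINS_TOKEN]->(b),
       (c2)-[:CONTAINS_TOKEN]->(c);

// Collect the tokens we care about, then traverse from the first one.
MATCH (t:TOKEN) WHERE t.text IN ['a', 'b']
WITH collect(t) AS tokens
WITH tokens, head(tokens) AS first
MATCH (chunk:CHUNK)-[:CONTAINS_TOKEN]->(first)
// Gather every token each candidate chunk contains.
WITH chunk, tokens, [(chunk)-[:CONTAINS_TOKEN]->(token) | token] AS otherTokens
// Keep only chunks whose token set equals the requested set.
WHERE size(tokens) = size(otherTokens)
  AND all(token IN tokens WHERE token IN otherTokens)
RETURN chunk.name;
```

With this data, c2 is dropped by the size check because it contains three tokens, so only c1 should come back.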

If you have APOC Procedures installed, then you can replace those predicates with WHERE apoc.coll.isEqualCollection(tokens, otherTokens).
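As a sketch of what that replacement could look like, assuming the APOC library is installed and `tokens` has already been collected earlier in the query (the trailing `RETURN chunk` is only there to make the fragment complete):

```cypher
// Assumes `tokens` is already in scope and APOC is available.
WITH tokens, head(tokens) AS first
MATCH (chunk:CHUNK)-[:CONTAINS_TOKEN]-(first)
WITH chunk, tokens, [(chunk)-[:CONTAINS_TOKEN]-(token) | token] AS otherTokens
// One APOC call replaces the size comparison plus the all(...) predicate.
WHERE apoc.coll.isEqualCollection(tokens, otherTokens)
RETURN chunk
```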

saminahbab0 · ‎11-25-2020

Hey @andrew.bowman !
Thanks for replying, I tried out your solution and it worked for lists where there are tokens.

I have an issue when there are two types of nodes that we have to consider for a chunk. I was wondering if you could advise me on the best way forward?

This is the query that I tried as an extension to your approach:

UNWIND [] as t
MERGE (tok: TOKEN {text: t.text, type: t.type})
with collect(tok) as tokens

UNWIND [{text: "12", type: "12"}] as e
MERGE (entity: ENTITY {text: e.text , type: e.type})
with tokens, collect(entity) as entities

with tokens + entities as tokens
with tokens , head(tokens) as first

OPTIONAL MATCH (chunk: CHUNK)-[:CONTAINS_TOKEN|CONTAINS_ENTITY]->(first)
WITH tokens, [(chunk)-[:CONTAINS_TOKEN|CONTAINS_ENTITY]-(token) | token] as otherTokens, chunk
WHERE size(tokens) = size(otherTokens) AND all(token IN tokens WHERE token IN otherTokens)
return CASE WHEN chunk is null then true else false end

This returns nothing at all rather than a boolean. When both of the lists are non-empty, the query returns a boolean as expected.

I did this because I thought that in production one of the lists that I unwind could be empty; it is only certain that at least one will be nonempty (tokens).

A potential workaround would be to have two separate queries, but I thought I would ask you to see what you thought.

Thanks for the help!


Disjunction on Match (match only on patterns specified)