"identical" queries, different results

pebbe

I have a graph that I query in three different ways:

match (n:node{cat:'ssub'})-[:rel*]->(:word{pt:'ww'})-[:next]->(w2:word{pt:'ww'})-[:next]->(:word{pt:'ww'})<-[:rel*]-(n),
      (n)-[:rel*]->(w2)
return distinct n.sentid as sentid, n.id as id
order by sentid, id;

17 results

match (n:node{cat:'ssub'})-[:rel*]->(:word{pt:'ww'})-[:next]->(w2:word{pt:'ww'})-[:next]->(w3:word{pt:'ww'})
match (w3)<-[:rel*]-(n)-[:rel*]->(w2)
return distinct n.sentid as sentid, n.id as id
order by sentid, id;

378 results

match (n:node{cat:'ssub'})-[:rel*]->(:word{pt:'ww'})-[:next]->(w2:word{pt:'ww'})-[:next]->(w3:word{pt:'ww'})
match (n)-[:rel*]->(w2)
match (n)-[:rel*]->(w3)
return distinct n.sentid as sentid, n.id as id
order by sentid, id;

1281 results

I tested this in both Neo4j and AgensGraph and got identical results in each.
My problem is that I expect the same results from all three queries, but only the last one returns all the correct matches. What am I missing?

3 REPLIES

pebbe

I did some testing:

Connected to Neo4j 3.5.12 at bolt://localhost:7687.
Type :help for a list of available commands or :exit to exit the shell.
Note that Cypher queries must end with a semicolon.
neo4j> create (:test{id:1})-[:rel]->(:test{id:2})-[:rel]->(:test{id:3});
0 rows available after 48 ms, consumed after another 0 ms
Added 3 nodes, Created 2 relationships, Set 3 properties, Added 3 labels
neo4j> match (n:test{id:2}) create (n)-[:rel]->(:test{id:4});
0 rows available after 19 ms, consumed after another 0 ms
Added 1 nodes, Created 1 relationships, Set 1 properties, Added 1 labels
neo4j> match p = (n:test{id:1})-[:rel*]->() return p;
+-----------------------------------------------------------------+
| p                                                               |
+-----------------------------------------------------------------+
| (:test {id: 1})-[:rel]->(:test {id: 2})                         |
| (:test {id: 1})-[:rel]->(:test {id: 2})-[:rel]->(:test {id: 3}) |
| (:test {id: 1})-[:rel]->(:test {id: 2})-[:rel]->(:test {id: 4}) |
+-----------------------------------------------------------------+

3 rows available after 37 ms, consumed after another 2 ms
neo4j> match p = (:test{id:3})<-[:rel*]-(:test{id:1})-[:rel*]->(:test{id:4}) return p;
+---+
| p |
+---+
+---+

0 rows available after 65 ms, consumed after another 1 ms
neo4j> match p = ()<-[*]-(:test{id:1})-[*]->() return p;
+---+
| p |
+---+
+---+

0 rows available after 46 ms, consumed after another 1 ms
neo4j> match p = (n:test{id:1})-[:rel*]->(:test{id:4}) match q = (:test{id:3})<-[:rel*]-(n) return p, q;
+-----------------------------------------------------------------------------------------------------------------------------------+
| p                                                               | q                                                               |
+-----------------------------------------------------------------------------------------------------------------------------------+
| (:test {id: 1})-[:rel]->(:test {id: 2})-[:rel]->(:test {id: 4}) | (:test {id: 3})<-[:rel]-(:test {id: 2})<-[:rel]-(:test {id: 1}) |
+-----------------------------------------------------------------------------------------------------------------------------------+

1 row available after 52 ms, consumed after another 1 ms
neo4j> match (a:test)-[b]->(c:test) delete a, b, c;
0 rows available after 14 ms, consumed after another 0 ms
Deleted 4 nodes, Deleted 3 relationships

For the two queries that return zero rows I would have expected one row each. Apparently I had the wrong mental model of Cypher. This...

(:A)<-[*]-(:X)-[*]->(:B)

... doesn't simply mean that there is a path from X to A and a path from X to B. It means there are two paths starting at X, one ending at A and one ending at B, that have no relationship in common. A route that leaves X along a shared stretch and only splits further down, with one branch ending at A and the other at B, does not match.

The queries are not identical.

MATCH pattern1, pattern2

is one graph pattern

MATCH pattern1
MATCH pattern2

are two separate graph patterns. In particular, relationship uniqueness does not carry over from one MATCH clause to the next.

To elaborate: within the paths matched for a single graph pattern, relationships must be unique. The same relationship cannot be used twice.

The reason why this query returns no results:

match p = (:test{id:3})<-[:rel*]-(:test{id:1})-[:rel*]->(:test{id:4}) return p;

is that, while there certainly is a path from test node 1 to test node 4, and there certainly is a path from test node 1 to test node 3, there is no way to match the whole pattern without reusing a relationship. To reach either test node 3 or test node 4, you have to traverse to test node 2 first. The relationship between test node 1 and test node 2 is used to find one of these two paths, so it cannot be reused to find the other.
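To make the clash concrete, here is a minimal Python sketch of the four-node test graph from the session above. It is only a toy model, not how Neo4j evaluates the query: `RELS` and `paths` are names made up for this illustration, and each relationship is represented simply by its (start id, end id) pair. It lists the paths from node 1 to node 3 and from node 1 to node 4, then shows which relationship the two have in common.

```python
# Toy model of the test graph from the thread: 1 -> 2, 2 -> 3, 2 -> 4.
# Each relationship is represented by its (start id, end id) pair.
RELS = [(1, 2), (2, 3), (2, 4)]


def paths(start, end, used=()):
    """Yield every directed path from start to end as a tuple of
    relationships, never reusing a relationship within one path."""
    for rel in RELS:
        if rel[0] != start or rel in used:
            continue
        trail = used + (rel,)
        if rel[1] == end:
            yield trail
        # Keep walking; a longer path may reach `end` as well.
        yield from paths(rel[1], end, trail)


to_3 = list(paths(1, 3))
to_4 = list(paths(1, 4))

# For every combination of the two paths, the relationships they share.
shared = [set(p) & set(q) for p in to_3 for q in to_4]

print("paths 1 -> 3:", to_3)
print("paths 1 -> 4:", to_4)
print("shared relationships:", shared)
```

There is only one path to each target, and both go through the relationship from node 1 to node 2, so no combination of the two is free of a shared relationship.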

In order to find what you're looking for, you need to break up your patterns by having them in separate MATCH clauses:

match (t:test{id:1})
match p1 = (:test{id:3})<-[:rel*]-(t)
match p2 = (t)-[:rel*]->(:test{id:4}) 
return p1, p2;

This way, since your pattern is split across separate MATCH clauses, relationship uniqueness is only enforced within each clause and not across them, and you will get two separate paths (with the relationship between test node 1 and test node 2 present in both).
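The difference between one MATCH and several can be sketched with a small Python model of the same test graph. This is an illustration under simplifying assumptions, not Neo4j's actual implementation: `RELS`, `paths` and `count_rows` are hypothetical names, every pattern has fixed start and end nodes, and a "clause" is just a list of (start, end) pairs standing in for the variable-length parts written inside one MATCH.

```python
from itertools import product

# Toy model of the test graph from the thread: 1 -> 2, 2 -> 3, 2 -> 4.
RELS = [(1, 2), (2, 3), (2, 4)]


def paths(start, end, used=()):
    """Yield every directed path from start to end as a tuple of
    relationships, never reusing a relationship within one path."""
    for rel in RELS:
        if rel[0] != start or rel in used:
            continue
        trail = used + (rel,)
        if rel[1] == end:
            yield trail
        yield from paths(rel[1], end, trail)


def count_rows(clauses):
    """Count result rows for a list of clauses.

    Relationships must be unique across all paths inside one clause,
    but each clause is checked on its own. Because all endpoints are
    fixed in this sketch, the row count is the product per clause."""
    total = 1
    for clause in clauses:
        candidates = [list(paths(s, e)) for s, e in clause]
        ok = 0
        for combo in product(*candidates):
            rels = [r for path in combo for r in path]
            if len(rels) == len(set(rels)):  # no relationship used twice
                ok += 1
        total *= ok
    return total


# Both variable-length parts in one MATCH: uniqueness spans both.
one_match = count_rows([[(1, 3), (1, 4)]])
# One variable-length part per MATCH: uniqueness is checked per clause.
two_matches = count_rows([[(1, 3)], [(1, 4)]])

print(one_match, two_matches)
```

In this model the single-clause version yields no rows and the split version yields one, which lines up with the 0-row and 1-row results in the session above. The same reasoning would explain why the three original queries return more rows the further the pattern is split across MATCH clauses.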