Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
02-22-2020 12:59 PM
H,
I have test graph
My query is:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
WHERE p <> p2
RETURN COLLECT(p2) AS otherProducts
And I get products 200,201,204,205,202
So WHERE clause does not work. I played with WITH - no result
So I tried as well:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
WHERE NOT (n:client)-[:PURCHASED]->(p:product)
RETURN p2
The difference is that I receive 3x 'null' (I'm not able to get product id's)
Finally I'd like to count relations with query:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
OPTIONAL MATCH (n2)-[r:PURCHASED]->(p2:product)
WHERE p <> p2
RETURN p2.product_name, count(r) AS result
ORDER BY result DESC
But what I receive is... relation to product 202 = 3! (should be 2)
Solved! Go to Solution.
02-24-2020 09:47 AM
Hello,
In your case WHERE id(c1) < id(c2)
won't be needed. If a client can only have a single :PURCHASED relationship to a specific product, then there's no way that c2 can be equal to c1. That type of WHERE clause is often used to prevent getting mirrored results, where the same nodes can appear twice in the results because they can be matched in both directions (by flipping which node is associated with which variable), but for this query that cannot happen.
Regarding your earlier question:
I do not understand why 'WHERE p <> p2' is not working...there's no logical mistake in that
There is a logical mistake actually, because you're only performing this comparison between two nodes per row. You're not considering the collection of nodes because you haven't been working with a collection, just individual nodes.
This is easier to see if you examine the state of what's going on in the query just before you execute the WHERE clause.
Execute this query:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
RETURN p.product_id as p, p2.product_id as p2
And you will see the following results:
╒═══╤════╕
│"p"│"p2"│
╞═══╪════╡
│201│204 │
├───┼────┤
│201│200 │
├───┼────┤
│201│201 │
├───┼────┤
│201│202 │
├───┼────┤
│200│204 │
├───┼────┤
│200│200 │
├───┼────┤
│200│201 │
├───┼────┤
│200│202 │
├───┼────┤
│200│202 │
├───┼────┤
│200│205 │
├───┼────┤
│200│200 │
└───┴────┘
It's clear here that you're looking at individual nodes, collections don't come into play at all.
Your WHERE clause WHERE p <> p2
will only filter out the rows associated with the same p
and p2
nodes, so after the WHERE executes you're only stripping out the rows with identical values:
╒═══╤════╕
│"p"│"p2"│
╞═══╪════╡
│201│204 │
├───┼────┤
│201│200 │
├───┼────┤
│201│202 │
├───┼────┤
│200│204 │
├───┼────┤
│200│201 │
├───┼────┤
│200│202 │
├───┼────┤
│200│202 │
├───┼────┤
│200│205 │
└───┴────┘
So the reason why product 200 is present is because for that row, even though p2 = 200, p = 201. And product 201 is present because even though p2 = 201, for that row p = 200. For each row, p
and p2
are not collections, they are individual nodes from paths that match the pattern you MATCHed to earlier.
What you actually want here is to work with collections. You were provided a different workaround where you exclude the pattern of the client 100 purchasing the product. But if we work with lists instead, we can do the same thing:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
WITH n2, collect(p) as productsInCommon
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
WHERE NOT p2 IN productsInCommon
RETURN COLLECT(DISTINCT p2) AS otherProducts
In this case, we collect the products, and instead of using node inequality, our condition is that the product must not be in the list of products in common. This ensures that the query will check the node against every node in the list, rather than just against a single node per row.
And to better illustrate, let's look at the rows returned when we return after doing the first collection and optional match but before the filter:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
WITH n2, collect(p) as productsInCommon
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
RETURN [prod IN productsInCommon | prod.product_id] as productsInCommon, p2.product_id as p2
And we see this:
╒══════════════════╤════╕
│"productsInCommon"│"p2"│
╞══════════════════╪════╡
│[201,200] │204 │
├──────────────────┼────┤
│[201,200] │200 │
├──────────────────┼────┤
│[201,200] │201 │
├──────────────────┼────┤
│[201,200] │202 │
├──────────────────┼────┤
│[200] │202 │
├──────────────────┼────┤
│[200] │205 │
├──────────────────┼────┤
│[200] │200 │
└──────────────────┴────┘
You can see that the filtering, based on the list, will find the right nodes this time.
02-22-2020 03:24 PM
Hi Gosforth,
I created this cypher.
CREATE (c100:client {client_id: 100}),
(c101:client {client_id: 101}),
(c102:client {client_id: 102}),
(c103:client {client_id: 103}),
(p200:product {product_id: 200, product_name: "Product 200"}),
(p201:product {product_id: 201, product_name: "Product 201"}),
(p202:product {product_id: 202, product_name: "Product 202"}),
(p203:product {product_id: 203, product_name: "Product 203"}),
(p204:product {product_id: 204, product_name: "Product 204"}),
(p205:product {product_id: 205, product_name: "Product 205"}),
(c100)-[:PURCHASED]->(p200),
(c100)-[:PURCHASED]->(p201),
(c101)-[:PURCHASED]->(p200),
(c101)-[:PURCHASED]->(p201),
(c101)-[:PURCHASED]->(p202),
(c101)-[:PURCHASED]->(p204),
(c102)-[:PURCHASED]->(p200),
(c102)-[:PURCHASED]->(p202),
(c102)-[:PURCHASED]->(p205),
(c103)-[:PURCHASED]->(p203),
(c103)-[:PURCHASED]->(p204)
I changed the line from "WHERE NOT (n:client)-[:PURCHASED]->(p:product)" to "WHERE NOT (n)-[:PURCHASED]->(p2)" in your 2nd Cypher.
Then you can see 204,205,202 products.
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
WHERE NOT (n)-[:PURCHASED]->(p2)
RETURN p2
Is this answer correct?
02-23-2020 03:37 AM
Thank you very much you dedicated your time to help me.
Yep, result seems to what I need. I do not understand why 'WHERE p <> p2' is not working...there's no logical mistake in that... So WHERE works here on relations not the collection of nodes.
Anyway, finally I'd like to count no of relations to p2 products; how many relations was to product_id: 205, product_id: 202, product_id: 204.
With the below query:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
OPTIONAL MATCH (n2)-[r:PURCHASED]->(p2:product)
WHERE NOT (n)-[:PURCHASED]->(p2)
RETURN DISTINCT p2.product_id,count(r) AS no_of_relations
I get result:
|p2.product_id|no_of_relations|
|202|3|
|204|2|
|205|1|
Why product 202 has 3 relations? should be 2.
Ideal if I count relations of the p2 products where n & n2 customers have at least two products in common.
Regards,
G
02-23-2020 06:06 AM
Can you change the count(r) to count(distinct r)?
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
OPTIONAL MATCH (n2)-[r:PURCHASED]->(p2:product)
WHERE NOT (n)-[:PURCHASED]->(p2)
RETURN DISTINCT p2.product_id,count(distinct r) AS no_of_relations
It is possible there are multiple matches where the optional match is getting executed with the same result.
By making distinct you are counting a relationship only once.
02-23-2020 06:57 AM
Works. Thanks! Now I go to phase "show p2 products where n & n2 customers have at least two products in common"
02-23-2020 11:21 AM
I've finally managed to write query (studying forums) - is it OK?:
MATCH (c1:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(c2:client)
WHERE id(c1) < id(c2)
WITH c1, c2, collect(p) as commonProducts
WHERE size(commonProducts) >= 2
OPTIONAL MATCH (c2)-[r:PURCHASED]->(p2:product)
WHERE NOT (c1)-[:PURCHASED]->(p2)
Return DISTINCT p2.product_id, count(r) as no_of_relations
And it gives me recommendations of other products where clients have at least two products in common with client_id 100
p2.product_id | no_of_relations |
---|---|
202 | 1 |
204 | 1 |
I this second line is needless (dos nothing ?):
WHERE id(c1) < id(c2)
Regards,
G
02-24-2020 09:47 AM
Hello,
In your case WHERE id(c1) < id(c2)
won't be needed. If a client can only have a single :PURCHASED relationship to a specific product, then there's no way that c2 can be equal to c1. That type of WHERE clause is often used to prevent getting mirrored results, where the same nodes can appear twice in the results because they can be matched in both directions (by flipping which node is associated with which variable), but for this query that cannot happen.
Regarding your earlier question:
I do not understand why 'WHERE p <> p2' is not working...there's no logical mistake in that
There is a logical mistake actually, because you're only performing this comparison between two nodes per row. You're not considering the collection of nodes because you haven't been working with a collection, just individual nodes.
This is easier to see if you examine the state of what's going on in the query just before you execute the WHERE clause.
Execute this query:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
RETURN p.product_id as p, p2.product_id as p2
And you will see the following results:
╒═══╤════╕
│"p"│"p2"│
╞═══╪════╡
│201│204 │
├───┼────┤
│201│200 │
├───┼────┤
│201│201 │
├───┼────┤
│201│202 │
├───┼────┤
│200│204 │
├───┼────┤
│200│200 │
├───┼────┤
│200│201 │
├───┼────┤
│200│202 │
├───┼────┤
│200│202 │
├───┼────┤
│200│205 │
├───┼────┤
│200│200 │
└───┴────┘
It's clear here that you're looking at individual nodes, collections don't come into play at all.
Your WHERE clause WHERE p <> p2
will only filter out the rows associated with the same p
and p2
nodes, so after the WHERE executes you're only stripping out the rows with identical values:
╒═══╤════╕
│"p"│"p2"│
╞═══╪════╡
│201│204 │
├───┼────┤
│201│200 │
├───┼────┤
│201│202 │
├───┼────┤
│200│204 │
├───┼────┤
│200│201 │
├───┼────┤
│200│202 │
├───┼────┤
│200│202 │
├───┼────┤
│200│205 │
└───┴────┘
So the reason why product 200 is present is because for that row, even though p2 = 200, p = 201. And product 201 is present because even though p2 = 201, for that row p = 200. For each row, p
and p2
are not collections, they are individual nodes from paths that match the pattern you MATCHed to earlier.
What you actually want here is to work with collections. You were provided a different workaround where you exclude the pattern of the client 100 purchasing the product. But if we work with lists instead, we can do the same thing:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
WITH n2, collect(p) as productsInCommon
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
WHERE NOT p2 IN productsInCommon
RETURN COLLECT(DISTINCT p2) AS otherProducts
In this case, we collect the products, and instead of using node inequality, our condition is that the product must not be in the list of products in common. This ensures that the query will check the node against every node in the list, rather than just against a single node per row.
And to better illustrate, let's look at the rows returned when we return after doing the first collection and optional match but before the filter:
MATCH (n:client {client_id: 100})-[:PURCHASED]->(p:product)<-[:PURCHASED]-(n2:client)
WITH n2, collect(p) as productsInCommon
OPTIONAL MATCH (n2)-[:PURCHASED]->(p2:product)
RETURN [prod IN productsInCommon | prod.product_id] as productsInCommon, p2.product_id as p2
And we see this:
╒══════════════════╤════╕
│"productsInCommon"│"p2"│
╞══════════════════╪════╡
│[201,200] │204 │
├──────────────────┼────┤
│[201,200] │200 │
├──────────────────┼────┤
│[201,200] │201 │
├──────────────────┼────┤
│[201,200] │202 │
├──────────────────┼────┤
│[200] │202 │
├──────────────────┼────┤
│[200] │205 │
├──────────────────┼────┤
│[200] │200 │
└──────────────────┴────┘
You can see that the filtering, based on the list, will find the right nodes this time.
02-24-2020 12:48 PM
Thank you very much indeed!
All the sessions of the conference are now available online