Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
02-05-2021 02:10 PM
What I want to do is find the members of a region who have not bought any of the n most popular nutrients in that region.
Getting the most popular nutrients isn't hard:
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]-(:customer)-[:madePurchase]->(t:transaction)
RETURN t.item_name, count(t) AS frequency
ORDER BY frequency DESC
LIMIT 3
And then it's not hard to filter by people who haven't bought any of these:
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]-(c:customer)-[:madePurchase]->(t:transaction)
WHERE NOT t.item_name IN ["ITEM 1", "ITEM 2", "ITEM 3"]
RETURN DISTINCT(c.customer_name)
What I want to do is somehow combine the two, getting the list [ITEM 1...]
directly and then filtering.
I've spent half a day messing around with CALL {}
and haven't been able to get it to work.
My most recent attempt is here:
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]-(c:customer)-[:madePurchase]->(t:transaction)
WHERE NOT t.item_name IN
CALL {
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]-(:customer)-[:madePurchase]->(t:transaction)
RETURN t.item_name, count(t) AS frequency
ORDER BY frequency DESC
LIMIT 5
}
RETURN DISTINCT(c.customer_name)
I've also tried various uses of WITH
to sort transactions before returning them, but haven't gotten anywhere with that either.
02-08-2021 01:46 AM
Hi trent.fowler,
here are some snippets of code that could help you:
WITH collect(t.item_name) AS transactions
WITH t, count(t) as frequency
ORDER BY frequency DESC
WITH collect(t.item_name) AS transactions
WITH t, count(t) as frequency
ORDER BY frequency DESC
WITH collect(t.item_name)[0..3] AS transactions
Now, I hope that you can connect your two queries with that middle part.
Regards,
Elena
02-08-2021 05:21 PM
Elena,
Thanks so much for your reply, that was a big leap forward.
When I tried to connect my original queries with this bridge, I encountered behavior I didn't expect.
For example, this:
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]->(c:customer)-[:madePurchase]->(t:transaction)
WITH t, count(t) as frequency
ORDER BY frequency DESC
WITH collect(t.item_name)[0..3] AS transactions
RETURN c.customer_name
Throws a Variable
c not defined (line 6, column 8 (offset: 262)) "RETURN c.customer_name"
error.
Does collect()
clear the namespace, or something? I read the documentation but found no answer.
Here was my full attempt to bridge my original queries with your code:
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]->(c:customer)-[:madePurchase]->(t:transaction)
WITH t, count(t) as frequency
ORDER BY frequency DESC
WITH collect(t.item_name)[0..3] AS transactions
WHERE NOT t.item_name IN transactions
RETURN DISTINCT(c.customer_name)
This is awfully close to what I'm after, if I can figure out what's going wrong with Neo4j not recognizing the variable names.
02-12-2021 01:43 PM
Try this:
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]
->(c:customer)-[:madePurchase]->(t:transaction)
WITH t, count(t) as frequency
ORDER BY frequency DESC
WITH collect(t.item_name)[0..3] AS transactions
MATCH (c1:customer)-[:madePurchase]->(t1:transaction)
WHERE NOT t1.item_name IN transactions
RETURN DISTINCT(c1.customer_name)
02-15-2021 01:15 AM
I would further improve the query:
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]
->(c:customer)-[:madePurchase]->(t:transaction)
WITH t, count(t) as frequency
ORDER BY frequency DESC
WITH collect(t.item_name)[0..3] AS transactions
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]-(c1:customer)
WHERE NOT any(t_item_name in transactions WHERE EXISTS ( (c1)-[:madePurchase]->(:transaction{item_name:t_item_name}) ) )
RETURN c1.customer_name
This should be more performant and will account for customers without any transactions
02-15-2021 10:55 AM
Thanks so much for the help! I hope you don't mind my asking a follow up question, as I want to understand what's happening.
I ran a 'naive' version of the query which has a less complicated WHERE
clause, and the list it returns is not the same as the one returned by your query (though there's plenty of overlap):
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]
->(c:customer)-[:madePurchase]->(t:transaction)
WITH t, count(t) as frequency
ORDER BY frequency DESC
WITH collect(t.item_name)[0..3] AS transactions
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]-(c1:customer)-[:madePurchase]->(t:transaction)
WHERE NOT t.item_name IN transactions
RETURN DISTINCT c1.customer_name
The list returned by this query and by manually checking is the same.
I read WHERE NOT any(t_item_name in transactions WHERE EXISTS ( (c1)-[:madePurchase]->(:transaction{item_name:t_item_name}) ) )
as filtering out any customers with no transactions, but that can't be right because I grabbed one of the customers from the list returned by the naive query that wasn't in the list returned by yours and there were transactions.
So my question is, what exactly is going on in your WHERE
clause?
02-15-2021 12:03 PM
Your query excludes transactions of items we identified in the first step and returns unique customers. But if a customer bought something of the 3 and as well something else, the customer will be returned by your query.
My query excludes every customer that had at least 1 transaction including one of the three items identified in the first step. I. e. Relationship from a customer to a transaction of the items specified in the list not exists.
02-16-2021 12:43 AM
btw thanks for mentioning the "CALL" for subqueries - I've never used it before 🙂
For the purpose of exercise I have rewritten the query with OPTIONAL MATCH - it might better self-explain what is happening:
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]
->(c:customer)-[:madePurchase]->(t:transaction)
WITH t, count(t) as frequency
ORDER BY frequency DESC
WITH collect(t.item_name)[0..3] AS transactions
MATCH (:region {reg: 'Central Cornbelt'})-[:hasDiv]->(:division)-[:hasRep]->(:Sales_rep)-[:hasCustomer]-(c1:customer)
OPTIONAL MATCH (c1:customer)-[:madePurchase]->(t:transaction)
WHERE t.item_name in transactions
WITH c1, count(t) as cnt
WITH c1, (cnt=0) as no_such_transation_flag
WHERE no_such_transation_flag
RETURN DISTINCT c1.customer_name
02-09-2021 02:29 AM
Hi,
by using "WITH" you loose all variables that you are not taking "with" you ;-). So, by adding a ", c" to both your WITH-statements, the error should disappear.
Regards,
Elena
All the sessions of the conference are now available online