Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
07-05-2019 07:02 AM
Hello!
There is DataSet at my Notebook’s Virtual Machine:
2 millions unique Customers [:VISITED] 40000 unique Merchants.
Every [:VISIT] has properties: amount (double) and dt (date).
Every Customer has property “pty_id” (Integer).
And every Merchant has mcht_id (String) property.
One Customer may visit one Merchant more than one time. And of course one Customer may visit many Merchants. So there are 43 978 539 relationships in my graph between Customers and Merchants.
I have created Indexes:
CREATE INDEX on :Customer(pty_id)
CREATE INDEX on :Merchant(mcht_id)
Parameters of my VM are:
Oracle (RedHat) Linux 7 with 2 core i7, 2 GB RAM
Parameters of my Neo4j 3.5.7 config:
My tasks is:
Get top 10 Customers ordered by total_amount who spent their money at NOT specified Merchant(M) but visit that Merchants which have been visited by Customers who visit this specified Merchant(M)
My Solution is:
Let’s M will have mcht_id = "0000000DA5"
Then CYPHER query will be:
MATCH
(c:Customer)-[r:VISITED]->(mm:Merchant)<-[:VISITED]-(cc:Customer)-[:VISITED]->(m:Merchant {mcht_id: "0000000DA5"})
WHERE
NOT (c)-[:VISITED]->(m)
WITH
DISTINCT c as uc
MATCH
(uc:Customer)-[rr:VISITED]->()
RETURN
uc.pty_id
,round(100*sum(rr.amount))/100 as v_amt
ORDER BY v_amt DESC
LIMIT 10
;
Result is OK. I receive my answer:
uc.pty_id| v_amt
1433798 | 348925.94
739510 | 339169.83
374933 | 327962.95
1925125 | 327545.95
768789 | 327463.18
1851163 | 326690.5
1290014 | 323256.0
258667 | 322899.84
293300 | 321374.56
1111539 | 320392.02
The problem is that this result I have received after 437613 ms! It’s about 7 minutes!!! My estimated time for this query was about 10-20 seconds….
My Question is: What am I doing wrong???
07-05-2019 09:13 AM
Try this:
MATCH
(c:Customer)-[r:VISITED]->(mm:Merchant)
WHERE mm.mcht_id <> "0000000DA5"
RETURN
DISTINCT c.pty_id
,round(100*sum(r.amount))/100 as v_amt
ORDER BY v_amt DESC
LIMIT 10
07-08-2019 06:04 AM
This query just choked...and it doesn't restrict customers to merchants linked with original merch.
All the sessions of the conference are now available online