cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Perfomance Problem

Hello!

There is DataSet at my Notebook’s Virtual Machine:

2 millions unique Customers [:VISITED] 40000 unique Merchants.
Every [:VISIT] has properties: amount (double) and dt (date).
Every Customer has property “pty_id” (Integer).
And every Merchant has mcht_id (String) property.

One Customer may visit one Merchant more than one time. And of course one Customer may visit many Merchants. So there are 43 978 539 relationships in my graph between Customers and Merchants.

I have created Indexes:

CREATE INDEX on :Customer(pty_id)
CREATE INDEX on :Merchant(mcht_id)

Parameters of my VM are:

Oracle (RedHat) Linux 7 with 2 core i7, 2 GB RAM

Parameters of my Neo4j 3.5.7 config:

  • dbms.memory.heap.max_size=1024m
  • dbms.memory.pagecache.size=512m

My tasks is:

Get top 10 Customers ordered by total_amount who spent their money at NOT specified Merchant(M) but visit that Merchants which have been visited by Customers who visit this specified Merchant(M)

My Solution is:

Let’s M will have mcht_id = "0000000DA5"

Then CYPHER query will be:

MATCH 
  (c:Customer)-[r:VISITED]->(mm:Merchant)<-[:VISITED]-(cc:Customer)-[:VISITED]->(m:Merchant {mcht_id: "0000000DA5"}) 
WHERE 
  NOT (c)-[:VISITED]->(m) 
WITH 
  DISTINCT c as uc 
MATCH 
  (uc:Customer)-[rr:VISITED]->()
RETURN 
  uc.pty_id
  ,round(100*sum(rr.amount))/100 as v_amt 
ORDER BY v_amt DESC
LIMIT 10
;

Result is OK. I receive my answer:


uc.pty_id| v_amt
1433798  | 348925.94
739510   | 339169.83
374933   | 327962.95
1925125  | 327545.95
768789   | 327463.18
1851163  | 326690.5
1290014  | 323256.0
258667   | 322899.84
293300   | 321374.56
1111539  | 320392.02

The problem is that this result I have received after 437613 ms! It’s about 7 minutes!!! My estimated time for this query was about 10-20 seconds….

My Question is: What am I doing wrong???

2 REPLIES 2

ameyasoft
Graph Maven

Try this:

MATCH
(c:Customer)-[r:VISITED]->(mm:Merchant)

WHERE mm.mcht_id <> "0000000DA5"

RETURN
DISTINCT c.pty_id
,round(100*sum(r.amount))/100 as v_amt
ORDER BY v_amt DESC
LIMIT 10

This query just choked...and it doesn't restrict customers to merchants linked with original merch.