07-17-2020 05:26 AM
I have a Neo4j 4.1.0 community edition setup on an EC2 instance (Ubuntu 18.04) with 16 GB RAM. The size of the database is 211 M, determined by running
du -hs /var/lib/neo4j/data/databases/neo4j/
which is made up of about 93K nodes of 3 labels with a single property each.
I have configured the following settings, as suggested by neo4j-admin memrec:
dbms.memory.heap.initial_size=6g
dbms.memory.heap.max_size=6g
dbms.memory.pagecache.size=7g
I am running the following query, which takes about 6 minutes to complete:
MATCH (person:Person), (person:Person)-[r0:STUDIED_AT]-(college:College), (college:College)-[r]-(x) RETURN type(r) AS label, last(labels(x)) AS target, count(r) AS count ORDER BY count(r) DESC
Can someone help me understand why this query is taking so long to run although the size of the graph is pretty small and the system specs are good enough? Also, is there a way to speed up the execution considerably without modifying the query (because the query is coming from popoto.js and I do not have much control over it).
I have already tried the following:
A couple more questions:
07-17-2020 06:01 AM
Can you post the output of the query prepended with the keyword EXPLAIN?
This shows the execution plan for the query and gives more insight.
See https://neo4j.com/docs/cypher-manual/current/query-tuning/how-do-i-profile-a-query/ for more info
07-17-2020 06:09 AM
Here is the output of EXPLAIN. Please let me know if you need more details.
07-17-2020 06:28 AM
You can probably rewrite it to the following, which avoids some cartesian duplication:
07-17-2020 06:37 AM
Unfortunately, I can't edit the query. It's created internally by a JS library that I am using for my application. So first I am trying to assess whether this performance (given the size of the data and the machine configuration) is expected, and whether there is a way to configure Neo4j for faster performance.
07-17-2020 06:45 AM
What js library is that?
Even if you can't change the generated code, it would be interesting to know how the rewritten query compares to the generated one.
07-17-2020 06:48 AM
The library is popoto.js
07-17-2020 06:59 AM
Sorry not familiar with it, perhaps others are 🙂
07-17-2020 07:09 AM
Thanks for trying to help. Would you be able to comment on whether this performance (given the size of the data and the machine configuration) is warranted?
07-17-2020 07:17 AM
6 minutes seems outrageously long. Which instance type are you using?
I would love to see how much time is shaved off with the query rewrite.
Even if you can't "fix" the query, it's good to know whether this helps.
Are you able to download the dataset and try it on a local Neo4j Desktop instance?
Just to see how it compares to the EC2 instance.
07-17-2020 07:19 AM
You might want to post the output of PROFILE as well, just to get a bit more insight.
07-17-2020 08:03 AM
Here you go, thanks for looking
07-17-2020 08:04 AM
I'm not sure if it is zoomable. Here is the link to the image in case it is not.
07-22-2020 08:06 AM
As you can see, the query causes an enormous cartesian product, which is why it's so slow.
What is it you are trying to build?
I would look into getting popoto to generate a smarter query, or move away from popoto.
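To make the blow-up concrete, here is a rough sketch in Python (not Cypher) of how many rows a pattern shaped like (person)-[r0:STUDIED_AT]-(college)-[r]-(x) can produce before aggregation. The per-college numbers and the names CollegeStats and matched_rows are made up for illustration and are not taken from the dataset in this thread. The sketch assumes that every STUDIED_AT relationship on a college connects to a Person node, and relies on Cypher matching each relationship at most once per MATCH, so r can never be the same relationship as r0.

```python
from dataclasses import dataclass


@dataclass
class CollegeStats:
    """Hypothetical per-college counts, not taken from the real dataset."""
    studied_at: int   # STUDIED_AT relationships (assumed to all reach Person nodes)
    other_rels: int   # every other relationship attached to the college node


def matched_rows(colleges):
    """Count rows matched by (person)-[r0:STUDIED_AT]-(college)-[r]-(x).

    For each choice of r0, r may be any relationship on the college
    except r0 itself, so a college contributes studied_at * (degree - 1).
    """
    total = 0
    for c in colleges:
        degree = c.studied_at + c.other_rels
        total += c.studied_at * (degree - 1)
    return total


# Toy data: three colleges of very different sizes.
toy = [CollegeStats(10, 2), CollegeStats(1_000, 5), CollegeStats(20_000, 5)]
print(matched_rows(toy))
```

Under these assumptions the row count grows roughly with the square of the number of students at the largest college, so a graph with few nodes can still generate a very large number of rows. Whether this is the same effect that shows up in the PROFILE output cannot be confirmed from the thread alone.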
07-24-2020 05:45 AM
Thanks. I am trying to build a web interface for neo4j to make a dataset available for users to explore. I figured out a way to edit the queries created by popoto on the server side. That resolved the issue. Thanks for looking into this.
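The thread does not say how the server-side edit was done. As a rough sketch under that caveat, one option is a small lookup in front of the database: normalize the whitespace of each incoming query and, if it matches a known slow query, swap in a hand-written replacement. The names normalize, rewrite and REWRITES are hypothetical, and so is the replacement query: it only chains the same pattern into a single path, and whether it actually plans better would need to be checked with PROFILE.

```python
import re

# Query as generated by the client library (copied from the first post).
SLOW_QUERY = (
    "MATCH (person:Person), (person:Person)-[r0:STUDIED_AT]-(college:College), "
    "(college:College)-[r]-(x) RETURN type(r) AS label, "
    "last(labels(x)) AS target, count(r) AS count ORDER BY count(r) DESC"
)

# Hypothetical hand-written replacement: the same pattern written as one path.
# This is an assumption for illustration, not the rewrite used in the thread.
REPLACEMENT = (
    "MATCH (person:Person)-[r0:STUDIED_AT]-(college:College)-[r]-(x) "
    "RETURN type(r) AS label, last(labels(x)) AS target, "
    "count(r) AS count ORDER BY count(r) DESC"
)


def normalize(query: str) -> str:
    """Collapse runs of whitespace so formatting differences do not matter."""
    return re.sub(r"\s+", " ", query).strip()


# Known slow queries, keyed by their normalized text.
REWRITES = {normalize(SLOW_QUERY): REPLACEMENT}


def rewrite(query: str) -> str:
    """Return the replacement for a known query, else the query unchanged."""
    return REWRITES.get(normalize(query), query)
```

A server endpoint would call rewrite() on each query before passing it to the database driver. An exact-match table like this is deliberately narrow: it only touches queries that were reviewed by hand and leaves everything else alone.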
07-24-2020 06:04 AM
Great, thanks for the update, appreciated!
07-14-2021 09:04 AM
Hi, could you please share how you edit the queries created by popoto on the server side? I would also like to know how to write custom queries in popoto.js. I am trying to do it with the help of the schema, but your help would mean a lot to me. Thanks in advance.