06-06-2022 07:41 AM
Hi everyone, how are you doing?
Guys, I would like a little help. I am doing a study for my company to evaluate the possibility of using Neo4j as our first NoSQL database.
First I created a small use case with 10 nodes and 12 relationships and only a few records. After seeing that this model could answer our questions, I decided to move to the second step: the performance check.
To check the performance I am planning to do it in a few phases. After this first impression, we would like to use Pentaho to run the same processes and check the performance.
So, what is my problem? During my first MERGE test I am seeing significant slowness, even with a data volume I consider low. Below are the timings, followed by a sample of the command.
Rows: 1.000 Method: Merge with empty model, so only INSERT Time: less than 1 second
Rows: 1.000 Method: Merge with full model, so only UPDATE Time: less than 1 second
Rows: 10.000 Method: Merge with empty model, so only INSERT Time: 00:00:21
Rows: 10.000 Method: Merge with full model, so only UPDATE Time: 00:01:00
Rows: 50.000 Method: Merge with empty model, so only INSERT Time: 00:16:48
Rows: 50.000 Method: Merge with full model, so only UPDATE Time: 00:29:11
Rows: 100.000 Method: Merge with empty model, so only INSERT Time: 01:00:44
Rows: 100.000 Method: Merge with full model, so only UPDATE Time: 02:19:00
Command Sample
LOAD CSV WITH HEADERS FROM 'file:///File.csv' AS row
MERGE (p:Node {4 Keys})
ON CREATE SET 22 attributes
ON MATCH SET 22 attributes;
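To make the shape of the statement concrete, here is a sketch with hypothetical key and attribute names (key1–key4 and attr1, attr2, …; my real CSV has different column names and 22 attributes in total):
LOAD CSV WITH HEADERS FROM 'file:///File.csv' AS row
MERGE (p:Node {key1: row.key1, key2: row.key2, key3: row.key3, key4: row.key4})
ON CREATE SET p.attr1 = row.attr1, p.attr2 = row.attr2 // ... and so on for the remaining attributes
ON MATCH SET p.attr1 = row.attr1, p.attr2 = row.attr2; // ... and so on for the remaining attributes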
Environment
Neo4j Desktop (CPU: i5 2.11 GHz, RAM: 16 GB, HD: SSD) – I am planning to install a server version afterwards…;
As you can see, the performance is not linear. When I insert 1.000 rows it takes less than 1 second. Then I tried to insert 10 times more (10.000 rows), so I was expecting something near 10 seconds, but it took 1 minute. Then I tried to insert 50.000 rows and again it took longer than I expected: 16 minutes.
Afterwards I ran the same test with relationships, and the performance was even worse.
This performance is not acceptable for me, because I will insert more than 20.000.000 (twenty million) nodes.
Could you give me some tips? I am sure I am doing something wrong, because I know that many large companies around the world use Neo4j.
Regards, Igor Martins
06-06-2022 08:46 AM
Have you created an index on the properties you are merging on? MERGE performs a match first.
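For example, with a hypothetical label and key property (adjust to your actual model), an index can be created like this:
CREATE INDEX node_key1_index IF NOT EXISTS FOR (n:Node) ON (n.key1);
Without it, every MERGE has to scan all existing :Node nodes to decide whether to create or match.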
06-06-2022 11:50 AM
Thanks glilienfield.
In my first test I didn't create any index, because I thought it would not make any difference since it is an insert. But I forgot that it is a MERGE, so it first checks whether the node exists, and there the index does make a difference.
After your message I created the index and it works better. Now I will try to load more data, from 1.000.000 up to 20.000.000.
Regards, Igor Martins
06-07-2022 05:03 AM
Hi everyone, it is me again.
After creating the index, the MERGE into an empty database works fine: when I ran the command to insert 1.000.000 rows, the process took 34 seconds. After that I executed the command again to check the performance in update mode, but unfortunately the process ran for more than 10 hours, so I decided to stop it.
Do you have any tips?
Regards, Igor Bastos Martins
06-07-2022 06:06 AM
You state your merge is on four keys. Are they all needed to uniquely identify each node? Did you create a composite index on them or separate indexes?
Can you try running your import with EXPLAIN and share the results? Maybe that will help identify the bottleneck and show whether your indexes are being used.
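For reference, a composite index on four hypothetical key properties (rename to match your model) would look like:
CREATE INDEX node_keys_index IF NOT EXISTS FOR (n:Node) ON (n.key1, n.key2, n.key3, n.key4);
Prefixing the LOAD CSV statement with EXPLAIN shows the query plan without executing it, so you can check whether the MERGE uses a NodeIndexSeek rather than a NodeByLabelScan.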
06-08-2022 06:09 AM
Hi, after looking for some performance recommendations I found some notes about 3 parameters:
dbms.memory.heap.initial_size
dbms.memory.heap.max_size
dbms.memory.pagecache.size
After changing their values, the process works fine in both INSERT and UPDATE mode.
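For reference, these settings live in the database's neo4j.conf file. As an illustration only (not necessarily the exact values I ended up using), on a 16 GB machine they could look like:
dbms.memory.heap.initial_size=4g
dbms.memory.heap.max_size=4g
dbms.memory.pagecache.size=6g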
Thank you for your time and for helping me.
Regards, Igor Martins