03-06-2019 06:14 PM
Hi, I'm very new to using the Neo4j database and OGM, but have got it basically working for what I'm doing (reading through XML documents and feeding that structure along with associated metadata into Neo4j). I'm using ogm 3.1.7 with the bolt driver.
The question I have at the moment is around the performance of inserts. In the specific case I'm looking into, the session.save call is taking roughly 5 seconds to complete; this is for inserting around 70 nodes in a descending tree structure.
Does this seem like a normal amount of time for this operation? This is all running on my i7 laptop with an SSD. Is there any logging I can turn on to track what is happening?
I've seen a couple of posts on the internet suggesting it's better to insert all the nodes separately and then do another insert to put in the relationships, or alternatively to use the Java API rather than OGM for speed; but I wasn't sure if this is more related to older OGM versions?
Not sure if this is a bit too vague, but if anyone has any suggestions around what I should be expecting performance-wise, or thoughts on how to dig a little deeper, that would be great.
Thanks in advance,
Matt
03-07-2019 01:19 AM
A few questions come to mind:
How much data do you already have in your database?
Does your save() operation include merges?
Do you have indexes on the attributes you use in the merges?
How much memory have you assigned to neo4j (in relation to database size)?
03-07-2019 10:11 AM
Hi, thanks for getting back to me. To answer your questions:
I will try cutting down on the data I'm loading - hopefully that will remove a lot of the possible merges. Is it possible to get OGM to log the statements it's issuing?
Also, I guess the meta question around this is: do I carry on with OGM, or look at doing it more with the native Java API? I don't mind too much either way - I just want to head in the right/best direction.
Thanks,
Matt
03-07-2019 12:17 PM
Hmm, OK, I got the logging working. I found the note in the docs which just says it uses logback, and then links to logback - possibly this is a bit brief. I then found another post on the blog with an example logback file (might be worth putting an example in the docs), so I got it working in the end. It was a bit trickier for me as this is embedded in a Dropwizard project, so the configuration was a bit different in my case.
Anyway. Looking at the logs, there is a command issued:
UNWIND {rows} as row MERGE (n:Element:LabeledParagraph {id: row.props.id}) SET n=row.props RETURN row.nodeRef as ref, ID(n) as id, {type} as type with params
{type=node, rows=[{nodeRef=-34, props={label=d, id=d8301edf-3d3c-4269-bb70-8f17947f824a}},
{nodeRef=-66, props={label=aa, id=dd53e725-abe0-4b00-917b-4fc478762631}},
{nodeRef=-36, props={label=e, id=b2a15783-7152-41da-93b5-e4b2ff286693}},
{nodeRef=-38, props={label=f, id=f56c7ddc-6caa-45b1-9329-540618b99087}},
{nodeRef=-102, props={label=iv, id=9eb86c36-3d45-4266-9f65-ccb94b0e911e}},
{nodeRef=-104, props={label=A, id=e568aa5f-7a10-4bdb-b0d0-182bf6cfb9c0}},
{nodeRef=-42, props={label=a, id=73c80757-4fea-42e1-bc8e-5097733721bb}},
{nodeRef=-74, props={label=ia, id=e0525c0f-e399-4c4d-ac10-97b5ed6dbf8f}},
{nodeRef=-106, props={label=B, id=c2dbbae9-8e00-42fc-b2ef-bb8c1a5ff2e0}},
{nodeRef=-44, props={label=b, id=716e6c2a-55d5-4fdd-8a68-54d2b674c880}},
{nodeRef=-46, props={label=c, id=94b35511-020f-49f9-b03e-c15708e8de9a}},
{nodeRef=-48, props={label=d, id=0bbcaa1e-a867-4b45-a399-2741b6bfdf64}},
{nodeRef=-50, props={label=e, id=29262d29-8308-4902-960b-445bab39646e}},
{nodeRef=-52, props={label=f, id=de351745-9b7d-4fd9-9db9-59b4a20a1f5c}},
{nodeRef=-22, props={label=a, id=9e6aaec4-f0a7-49f1-80e7-a669f977ed4b}},
{nodeRef=-54, props={label=g, id=05919bb0-f9e0-44c7-b88e-b03aa6e686e2}},
{nodeRef=-24, props={label=b, id=73f307b2-33ad-41c0-af81-79ceeee526fa}},
{nodeRef=-56, props={label=h, id=8f1c8ba7-4ad6-4b7e-b2f6-e7c5db0783c9}},
{nodeRef=-28, props={label=a, id=09e2975f-d26d-4042-b5e7-8a25c279aa07}},
{nodeRef=-30, props={label=b, id=d6bf4cb4-80c4-44f3-92b2-7a0b00ce3cac}},
{nodeRef=-32, props={label=c, id=f13ba92e-08b0-4360-9125-d78b1c51fd6c}}]}
This statement seems to take almost all of the time for the save. As part of loading the document, my current setup is to issue a Cypher command to remove all of the nodes under the document node, and then load all of the newly collected nodes to connect to the existing document (basically it's a replace of the structure under a document). Is this what could be causing this request? Is there a better way to deal with loading the nodes?
Thanks in advance for any thoughts or suggestions.
Matt
03-07-2019 01:04 PM
Ah, I had a look at that, and thought about your mention of indexes. I see that by default indexes are off (is this correct?), so I wondered if the issue was that it was looking for elements by ID, but with no index. I've just tried setting the configuration with 'autoIndex("update")' and this looks to have taken the save operation down to around 200 milliseconds - which looks fine.
Does what I've deduced and done sound rational/reasonable/correct?
Thanks,
Matt
09-15-2021 07:36 PM
I have a Spring Boot solution with a single Node entity class using dynamic labels (@Labels) and dynamic properties (@Properties). Without an index, saveAll gets slower with each call.
I expect that saveAll uses a merge for each node that it creates, so creating an index on id will solve your performance problem. (Also see @Properties(prefix="node", delimiter="_", allowCast=true) for creating a dynamic properties collection.)
In the statement below, Node is the Node entity class. (One issue I see in Neo4j Browser is that it shows the total count for Node, my entity class, as well as the totals for each dynamic label.)
CREATE INDEX ON :Node(id)
Creating the index solved the performance issue for me. Saving 5300 nodes and about 3000 edges now takes under 2 s each time. Without the index it took 10 s at first and grew with each saveAll: 20 s, 35 s, ...
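To illustrate why the save time keeps growing without an index, here is a minimal toy model in plain Java. It is only a sketch and not how Neo4j or OGM is implemented: the class MergeCostSketch and its methods mergeByScan, mergeByIndex and newBatch are hypothetical names made up for this example. It assumes that a merge on id without an index has to compare against the stored ids one by one, and uses a HashMap as a stand-in for an index on id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: counts id comparisons for MERGE-style upserts.
// It does not use Neo4j or OGM; all names here are hypothetical.
class MergeCostSketch {

    // Upsert each id by scanning the stored ids until a match is found.
    // Returns the number of id comparisons performed.
    static long mergeByScan(List<String> store, List<String> batch) {
        long comparisons = 0;
        for (String id : batch) {
            boolean found = false;
            for (String existing : store) {
                comparisons++;
                if (existing.equals(id)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                store.add(id); // "create" the node because no match exists
            }
        }
        return comparisons;
    }

    // Upsert each id through a hash lookup, standing in for an index on id.
    // Returns the number of lookups, which is one per id in the batch.
    static long mergeByIndex(Map<String, Integer> index, List<String> batch) {
        long probes = 0;
        for (String id : batch) {
            probes++;
            index.putIfAbsent(id, index.size());
        }
        return probes;
    }

    // Builds a batch of ids that are unique to the given batch number.
    static List<String> newBatch(int batchNo, int size) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            ids.add("b" + batchNo + "-n" + i);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        Map<String, Integer> index = new HashMap<>();
        for (int batch = 1; batch <= 4; batch++) {
            List<String> ids = newBatch(batch, 1000);
            long scanCost = mergeByScan(store, ids);
            long indexCost = mergeByIndex(index, ids);
            System.out.println("batch " + batch
                    + ": scan comparisons=" + scanCost
                    + ", index lookups=" + indexCost);
        }
    }
}
```

In this model the scan cost of each batch grows with the number of ids already stored, while the lookup cost stays at one per id. That matches the shape of the timings reported above, although the real numbers depend on Neo4j's query planner and storage, which the sketch does not model.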