

Memory usage keeps climbing in PERIODIC COMMIT

giff-h

I have a huge CSV that I'm trying to import. I've already created the relevant nodes of both types from it, and now I'm creating the relationships. I'm using this query so that it is re-entrant on failures:

 

USING PERIODIC COMMIT LOAD CSV FROM 'file:///posts.csv' AS line
UNWIND split(line[1], ' ') AS tag
MATCH (i:ImageNode {image_id: line[0]}), (t:TagNode {value: tag})
MERGE (i)-[r:TAGGED]->(t)
ON CREATE SET r.created_at = timestamp() / 1000.0

 

When I run it from the shell like this:

$ cat posts.cql | cypher-shell -u neo4j -p [the password]

JVM memory usage keeps climbing. Why? Am I missing a step that clears unused data out of memory when it commits?

Cypher-Shell 4.1.12 and Neo4j Driver 4.1.4

2 REPLIES

One thing you can do is move the MATCH on ImageNode before the UNWIND. As it stands, you are repeating the same match for every tag element on a row. After the UNWIND, you only need to match the tag and create the relationship.
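A minimal sketch of that reordering, based on the query from the question. It is untested against the original data and assumes the labels, properties, and file path are exactly as posted:

```cypher
USING PERIODIC COMMIT
LOAD CSV FROM 'file:///posts.csv' AS line
// Look up the image once per CSV row, before expanding the tags
MATCH (i:ImageNode {image_id: line[0]})
// Expand the space-separated tag list into one row per tag
UNWIND split(line[1], ' ') AS tag
// Only the tag lookup and the relationship creation run per tag
MATCH (t:TagNode {value: tag})
MERGE (i)-[r:TAGGED]->(t)
ON CREATE SET r.created_at = timestamp() / 1000.0
```

Rows whose image_id has no matching ImageNode are dropped at the first MATCH, so no tag lookups are attempted for them.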

Have you monitored the JVM with tools like JConsole or VisualVM to see what is happening?

https://www.rapid7.com/blog/post/2012/12/31/guide-to-monitoring-jvm-memory-usage-draft/

Hi @giff-h 

Can you try something like:

:auto LOAD CSV FROM 'file:///posts.csv' AS line
CALL {
    WITH line
    UNWIND split(line[1], ' ') AS tag
    MATCH (i:ImageNode {image_id: line[0]}), (t:TagNode {value: tag})
    MERGE (i)-[r:TAGGED]->(t)
    ON CREATE SET r.created_at = timestamp() / 1000.0
} IN TRANSACTIONS OF 10 ROWS