12-25-2022 10:33 PM - edited 12-25-2022 10:36 PM
I have a huge CSV that I'm trying to import. I've already created the relevant nodes of both types from it, and now I'm creating the relationships. I'm using this query so that it's re-entrant on failures:
USING PERIODIC COMMIT LOAD CSV FROM 'file:///posts.csv' AS line
UNWIND split(line[1], ' ') AS tag
MATCH (i:ImageNode {image_id: line[0]}), (t:TagNode {value: tag})
MERGE (i)-[r:TAGGED]->(t)
ON CREATE SET r.created_at = timestamp() / 1000.0
When I run it in the shell like this:
$ cat posts.cql | cypher-shell -u neo4j -p [the password]
the JVM memory usage keeps climbing. Why? Am I not doing something to clear unused data out of memory when it commits?
Cypher-Shell 4.1.12 and Neo4j Driver 4.1.4
12-26-2022 01:52 AM
One thing you can do is move the match on ImageNode to before the unwind. As it stands, you are repeating this same match for each tag element on a row. You only need to match the tag and create the relationship after the unwind.
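As a rough sketch of that restructuring (keeping your original PERIODIC COMMIT form, with the same column positions and property names):
USING PERIODIC COMMIT LOAD CSV FROM 'file:///posts.csv' AS line
// match the image once per row, before expanding the tags
MATCH (i:ImageNode {image_id: line[0]})
UNWIND split(line[1], ' ') AS tag
MATCH (t:TagNode {value: tag})
MERGE (i)-[r:TAGGED]->(t)
ON CREATE SET r.created_at = timestamp() / 1000.0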
Have you monitored the JVM with tools like JConsole or VisualVM to see what is happening?
https://www.rapid7.com/blog/post/2012/12/31/guide-to-monitoring-jvm-memory-usage-draft/
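For example (an illustrative command line only; substitute the actual Java process ID, which you can find with jps), jstat can print heap occupancy and GC counts at a fixed interval:
$ jps -l
$ jstat -gcutil <pid> 2000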
01-04-2023 04:43 AM
Hi @giff-h
Can you try something like:
:auto LOAD CSV FROM 'file:///posts.csv' AS line
CALL {
  WITH line
  UNWIND split(line[1], ' ') AS tag
  MATCH (i:ImageNode {image_id: line[0]}), (t:TagNode {value: tag})
  MERGE (i)-[r:TAGGED]->(t)
  ON CREATE SET r.created_at = timestamp() / 1000.0
} IN TRANSACTIONS OF 10 ROWS
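Purely as a sketch combining this with the earlier suggestion (matching the ImageNode once per row before the UNWIND; the 10-row batch size is just carried over from above):
:auto LOAD CSV FROM 'file:///posts.csv' AS line
CALL {
  WITH line
  // look up the image once per CSV row
  MATCH (i:ImageNode {image_id: line[0]})
  UNWIND split(line[1], ' ') AS tag
  MATCH (t:TagNode {value: tag})
  MERGE (i)-[r:TAGGED]->(t)
  ON CREATE SET r.created_at = timestamp() / 1000.0
} IN TRANSACTIONS OF 10 ROWS
Note that :auto is a client-side prefix understood by cypher-shell, and CALL { … } IN TRANSACTIONS needs a newer Neo4j than the 4.1 series mentioned above, so this assumes an upgrade is possible.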