Problem with single instance, online backup

cesar

Hi everyone! I was wondering if you could help me with an issue I'm having with an online backup on a single instance. Our database is 1.3 TB, with 2+ billion nodes and 3+ billion relationships, so it's not a small db, but it's running on an equally monstrous machine. I started the backup yesterday. After ~2h it stopped writing to the backup files (which are ~1.3 TB), but the process kept running and using system resources, and the log file was still being written with a sort of progress bar. I left it running and called it a day, expecting to find the backup finished the next morning. However, this morning (~19 hours after it started) the backup process is still running and using almost all the CPU of the monster machine, but the backup files haven't been touched, and the log file was last written ~9h after the process started. This is the last entry in the log:

...
2019-05-16 16:24:46.543+0000 INFO [o.n.b.i.BackupOutputMonitor] Finish receiving transactions at 9486753
2019-05-16 16:24:46.613+0000 INFO [o.n.b.i.BackupOutputMonitor] Start recovering store
2019-05-16 16:27:28.306+0000 INFO [o.n.b.i.BackupOutputMonitor] Finish recovering store
....................  10%
....................  20%
....................  30%
....................  40%
....................  50%
...................

For reference, this is a view of the system resource usage this morning. Yesterday, when the process was "actually" running, it wasn't consuming nearly as many resources. This server only has Neo4j running, and there are currently no other big queries or processes running.

Is this expected? Is there something wrong with my backup? Should I just kill it and restart it?

1 REPLY

If you run an online backup with default settings, a consistency check runs after the backup itself has finished. That is most likely what the progress bar in your log shows. On a database this large, the check can take a very long time. I suggest aborting the currently running backup, pruning the target folder, and running it again with --check-consistency=false.
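As a rough sketch of what that rerun could look like: the backup directory and backup name below are made-up placeholders (the original post doesn't give them), so substitute your own. The small helper only prints the command so you can check it before running anything.

```shell
# Hypothetical values, not taken from the original post; adjust to your setup.
BACKUP_DIR=/mnt/backups
BACKUP_NAME=graph.db-backup

# Print the backup command with the consistency check disabled.
# This is a dry run: nothing is executed against the database.
backup_cmd() {
  echo neo4j-admin backup \
    --backup-dir="$BACKUP_DIR" \
    --name="$BACKUP_NAME" \
    --check-consistency=false
}

# Before rerunning, stop the old backup process and clear out the partial
# backup under "$BACKUP_DIR/$BACKUP_NAME" (the "prune the target folder" step).
backup_cmd
```

Once the printed command looks right, run it as-is. If you still want a consistency check, it can be run later against the finished backup (in the 3.x releases, `neo4j-admin check-consistency --backup=...` should do this), ideally at a time when the load on the machine doesn't matter.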