01-18-2020 12:31 PM
We've started seeing sporadic "stuck" backups using the Java libs:
// https://mvnrepository.com/artifact/org.neo4j/neo4j-backup
implementation "org.neo4j:neo4j-backup:3.4.9"
Targeting a Neo4j Enterprise 3.4.3 instance, we have this snippet:
@Override
public void performBackup(BackupConfig config) {
    Neo4jBackupConfig backupConfig = (Neo4jBackupConfig) config;
    OnlineBackup onlineBackup = getInstance(backupConfig);
    OnlineBackup result = onlineBackup.backup(createBackupDirectory(backupConfig), Neo4jConstants.VERIFY_BACKUP)
            .gatheringForensics(Neo4jConstants.GATHER_FORENSICS)
            .withTimeout(Neo4jConstants.TIMEOUT_MS);
    if (!Optional.ofNullable(result).isPresent()) {
        throw new AssertionError("Backup failed. Please see attached log.");
    }
    if (!result.isConsistent()) {
        throw new AssertionError("Backup is inconsistent. Please see attached log.");
    }
}
public static final Boolean VERIFY_BACKUP = Boolean.TRUE;
public static final Boolean GATHER_FORENSICS = Boolean.TRUE;
// 5 minute timeout
public static final Long TIMEOUT_MS = 300000L;
All we see in the stdout logs is that the consistency check seems to stop, without any corresponding memory or CPU spikes:
2020-01-18 12:41:42.192+0000 INFO [o.n.c.s.StoreCopyClient] Copying neostore
2020-01-18 12:41:42.192+0000 INFO [o.n.c.s.StoreCopyClient] Copied neostore 8.00 kB
2020-01-18 12:41:42.192+0000 INFO [o.n.c.s.StoreCopyClient] Done, copied 711 files
2020-01-18 12:41:52.332+0000 INFO [o.n.k.i.s.f.RecordFormatSelector] Selected RecordFormat:StandardV3_4[v0.A.9] record format
2020-01-18 12:41:52.332+0000 INFO [o.n.k.i.s.f.RecordFormatSelector] Format not configured. Selected format from the store: RecordFormat:StandardV3_4[v0.A.9]
.................... 10%
.................... 20%
.................... 30%
.................... 40%
.................... 50%
.................... 60%
.................... 70%
.................... 80%
.................... 90%
...................Checking node and relationship counts
.................... 10%
.................... 20%
.................... 30%
.................... 40%
.................... 50%
.................... 60%
.................... 70%
.................... 80%
.................... 90%
.................... 100%
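One way to keep a stuck call like this from blocking the job indefinitely is to put a hard deadline around it on the caller's side, independent of whatever the library does with its own timeout. The following is only a minimal sketch using plain java.util.concurrent; the class name BackupWatchdog and the method runWithDeadline are hypothetical and not part of neo4j-backup, and cancelling only helps if the blocked code responds to thread interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the neo4j-backup library.
final class BackupWatchdog {

    /** Runs the task on a worker thread and stops waiting after timeoutMs. */
    static <T> T runWithDeadline(Callable<T> task, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread worker = new Thread(r, "backup-worker");
            worker.setDaemon(true); // a hung worker must not keep the JVM alive
            return worker;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; this only unblocks code that honors interrupts.
            future.cancel(true);
            throw new IllegalStateException("Backup did not finish within " + timeoutMs + " ms", e);
        } catch (ExecutionException e) {
            // The task itself threw; surface its original cause.
            throw new IllegalStateException("Backup failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for backup", e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The existing performBackup(config) call could be wrapped as a Callable that returns null, with a deadline somewhat above TIMEOUT_MS, so the job fails loudly with a timeout message instead of sitting at 100%.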
01-18-2020 10:06 PM
Hello Mike,
How large is your backup?
The consistency check takes some memory. If the backup is large, the check can be run at a later stage or in a different environment.
Can you retry with this:
public static final Boolean VERIFY_BACKUP = Boolean.FALSE;
You can then run the consistency check manually with the built-in consistency checker tool:
https://neo4j.com/docs/operations-manual/3.4/tools/consistency-checker/
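As a rough sketch of running that check as a separate step after the backup, the tool can be launched as its own process with a deadline. The class name OfflineConsistencyCheck, the paths, and the timeout are hypothetical; the sketch also assumes that neo4j-admin check-consistency accepts --backup and --report-dir as described in the 3.4 manual and that a non-zero exit code means the check failed or found inconsistencies.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; paths and timeout are placeholders.
final class OfflineConsistencyCheck {

    /** Builds the neo4j-admin command line for checking a backup directory. */
    static List<String> command(Path neo4jAdmin, Path backupDir, Path reportDir) {
        return Arrays.asList(
                neo4jAdmin.toString(),
                "check-consistency",
                "--backup=" + backupDir,
                "--report-dir=" + reportDir);
    }

    /** Returns true only if the tool exits with code 0 before the deadline. */
    static boolean run(Path neo4jAdmin, Path backupDir, Path reportDir, long timeoutMinutes)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(neo4jAdmin, backupDir, reportDir))
                .inheritIO() // forward the tool's progress output to our stdout/stderr
                .start();
        if (!process.waitFor(timeoutMinutes, TimeUnit.MINUTES)) {
            process.destroyForcibly(); // the check is stuck; kill it and report failure
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

Because the check runs in its own process, it can be given its own memory settings and killed cleanly if it hangs, without affecting the backup itself.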
Kind regards,
J
01-19-2020 01:50 PM
Hi Jéremie,
On average the backup is around 100 MB, and a bit less than that once tarballed. We give each backup job 2 GB of memory and aren't seeing any memory issues or OOM errors. I'll try turning off the consistency checker; we currently run it on every hourly backup to verify the backup's integrity. Looking at the underlying infrastructure, this might be a compute problem, with burstable CPUs getting throttled.
Does the consistency checker utilize a lot of CPU that we're not accounting for?
Thanks,
Mike
01-20-2020 08:36 PM
You are using version 3.4.9, which will be out of support in a couple of weeks.
On large graphs, this can be an issue:
Among many other things, this was addressed and improved in the next major version.
It might be worth having a look.
01-20-2020 08:50 PM
We're investigating the migration path to 4.0. Is that what you mean by the next major version, or is there a smaller hop on the neo4j-backup version train? We run the consistency checker on every backup before tarballing and shipping to S3 as a sanity check.