12-29-2021 08:47 AM
Guys, I'm having a performance issue with Neo4j/GDS. Whenever I try to train GraphSAGE on GDS, I get
Failed to invoke procedure gds.beta.graphSage.train: Caused by: java.lang.OutOfMemoryError: Java heap space
I'm using neo4j 4.3.7 and already have altered my neo4j.conf in order to set specific heap sizes:
# Java Heap Size: by default the Java heap size is dynamically calculated based
# on available system resources. Uncomment these lines to set specific initial
# and maximum heap size.
dbms.memory.heap.initial_size=4g
dbms.memory.heap.max_size=16g
When running gds.beta.graphSage.train.estimate, I get that the requiredMemory is "[113 MiB ... 4205 MiB]". I'm running Neo4j Community Edition on an Ubuntu server with 64GB of RAM and am totally out of ideas on what to try next. Can someone please help?
Here's the estimate command I'm running:
CALL gds.beta.graphSage.train.estimate(
'AmazonProducts',
{
modelName: 'AmazonSAGE',
featureProperties: ['embedded_str'],
aggregator: 'mean',
activationFunction: 'relu',
embeddingDimension: 128,
sampleSizes: [50, 25, 12],
batchSize: 500,
concurrency: 4
}
)
And the output is shown in the image below:
Any help is very much appreciated. Thanks
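As a quick sanity check on the numbers above, the upper bound of the estimate can be compared against the configured maximum heap. The sketch below is only illustrative: the helper names parse_mib_range and heap_to_mib are hypothetical, and it assumes the g/m suffixes in neo4j.conf denote binary units (GiB/MiB).

```python
import re

def parse_mib_range(text):
    """Parse a string like "[113 MiB ... 4205 MiB]" into (low, high) in MiB."""
    values = re.findall(r"(\d+)\s*MiB", text)
    if len(values) != 2:
        raise ValueError(f"unexpected range format: {text!r}")
    return int(values[0]), int(values[1])

def heap_to_mib(setting):
    """Convert a neo4j.conf-style size such as "16g" or "512m" to MiB.

    Assumption: "g" means GiB and "m" means MiB.
    """
    match = re.fullmatch(r"(\d+)([gm])", setting.strip().lower())
    if match is None:
        raise ValueError(f"unsupported heap setting: {setting!r}")
    number, unit = int(match.group(1)), match.group(2)
    return number * 1024 if unit == "g" else number

# Values copied from the post above.
low, high = parse_mib_range("[113 MiB ... 4205 MiB]")
max_heap = heap_to_mib("16g")

print(f"estimate upper bound: {high} MiB, configured max heap: {max_heap} MiB")
print("fits according to the estimate:", high <= max_heap)
```

By this comparison the upper estimate (4205 MiB) is well under the configured 16g, which is why the OutOfMemoryError is surprising.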
12-29-2021 09:58 AM
Can you try running gds.graph.list() to see whether you've got memory tied up doing other things? If you're running GDS EE, you can also try gds.alpha.systemMonitor to give you an admin-level view of all running procedures, available/used memory, etc.
Running gds.debug.sysInfo will also give you more information about your system (the versions you're running, etc.), and sharing that information will probably be helpful.
12-29-2021 10:15 AM
Okay, here are the outputs of the commands you asked me to run:
gds.graph.list():
gds.debug.sysInfo
[
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"gdsVersion",
"1.8.1"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"gdsEdition",
"Unlicensed"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"neo4jVersion",
"4.3.7"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"minimumRequiredJavaVersion",
"11"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featurePreAggregation",
false
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featureSkipOrphanNodes",
false
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featureMaxArrayLengthShift",
{
"low": 28,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featureKernelTracker",
false
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featurePropertyValueIndex",
false
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featureParallelPropertyValueIndex",
false
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featureBitIdMap",
true
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featureUncompressedAdjacencyList",
false
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"featureReorderedAdjacencyList",
false
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"buildDate",
"2021-12-20_10:47:23"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"buildJdk",
"11.0.13+8 (Eclipse Adoptium)"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"buildJavaVersion",
"11.0.13"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"buildHash",
"84bbf5b2bf453d86d1eb649961d8eddbd20cd48e"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"availableCPUs",
{
"low": 12,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"physicalCPUs",
{
"low": 12,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"availableHeapInBytes",
{
"low": -201326592,
"high": 1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"availableHeap",
"8000 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"heapFreeInBytes",
{
"low": -1219802136,
"high": 1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"heapFree",
"7028 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"heapTotalInBytes",
{
"low": -201326592,
"high": 1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"heapTotal",
"8000 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"heapMaxInBytes",
{
"low": -201326592,
"high": 1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"heapMax",
"8000 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"offHeapUsedInBytes",
{
"low": 388535224,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"offHeapUsed",
"370 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"offHeapTotalInBytes",
{
"low": 405352448,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"offHeapTotal",
"386 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapNonNmethodsUsedInBytes",
{
"low": 2165120,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapNonNmethodsUsed",
"2114 KiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapNonNmethodsTotalInBytes",
{
"low": 2555904,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapNonNmethodsTotal",
"2496 KiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolMetaspaceUsedInBytes",
{
"low": 295242360,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolMetaspaceUsed",
"281 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolMetaspaceTotalInBytes",
{
"low": 306393088,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolMetaspaceTotal",
"292 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapProfiledNmethodsUsedInBytes",
{
"low": 32892416,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapProfiledNmethodsUsed",
"31 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapProfiledNmethodsTotalInBytes",
{
"low": 33161216,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapProfiledNmethodsTotal",
"31 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCompressedClassSpaceUsedInBytes",
{
"low": 43873464,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCompressedClassSpaceUsed",
"41 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCompressedClassSpaceTotalInBytes",
{
"low": 48758784,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCompressedClassSpaceTotal",
"46 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1EdenSpaceFreeInBytes",
{
"low": 85983232,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1EdenSpaceFree",
"82 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1EdenSpaceTotalInBytes",
{
"low": 440401920,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1EdenSpaceTotal",
"420 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1EdenSpaceMaxInBytes",
{
"low": -1,
"high": -1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1EdenSpaceMax",
"N/A"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1OldGenFreeInBytes",
{
"low": -1305785368,
"high": 1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1OldGenFree",
"6946 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1OldGenTotalInBytes",
{
"low": -641728512,
"high": 1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1OldGenTotal",
"7580 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1OldGenMaxInBytes",
{
"low": -201326592,
"high": 1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1OldGenMax",
"8000 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1SurvivorSpaceFreeInBytes",
{
"low": 0,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1SurvivorSpaceFree",
"0 Bytes"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1SurvivorSpaceTotalInBytes",
{
"low": 0,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1SurvivorSpaceTotal",
"0 Bytes"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1SurvivorSpaceMaxInBytes",
{
"low": -1,
"high": -1
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolG1SurvivorSpaceMax",
"N/A"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapNonProfiledNmethodsUsedInBytes",
{
"low": 14422528,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapNonProfiledNmethodsUsed",
"14084 KiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapNonProfiledNmethodsTotalInBytes",
{
"low": 14483456,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"poolCodeheapNonProfiledNmethodsTotal",
"14144 KiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"freePhysicalMemoryInBytes",
{
"low": -1103736832,
"high": 3
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"freePhysicalMemory",
"15331 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"committedVirtualMemoryInBytes",
{
"low": 984330240,
"high": 4
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"committedVirtualMemory",
"16 GiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"totalPhysicalMemoryInBytes",
{
"low": -1333235712,
"high": 15
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"totalPhysicalMemory",
"62 GiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"freeSwapSpaceInBytes",
{
"low": 1259290624,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"freeSwapSpace",
"1200 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"totalSwapSpaceInBytes",
{
"low": 2147479552,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"totalSwapSpace",
"2047 MiB"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"openFileDescriptors",
{
"low": 353,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"maxFileDescriptors",
{
"low": 1048576,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"vmName",
"OpenJDK 64-Bit Server VM"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"vmVersion",
"11.0.11+9-Ubuntu-0ubuntu2.18.04"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"vmCompiler",
"HotSpot 64-Bit Tiered Compilers"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"containerized",
true
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"dbms.security.procedures.unrestricted",
"gds.*"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"dbms.memory.pagecache.size",
null
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"dbms.tx_state.memory_allocation",
"ON_HEAP"
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"dbms.memory.off_heap.max_size",
{
"low": -2147483648,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"dbms.memory.transaction.global_max_size",
{
"low": 0,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
},
{
"keys": [
"key",
"value"
],
"length": 2,
"_fields": [
"dbms.memory.transaction.max_size",
{
"low": 0,
"high": 0
}
],
"_fieldLookup": {
"key": 0,
"value": 1
}
}
]
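The {"low": ..., "high": ...} objects in the output above appear to be 64-bit integers split into two signed 32-bit halves, which is how the Neo4j JavaScript driver (and so Neo4j Browser) typically serializes large integers. A minimal sketch for turning them back into plain numbers follows; the function names js_integer_to_int and to_mib are made up for illustration, and the split-integer interpretation is an assumption based on the shape of the output.

```python
def js_integer_to_int(low, high):
    """Combine two signed 32-bit halves into one signed 64-bit integer."""
    value = ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)
    # Reinterpret as signed 64-bit so that {low: -1, high: -1} becomes -1.
    if value >= 1 << 63:
        value -= 1 << 64
    return value

def to_mib(num_bytes):
    """Whole mebibytes, rounded down."""
    return num_bytes // (1024 * 1024)

# (low, high) pairs copied from the sysInfo output above.
heap_max = js_integer_to_int(-201326592, 1)      # heapMaxInBytes
heap_free = js_integer_to_int(-1219802136, 1)    # heapFreeInBytes
eden_max = js_integer_to_int(-1, -1)             # poolG1EdenSpaceMaxInBytes

print("heapMax :", to_mib(heap_max), "MiB")
print("heapFree:", to_mib(heap_free), "MiB")
print("edenMax :", eden_max, "(reported as N/A)")
```

The decoded values agree with the human-readable entries in the same output (heapMax 8000 MiB, heapFree 7028 MiB). Note that 8000 MiB is lower than the 16g set in neo4j.conf; since sysInfo also reports containerized: true, it may be worth checking whether the running instance actually picked up that file, though that is only a guess from the output.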
12-29-2021 10:29 AM
Skimming this, it looks like you've got ~7.5G in free heap (heapFreeInBytes). The simplest thing to do is try it again with more heap (and I'd set initial_size and max_size to the same value, probably trying something closer to 32G+ instead of the 16 you have).
At the end of the day, though, GraphSAGE is a memory-intensive and fairly slow algorithm - and your graph is pretty densely connected (3.5M nodes for 150k relationships), so once you get over the memory requirements, it will still take a while to compute.
I might try FastRP (Fast Random Projection, in the Neo4j Graph Data Science docs), which can support node properties and weights, just like GraphSAGE, but is much faster to compute.
12-29-2021 10:36 AM
Okay, I will try adding extra memory, but isn't it weird that GDS estimates that 4205 MiB will be required when in fact much more memory is needed?
Also, my graph actually has ~150k nodes and ~3.5M relationships, not the other way round, as you suggested.
Anyway, thanks for your answer. I will keep this thread up to date once I make some progress.
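For a rough feel of why GraphSAGE training can be memory hungry with these settings, one can count the worst-case number of sampled neighbours per batch. The sketch below is back-of-envelope only: it is not how GDS computes its estimate and may not reflect how GDS allocates memory. It assumes no de-duplication of sampled nodes, one 128-dimensional vector of 8-byte floats per sampled slot, and a hypothetical helper name worst_case_sampled_nodes.

```python
def worst_case_sampled_nodes(batch_size, sample_sizes):
    """Upper bound on node slots touched when sampling neighbours layer by layer.

    Each seed node expands to s1 neighbours, each of those to s2, and so on,
    with no de-duplication (an assumption made for this illustration).
    """
    per_seed = 1   # the seed node itself
    layer = 1
    for size in sample_sizes:
        layer *= size
        per_seed += layer
    return batch_size * per_seed

# Parameters copied from the estimate call earlier in the thread.
slots = worst_case_sampled_nodes(batch_size=500, sample_sizes=[50, 25, 12])

# Assumption: one 128-dimensional vector of 8-byte floats per slot.
bytes_needed = slots * 128 * 8

print("worst-case sampled slots per batch:", slots)
print("approx. MiB for one vector per slot:", bytes_needed // (1024 * 1024))
```

Under those assumptions a single batch could touch about 8.15 million slots, far more than the ~150k nodes in the graph, so smaller sampleSizes or batchSize values might be worth trying before giving up. Whether that actually avoids the error here is untested.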
12-29-2021 12:06 PM
So, yeah, I've tried both increasing the amount of RAM to 32GB and deleting about half of the nodes in my graph, and it still would not work. Even though gds.beta.graphSage.train.estimate said it would only take a couple of gigs of RAM, that wasn't the case at all in my application. I am now giving up on GDS GraphSAGE and looking for alternative solutions such as using DGL to implement GraphSAGE myself.
Thanks @alicia.frame1 for your support.
12-29-2021 12:54 PM
I'm sorry it didn't work out for you - I'd recommend trying out FastRP, which is a different, but much more scalable embedding technique.
For memory estimation, please keep in mind that (1) they're estimates, and it's hard to get an accurate estimation without actually running on your graph, and (2) GraphSAGE is in the beta tier, so there are further optimizations on the roadmap.
Another factor to keep in mind is that it looks like you're using the community edition of the library. Enterprise edition unlocks improved compression techniques (so the in-memory graph uses less memory) as well as unlimited parallelization - so algorithms finish more quickly.
Best of luck!