
Neo4j memory issue

Hi,

We are running a data ingest on Neo4j Enterprise 4.4.5 with a large data set. We have 3 core members and 1 read replica, each running on an AWS r6i.xlarge (32 GB RAM, quad core). The heap size is set to 12 GB for both initial and max, and the page cache size is 12 GB, so 24 GB is allocated to each member. After some time the RAM is fully used, my EKS pods restart, and eventually they become unresponsive.
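For reference, this corresponds roughly to the following neo4j.conf memory settings (everything not listed is left at its default):

# 12 GB heap, fixed so the JVM does not resize it
dbms.memory.heap.initial_size=12g
dbms.memory.heap.max_size=12g
# 12 GB page cache; the remaining ~8 GB of the 32 GB node is left for the OS,
# transaction state and other native memory
dbms.memory.pagecache.size=12g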

Any suggestions?


jo_nathan
Node Clone

Hi, thanks for posting.
Unfortunately, more information is required.
What are you ingesting - CSV?
Take a look at debug.log and neo4j.log.
I found your other post; is this the same issue?

Chances are you are running everything in one big transaction that consumes all your memory. Split the ingestion into smaller transactions.

To do this, have a look at
CALL { } subqueries with IN TRANSACTIONS (which replace USING PERIODIC COMMIT)
or apoc.periodic.commit and apoc.periodic.iterate.
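As a rough sketch (the file name, node label, property names and batch size below are just placeholders), a batched ingest with either approach could look like this:

LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
CALL {
  WITH row
  MERGE (n:Item {id: row.id})
  SET n.name = row.name
} IN TRANSACTIONS OF 10000 ROWS

or, with APOC, where the inner query is committed once per batch:

CALL apoc.periodic.iterate(
  "LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row RETURN row",
  "MERGE (n:Item {id: row.id}) SET n.name = row.name",
  {batchSize: 10000, parallel: false}
)

Note that CALL { } IN TRANSACTIONS has to run in an implicit (auto-commit) transaction, for example with :auto in Neo4j Browser. Lowering batchSize keeps the transaction state per batch, and therefore the heap usage, smaller.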

Hi Jonathan,

We are ingesting JSON files, and we are already using apoc.periodic.iterate. We are hosting Neo4j in an EKS cluster. Some of the exceptions we see in our logs are below; this exception usually appears when a pod is restarting because of the memory issue and the client is unable to connect to the leader. Also, we are giving a 1 hour break between jobs to see if the memory clears up, but it is not happening.

ntd-vendors.svc.cluster.local:5000,None,List(),Some(10000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused)
2022-05-18 08:40:15.013+0000 INFO [c.n.c.m.RaftChannelPoolService] Channel created [[id: 0x154e69d9]]
2022-05-18 08:40:15.013+0000 INFO [c.n.c.p.i.ClientChannelInitializer] Initializing client channel [id: 0x154e69d9]
2022-05-18 08:40:15.014+0000 WARN [c.n.c.m.RaftSender] Failed to acquire channel [Address: discovery-ntd-neo4j-2.ntd-vendors.svc.cluster.local:7000]
java.util.concurrent.ExecutionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: discovery-ntd-neo4j-2.ntd-vendors.svc.cluster.local/172.16.7.148:7000
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) ~[?:?]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) ~[?:?]
at com.neo4j.causalclustering.messaging.RaftSender.waitForPooledDataChannel(RaftSender.java:108) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.RaftSender.tryDataChannel(RaftSender.java:93) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.RaftSender.send(RaftSender.java:45) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.RaftSender.send(RaftSender.java:23) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.RaftOutbound.send(RaftOutbound.java:58) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.RaftOutbound.send(RaftOutbound.java:22) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.Outbound.send(Outbound.java:25) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.LoggingOutbound.send(LoggingOutbound.java:30) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.LoggingOutbound.send(LoggingOutbound.java:12) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.messaging.Outbound.send(Outbound.java:25) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.shipping.RaftLogShipper.sendNewEntries(RaftLogShipper.java:396) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.shipping.RaftLogShipper.onNewEntries(RaftLogShipper.java:232) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.outcome.ShipCommand$NewEntries.applyTo(ShipCommand.java:150) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.shipping.RaftLogShippingManager.handleCommands(RaftLogShippingManager.java:127) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.RaftOutcomeApplier.handleLogShipping(RaftOutcomeApplier.java:136) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.RaftOutcomeApplier.handle(RaftOutcomeApplier.java:68) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.RaftMachine.handle(RaftMachine.java:172) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.state.RaftMessageApplier.handle(RaftMessageApplier.java:61) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.state.RaftMessageApplier.handle(RaftMessageApplier.java:28) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.RaftMessageMonitoringHandler.timeHandle(RaftMessageMonitoringHandler.java:51) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.RaftMessageMonitoringHandler.handle(RaftMessageMonitoringHandler.java:44) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.consensus.RaftMessageMonitoringHandler.handle(RaftMessageMonitoringHandler.java:18) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.core.batching.BatchingMessageHandler.lambda$run$1(BatchingMessageHandler.java:164) ~[neo4j-causal-clustering-4.4.5.jar:4.4.5]
at java.util.Optional.ifPresent(Optional.java:183) [?:?]
at com.neo4j.causalclustering.core.batching.BatchingMessageHandler.run(BatchingMessageHandler.java:161) [neo4j-causal-clustering-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.scheduling.LimitingScheduler$ReschedulingJob.call(LimitingScheduler.java:168) [neo4j-cluster-common-4.4.5.jar:4.4.5]
at com.neo4j.causalclustering.scheduling.LimitingScheduler$ReschedulingJob.call(LimitingScheduler.java:151) [neo4j-cluster-common-4.4.5.j