
Standalone install from helm chart hangs/errors

Since we're having some problems with Neo4j running in causal cluster mode (as per existing GitHub issues and Slack conversations), we thought running a single-instance version of Neo4j instead might resolve the issue.

Installation is onto an AWS EKS 1.14.7 cluster (by the way, a causal cluster built from the same chart and values starts up fine).

Installation done using:

helm install --name neo4j-single neo4j-vendor/neo4j --wait --set core.standalone=true --set acceptLicenseAgreement=yes --version 4.1.0-3
Error: release neo4j-single failed: timed out waiting for the condition
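
The timeout itself is presumably just Helm 2's default --wait period of 300 seconds. Re-running with a longer timeout and --debug at least rules out a slow-but-eventually-successful startup (the 900-second value below is an arbitrary example):

helm install --name neo4j-single neo4j-vendor/neo4j --wait --timeout 900 --debug --set core.standalone=true --set acceptLicenseAgreement=yes --version 4.1.0-3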

The problem appears to be the pod startup; it gets to this stage and then goes no further:

$ kubectl logs -f neo4j-single-neo4j-core-0
Configuration override prefix = neo4j_single_neo4j_core_0
Starting Neo4j CORE 0 on neo4j-single-neo4j-core-0.neo4j-single-neo4j.default.svc.cluster.local
Warning: Folder mounted to "/data" is not writable from inside container. Changing folder owner to neo4j.
Changed password for user 'neo4j'.
Fetching versions.json for Plugin 'apoc' from https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json
Installing Plugin 'apoc' from https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/4.1.0.1/apoc-4.1.0.1-all.jar to /plugins/apoc.jar
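
The last line suggests the container is stuck downloading APOC from GitHub. A quick way to check whether outbound access from the cluster is the culprit (just a sketch; the pod name is arbitrary and any image with curl would do):

$ kubectl run netcheck --rm -it --restart=Never --image=curlimages/curl --command -- curl -sIL https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/4.1.0.1/apoc-4.1.0.1-all.jar

Anything other than an eventual 200 response would point at a network policy or egress problem rather than the chart itself.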

The second interesting observation is the discovery services created. Looking at the chart status, there should be a single discovery service for a standalone instance, yet five are created. If I look at the 3-node causal cluster deployment there are only 3 discovery services, and 3 is the default value for .Values.core.numberOfServers, so where the heck is it getting 5 from?

$ helm status neo4j-single
LAST DEPLOYED: Mon Jul 20 10:11:45 2020
NAMESPACE: default
STATUS: PENDING_INSTALL
RESOURCES:
==> v1/ConfigMap
NAME                              DATA  AGE
neo4j-single-init-script          1     3m32s
neo4j-single-neo4j-common-config  22    3m32s
neo4j-single-neo4j-core-config    1     3m32s
neo4j-single-test-script          1     3m32s
==> v1/Pod(related)
NAME                       READY  STATUS   RESTARTS  AGE
neo4j-single-neo4j-core-0  0/1    Running  0         3m32s
==> v1/Role
NAME                               AGE
neo4j-single-neo4j-service-reader  3m32s
==> v1/RoleBinding
NAME                                             AGE
neo4j-single-neo4j-sa-to-service-reader-binding  3m32s
==> v1/Secret
NAME                        TYPE    DATA  AGE
neo4j-single-neo4j-secrets  Opaque  1     3m32s
==> v1/Service
NAME                            TYPE       CLUSTER-IP  EXTERNAL-IP  PORT(S)                              AGE
discovery-neo4j-single-neo4j-0  ClusterIP  None        <none>       5000/TCP,6000/TCP,7000/TCP,3637/TCP  3m32s
discovery-neo4j-single-neo4j-1  ClusterIP  None        <none>       5000/TCP,6000/TCP,7000/TCP,3637/TCP  3m32s
discovery-neo4j-single-neo4j-2  ClusterIP  None        <none>       5000/TCP,6000/TCP,7000/TCP,3637/TCP  3m32s
discovery-neo4j-single-neo4j-3  ClusterIP  None        <none>       5000/TCP,6000/TCP,7000/TCP,3637/TCP  3m32s
discovery-neo4j-single-neo4j-4  ClusterIP  None        <none>       5000/TCP,6000/TCP,7000/TCP,3637/TCP  3m32s
neo4j-single-neo4j              ClusterIP  None        <none>       7474/TCP,7687/TCP,7473/TCP,6362/TCP  3m32s
==> v1/ServiceAccount
NAME                   SECRETS  AGE
neo4j-single-neo4j-sa  1        3m32s
==> v1/StatefulSet
NAME                     READY  AGE
neo4j-single-neo4j-core  0/1    3m32s
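
To see where the five discovery services come from, one option is to render the chart locally with the same values and inspect what it generates (untested sketch; the directory name depends on how the chart unpacks):

$ helm fetch neo4j-vendor/neo4j --version 4.1.0-3 --untar
$ helm template ./neo4j --name neo4j-single --set core.standalone=true --set acceptLicenseAgreement=yes | grep 'name: discovery-'

That should show whether the standalone path still stamps out discovery services from some other default.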

Chart version:

NAME          REVISION  UPDATED                   STATUS  CHART          APP VERSION  NAMESPACE
neo4j-single  1         Mon Jul 20 10:11:45 2020  FAILED  neo4j-4.1.0-3  4.1.0        default

We get the exact same behaviour with chart version 4.1.0-2.

1 REPLY

Something really odd is happening here: we ended up scaling up the Kubernetes cluster, and after that the install worked. Some more useful logging would help, because nothing obvious was wrong; the pod got scheduled, there were sufficient resources available, and there were no conflicting NodePorts. Very odd.
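
For anyone hitting the same thing, these are the standard places we'd look next time (nothing chart-specific; kubectl top assumes metrics-server is installed):

$ kubectl describe pod neo4j-single-neo4j-core-0    # events and readiness-probe failures for the stuck pod
$ kubectl get events --sort-by=.lastTimestamp       # recent scheduling, volume and image-pull events
$ kubectl top nodes                                 # rough view of node CPU/memory headroom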