10-31-2020 10:03 AM
I am using the Helm chart to create a cluster with the values below:
acceptLicenseAgreement: "yes"
neo4jPassword: "mysecret"
core:
  standalone: false
  numberOfServers: 2
  persistentVolume:
    ## whether or not persistence is enabled
    ##
    enabled: true
    ## core server data Persistent Volume mount root path
    ##
    mountPath: /data
    ## core server data Persistent Volume size
    ##
    size: 250Mi
  discoveryService:
    type: ClusterIP
    annotations: {}
    labels: {}
    loadBalancerSourceRanges:
    # Controls how many services get created. Usually want to over-provision so cores can
    # scale up for things like rolling upgrades.
    instances: [0, 1]
    standaloneOnly: [0]
readReplica:
  numberOfServers: 0
I can see that the Kubernetes services service/discovery-neo4j-neo4j-0 and service/discovery-neo4j-neo4j-1 are created and expose the ports 5000/TCP, 6000/TCP, 7000/TCP and 3637/TCP.
The pods pod/neo4j-neo4j-core-0 and pod/neo4j-neo4j-core-1 do not start up; they keep waiting with the message:
2020-10-31 16:43:19.091+0000 INFO Database 'system' is waiting for a total of 3 core members...
I checked the neo4j.conf file of the pod; the relevant settings are pasted below:
causal_clustering.transaction_advertised_address=discovery-neo4j-neo4j-0.dev-namespace.svc.cluster.local:6000
causal_clustering.raft_advertised_address=discovery-neo4j-neo4j-0.dev-namespace.svc.cluster.local:7000
causal_clustering.minimum_core_cluster_size_at_runtime=2
causal_clustering.minimum_core_cluster_size_at_formation=3
causal_clustering.kubernetes.service_port_name=tcp-discovery
causal_clustering.kubernetes.label_selector=neo4j.com/cluster=neo4j-neo4j,neo4j.com/role=CORE,neo4j.com/coreindex in (0, 1, 2)
causal_clustering.discovery_type=K8S
causal_clustering.discovery_advertised_address=discovery-neo4j-neo4j-0.dev-namespace.svc.cluster.local:5000
Any idea why the pods are not starting and cannot resolve the service?
10-31-2020 10:20 AM
In the log, I can see that the service is not reachable, although it is running and listening on the port:
2020-10-31 15:37:31.442+0000 WARN [a.s.Materializer] [outbound connection to [akka://cc-discovery-actor-system@discovery-neo4j-neo4j-2.dev-namespace.svc.cluster.local:5000], message stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(discovery-neo4j-neo4j-2.dev-namespace.svc.cluster.local:5000,None,List(),Some(10000 milliseconds),true)] failed because of java.net.UnknownHostException: discovery-neo4j-neo4j-2.dev-namespace.svc.cluster.local
10-31-2020 10:27 AM
The service details are:
kubectl describe service/discovery-neo4j-neo4j-2
Name:              discovery-neo4j-neo4j-2
Namespace:         dev-namespace
Labels:            app.kubernetes.io/component=core
                   app.kubernetes.io/instance=neo4j
                   app.kubernetes.io/managed-by=Helm
                   app.kubernetes.io/name=neo4j
                   helm.sh/chart=neo4j-4.1.3-1
                   neo4j.com/bolt=false
                   neo4j.com/cluster=neo4j-neo4j
                   neo4j.com/coreindex=2
                   neo4j.com/http=false
                   neo4j.com/role=CORE
Annotations:       meta.helm.sh/release-name: neo4j
                   meta.helm.sh/release-namespace: dev-namespace
Selector:          statefulset.kubernetes.io/pod-name=neo4j-neo4j-core-2
Type:              ClusterIP
IP:                None
Port:              tcp-discovery  5000/TCP
TargetPort:        5000/TCP
Endpoints:         <none>
Port:              tcp-transaction  6000/TCP
TargetPort:        6000/TCP
Endpoints:         <none>
Port:              tcp-raft  7000/TCP
TargetPort:        7000/TCP
Endpoints:         <none>
Port:              tcp-jmx  3637/TCP
TargetPort:        3637/TCP
Endpoints:         <none>
Session Affinity:  None
Events:            <none>
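One possible reading of this output: the Selector targets the pod neo4j-neo4j-core-2, but a StatefulSet with numberOfServers: 2 only creates the ordinals 0 and 1. The headless service therefore has no endpoints and its DNS name does not resolve, which would be consistent with the UnknownHostException in the log. Below is a minimal Python sketch of that reasoning, assuming the naming pattern seen in this thread. The function name and its parameters are hypothetical and not part of the chart.

```python
def services_without_pods(service_indices, number_of_servers, release="neo4j-neo4j"):
    """Return the discovery services whose selected core pod would not exist.

    Assumes the naming seen in this thread: service discovery-<release>-<i>
    selects pod <release>-core-<i>, and a StatefulSet with N replicas
    creates pods with ordinals 0..N-1.
    """
    existing_pods = {f"{release}-core-{i}" for i in range(number_of_servers)}
    orphaned = []
    for i in service_indices:
        if f"{release}-core-{i}" not in existing_pods:
            orphaned.append(f"discovery-{release}-{i}")
    return orphaned


# Three discovery services but only two core pods: service 2 has no backing pod.
print(services_without_pods([0, 1, 2], number_of_servers=2))
# With three core pods every discovery service has a pod behind it.
print(services_without_pods([0, 1, 2], number_of_servers=3))
```

Any service this reports would be expected to show Endpoints: <none>, as in the output above.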
10-31-2020 07:59 PM
Isn't that the same single pod?
It's tough to diagnose this, but by chance have you tried deploying multiple times? E.g. why is your service called discovery-neo4j-neo4j-2? Was there a discovery-neo4j-neo4j-0 and/or discovery-neo4j-neo4j-1?
Separately, I would increase your number of servers to 3, to give you at least some fault tolerance.
11-01-2020 07:12 AM
That was a typo: I had listed pod/neo4j-neo4j-core-0 twice.
I also increased numberOfServers to 3 and it all started working. Thank you for the suggestion. I marked the reply as the solution.
06-09-2021 06:46 AM
I think the cause is that one of the values in common-configmap.yaml is NEO4J_causal__clustering_minimum__core__cluster__size__at__formation: "3".
If you want to keep the number of servers at 2, e.g. in a development environment, try this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: neo4j-cm
  namespace: neo4j
data:
  NEO4J_causal__clustering_minimum__core__cluster__size__at__formation: "2"
and
values:
  envFrom:
    - configMapRef:
        name: neo4j-cm
  core:
    numberOfServers: 2
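To illustrate why this override helps, here is a rough Python sketch that maps the environment variable name to its neo4j.conf setting and compares the formation size with the number of core servers. It assumes the Neo4j Docker image naming convention (NEO4J_ prefix, a double underscore for "_", a single underscore for ".") and the documented default of 3 for the formation size. The function names are hypothetical and only serve the example.

```python
FORMATION_SETTING = "causal_clustering.minimum_core_cluster_size_at_formation"


def env_to_setting(env_name):
    """Map a NEO4J_* variable name to its neo4j.conf setting name.

    Drops the NEO4J_ prefix, then treats a double underscore as "_"
    and a single underscore as ".".
    """
    body = env_name[len("NEO4J_"):]
    return body.replace("__", "\0").replace("_", ".").replace("\0", "_")


def can_form(configmap_data, number_of_servers, default_required=3):
    """Return (ok, required): whether the core count meets the formation size."""
    settings = {
        env_to_setting(key): value
        for key, value in configmap_data.items()
        if key.startswith("NEO4J_")
    }
    required = int(settings.get(FORMATION_SETTING, default_required))
    return number_of_servers >= required, required


KEY = "NEO4J_causal__clustering_minimum__core__cluster__size__at__formation"

print(can_form({KEY: "3"}, number_of_servers=2))  # the situation in the question
print(can_form({KEY: "2"}, number_of_servers=2))  # with the ConfigMap override
print(can_form({KEY: "3"}, number_of_servers=3))  # with numberOfServers raised to 3
```

With a formation size of 3 and only two cores the check fails, which matches the "waiting for a total of 3 core members" message. Either lowering the formation size to 2 or raising numberOfServers to 3 satisfies it.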