Neo4j not able to form causal cluster - attempting to connect

For the last few hours, I have been trying to get our causal cluster up and running. The core nodes aren't starting; they are stuck at the following line:

Attempting to connect to the other cluster members before continuing...

These are our settings in the config file -

causal_clustering.cluster_allow_reads_on_followers=false
causal_clustering.minimum_core_cluster_size_at_formation=3
causal_clustering.discovery_type=LIST
causal_clustering.initial_discovery_members=172.x.x.x:5000,172.x.x.x:5000,172.x.x.x:5000
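
As a sanity check on the member list above, here is a minimal Python sketch that tries a plain TCP connection to each `host:port` entry. The helper names (`parse_members`, `is_reachable`, `check_members`) are made up for illustration and are not part of Neo4j. It only shows that a port accepts connections from the machine where it runs, not that the cluster protocol works. Discovery is only one of the ports the cores use: as far as I know, the transaction and Raft listeners default to 6000 and 7000 in 3.x unless overridden, so those are worth checking the same way.

```python
import socket


def parse_members(value):
    """Split a comma-separated host:port list into (host, port) tuples."""
    members = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # rpartition splits on the last colon, so the port is always the tail.
        host, _, port = entry.rpartition(":")
        members.append((host, int(port)))
    return members


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def check_members(value, timeout=3.0):
    """Map each 'host:port' entry to whether it accepted a TCP connection."""
    return {
        f"{host}:{port}": is_reachable(host, port, timeout)
        for host, port in parse_members(value)
    }
```

Run it on each core in turn, passing the real value of `causal_clustering.initial_discovery_members`, for example `print(check_members("<ip1>:5000,<ip2>:5000,<ip3>:5000"))`. A `False` for any peer points at a firewall, security group, or listen-address problem rather than at cluster state.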

The debug logs show that all the nodes are communicating with each other, and a master has been elected. However, I see the following log entry on both non-master nodes, which I think could be causing the problem:

2019-07-16 20:29:29.112+0000 DEBUG [com.hazelcast.internal.partition.InternalPartitionService] [172.x.x.x]:5000 [dev] [3.7.5] Master version should be greater than ours! Local version: 1129, Master version: 1129 Master: [172.x.x.x]:5000

We are running Neo4j 3.4.0. I can provide any other missing information if necessary.

Is there any way for me to work out what exactly the problem is here and how to get the core servers to start? Please help.

2 Replies

Provided that the data is the same on all nodes of the cluster, you may need to shut down all nodes and run neo4j-admin unbind on each of them, then start up the nodes. More than likely old cluster state is causing some issues.
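
The order of operations matters here: every core should be stopped and unbound before any of them is started again. A rough Python sketch of that sequence follows. It assumes a tarball-style install where `bin/neo4j` and `bin/neo4j-admin` live under a home directory you pass in; package installs usually put these commands on the PATH instead. The function names are hypothetical, and `dry_run=True` only prints the commands rather than executing them.

```python
import subprocess
from pathlib import Path


def offline_steps(neo4j_home):
    """Commands to run on EVERY core before any core is restarted."""
    bin_dir = Path(neo4j_home) / "bin"
    return [
        [str(bin_dir / "neo4j"), "stop"],
        # unbind removes the stored cluster state; the server must be stopped.
        [str(bin_dir / "neo4j-admin"), "unbind"],
    ]


def start_step(neo4j_home):
    """Command to run on each core once ALL cores have been unbound."""
    return [[str(Path(neo4j_home) / "bin" / "neo4j"), "start"]]


def run(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    shown = []
    for argv in steps:
        line = " ".join(argv)
        print(line)
        shown.append(line)
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)
    return shown
```

On each core, call `run(offline_steps("/path/to/neo4j"), dry_run=False)` first; only after that has finished on all cores, call `run(start_step("/path/to/neo4j"), dry_run=False)` on each of them.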

Also, try to get off of 3.4.0 when possible, as the .0 releases tend to include more bugs than usual, coming along with the first wave of new features for the new minor version. The latest patch on the 3.4.x line is 3.4.14, so you might want to upgrade to that.

I stopped all 3 nodes, ran the unbind command on each, and started the nodes again, but that did not help. Is there any way to check whether the data is the same on all nodes to begin with? Any other pointers on how to debug this issue further would also be appreciated. Thank you.