Neo4j

danjou_philippe · 10-20-2020

After some issues I finally got a 3-node causal cluster (v4.1.3) running, but I can't execute any commands on it.
I always get: Failed to obtain connection towards write server. Known routing table is: RoutingTable[database=neo4j, expirationTime=1603208090805, currentTime=1603207790852, routers=[10.4.0.102:7687,10.4.0.101:7687], readers=[10.4.0.101:7687,10.4.0.102:7687], writers=]

Despite:
|system|10.4.0.101:7687|follower|online|-|-|
|system|10.4.0.102:7687|follower|online|-|-|
|system|10.4.0.100:7687|leader|online|-|-|

All 3 have identical configs except for the IPs, obviously. All 3 are CORE nodes, so I don't understand why it says 0 writers. Shouldn't they all accept writes?
No firewalls, raw LAN nodes. What's weird is that I don't see 7687 in netstat as an active port on any node.
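
For readers trying to interpret an error like this one, the following is a minimal Python sketch (standard library only) that splits the routing table in the message into its server lists and compares them with the cluster members from the status output. The function name parse_routing_table and the MEMBERS list are illustrative, not part of any Neo4j tooling, and the sketch assumes every member serves Bolt on port 7687. The pattern tolerates the missing opening bracket in "writers=]".

```python
import re

# Error text copied from the post above.
ERROR = (
    "Failed to obtain connection towards write server. Known routing table is: "
    "RoutingTable[database=neo4j, expirationTime=1603208090805, "
    "currentTime=1603207790852, routers=[10.4.0.102:7687,10.4.0.101:7687], "
    "readers=[10.4.0.101:7687,10.4.0.102:7687], writers=]"
)

# Cluster members as listed in the status output above (assumed Bolt port 7687).
MEMBERS = ["10.4.0.100:7687", "10.4.0.101:7687", "10.4.0.102:7687"]


def parse_routing_table(message):
    """Pull the server lists and the two timestamps out of the error text."""
    table = {}
    for role in ("routers", "readers", "writers"):
        # The opening bracket is optional so that "writers=]" parses as empty.
        match = re.search(rf"{role}=\[?([^\[\]]*)\]", message)
        addresses = match.group(1) if match else ""
        table[role] = [a.strip() for a in addresses.split(",") if a.strip()]
    for field in ("expirationTime", "currentTime"):
        match = re.search(rf"{field}=(\d+)", message)
        table[field] = int(match.group(1)) if match else None
    return table


table = parse_routing_table(ERROR)
listed = set(table["routers"]) | set(table["readers"]) | set(table["writers"])
missing = [m for m in MEMBERS if m not in listed]

print("writers:", table["writers"])
print("members absent from the table:", missing)
print("ms until the table expires:", table["expirationTime"] - table["currentTime"])
```

Run against the message above, the only member absent from the table is 10.4.0.100, the node reported as leader of the system database.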

david_allen · 10-20-2020

Your routing table contains addresses that begin with a private IP prefix (10.*). Generally, that means they are valid only inside your internal network.

My best guess is that the client (Neo4j Browser or an application) is running outside of that network and is failing to route traffic to your cluster members, because 10.* addresses are not routable on the public internet.

The solution is to adjust the advertised address setting in neo4j.conf for each cluster member, to advertise an externally valid IP address so that your clients can connect.

This is a common problem when Neo4j is deployed in a cloud and external network access has not been configured.

danjou_philippe · 10-20-2020

Hi, this is via VPN, so it is internal access, and I can reach those nodes just fine because they run other services besides Neo4j. This is not an access problem caused by routing.
I'm also confused about why port 7687 is being asked for when java does not expose it at all; only the cluster ports 5000/6000/7000 are exposed. How is this supposed to work? And yes, the config has it enabled. See below for what is set on each node. (By the way, NONE of this is part of your "how to set up a causal cluster" documentation, which seems to be seriously lacking.)

dbms.connector.bolt.enabled=true
#dbms.connector.bolt.tls_level=DISABLED
dbms.connector.bolt.listen_address=10.4.0.xxx:7687
#dbms.connector.bolt.advertised_address=:7687

But I only see 7687 exposed on the first node:

tcp6       0      0 10.4.0.100:6000         :::*                    LISTEN      332453/java         
tcp6       0      0 10.4.0.100:7474         :::*                    LISTEN      332453/java         
tcp6       0      0 10.4.0.100:7000         :::*                    LISTEN      332453/java         
tcp6       0      0 :::6362                 :::*                    LISTEN      332453/java         
tcp6       0      0 10.4.0.100:7687         :::*                    LISTEN      332453/java         
tcp6       0      0 10.4.0.100:5000         :::*                    LISTEN      332453/java

On the other 2 nodes I only see this:

tcp6       0      0 10.4.0.102:6000         :::*                    LISTEN      306476/java         
tcp6       0      0 10.4.0.102:7000         :::*                    LISTEN      306476/java         
tcp6       0      0 :::6362                 :::*                    LISTEN      306476/java         
tcp6       0      0 10.4.0.102:5000         :::*                    LISTEN      306476/java
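
The netstat output shows what each node is listening on locally. To confirm the same picture from the client side of the VPN, a small TCP probe can help. This is only a sketch using Python's standard library: the helper name bolt_port_open is made up for illustration, and a successful TCP connect shows that something accepts connections on the port, not that Bolt itself is healthy.

```python
import socket


def bolt_port_open(host, port=7687, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Node addresses taken from the posts above.
    for host in ("10.4.0.100", "10.4.0.101", "10.4.0.102"):
        state = "open" if bolt_port_open(host) else "closed or unreachable"
        print(f"{host}:7687 is {state}")
```

If the netstat output above is accurate, only 10.4.0.100 would be expected to report an open port.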

danjou_philippe · 10-25-2020

Solved by running neo4j-admin unbind on all nodes again.

Failed to obtain connection towards write server but only CORE nodes