Casual-Cluster with Signed certs - cluster driver not fully operational

uday

I have a core-only cluster with 3 neo4j servers that are part of a sub-domain.

Core1 - Has 2 Network Interfaces - Public IP is the listener address
Core2 - Has 2 Network Interfaces - Public IP is the listener address
Core3 - Has 2 Network Interfaces - Public IP is the listener address

DNS

  1. <subdomain>.mydomain.io has three A records that point to the IP addresses of Core1, Core2, and Core3. The same name is used for RAFT as well. I see a lot of success messages in the logs, as well as query activity in query.log.
  2. I do not use DNS names inside neo4j.conf; I use the public IPs instead.

Architecture

  1. It's a GRANDstack app. Apollo GraphQL and React talk to each other, and GraphQL talks to a SINGLE database.
  2. The core-only cluster will replace the SINGLE database when ready.

Certs:
I used Let's Encrypt certbot to generate publicly signed certs for all my cores by their IP addresses. Each server has its own certbot-generated cert (DNS ACME challenge).

Currently:
The Cluster seems to be working. I ran a portion of my seeding and everything on the cluster looks peachy (well almost). This is not a production cluster today but in a week I want to use this as my main neo4j instances in a specific region.

Problem:

  • I started having difficulties with the neo4j+s scheme, both from the GraphQL driver and from Neo4j Desktop.
  • I cannot connect directly to my cluster unless I first do step 2 under Temporary-steps.

Temporary-steps:

  1. I set trust: 'TRUST_ALL_CERTIFICATES' on my GraphQL driver.
  2. I connect to bolt+s://<subdomain>.mydomain.io in Neo4j Desktop.
  • After connecting, I run :server disconnect and can then use neo4j+s://<subdomain>.mydomain.io

I want to fix both my cluster and my GraphQL setup. I cannot keep TRUST_ALL_CERTIFICATES, as it defeats the purpose of having publicly signed certs.
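To illustrate what a trust-all setting gives up, here is a minimal sketch using Python's standard `ssl` module rather than the Neo4j driver, so it is an analogy and not the driver's implementation. The helper names (`verified_context`, `trust_all_context`, `handshake_error`) are hypothetical, and port 7687 is assumed to be a Bolt port with TLS enabled.

```python
import socket
import ssl


def verified_context() -> ssl.SSLContext:
    """Default client context: checks the CA chain and the hostname."""
    return ssl.create_default_context()


def trust_all_context() -> ssl.SSLContext:
    """Rough equivalent of a trust-all setting: encryption only, no identity check."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False        # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE   # accept any certificate
    return ctx


def handshake_error(host: str, port: int = 7687, timeout: float = 5.0):
    """Return None if a fully verified TLS handshake succeeds, else the error text."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with verified_context().wrap_socket(sock, server_hostname=host):
                return None
    except (ssl.SSLError, OSError) as exc:
        return str(exc)
```

Calling `handshake_error` once with the cluster DNS name and once with each core's public IP should show which of those names the certificate actually covers, assuming the hosts are reachable from where the script runs.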

How do I debug? How do I proceed forward? This is probably my last step before I can finalise my cluster and move to production.

Cheers,
Uday

1 ACCEPTED SOLUTION

I found my answer by reading this article on Medium written by David Allen.

While the post was very insightful, what I actually had to fix was my Bolt advertised addresses.

Step 1: Check which addresses the cluster hands out in its routing table:

CALL dbms.cluster.routing.getRoutingTable({}) 
YIELD ttl, servers 
UNWIND servers as server
RETURN ttl, server.role, server.addresses;

Step 2:

Ensure that `dbms.connector.bolt.advertised_address=<DNS_FOR_CLUSTER>` is set in neo4j.conf.

In my case, the DNS name has three A records, one for each core in the cluster. Phew! It was a simple thing; glad I could figure it out myself.
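A rough sketch of why the advertised address matters: with neo4j+s, the driver connects to the addresses listed in the routing table, and TLS verification fails for any address the certificate does not name. The check below is a simplified stand-in for real hostname verification. The addresses, the certificate name `cluster.example.com`, and the function names are all hypothetical, and it assumes the certificate names the cluster DNS name while the routing table originally returned IPs.

```python
def host_of(address: str) -> str:
    """Strip the port from a 'host:port' routing-table entry (IPv4 or DNS only)."""
    host, _, _port = address.rpartition(":")
    return host or address


def matches(host: str, cert_name: str) -> bool:
    """Simplified name check: exact match, or one left-most '*.' wildcard label."""
    host, cert_name = host.lower(), cert_name.lower()
    if cert_name.startswith("*."):
        first, _, rest = host.partition(".")
        return bool(first) and rest == cert_name[2:]
    return host == cert_name


def unverifiable(routing_addresses, cert_names):
    """Routing-table entries that no certificate name covers."""
    return [a for a in routing_addresses
            if not any(matches(host_of(a), n) for n in cert_names)]


# Hypothetical values: routing table advertising IPs vs. a cert for the DNS name.
names = ["cluster.example.com"]
table = ["203.0.113.10:7687", "203.0.113.11:7687", "203.0.113.12:7687"]
fixed = ["cluster.example.com:7687"] * 3

print(unverifiable(table, names))  # every IP entry is reported
print(unverifiable(fixed, names))  # nothing is reported once the DNS name is advertised
```

Feeding the `server.addresses` values from Step 1 into `unverifiable`, together with the names on your own certificate, gives a quick indication of whether the advertised addresses are the problem.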


2 REPLIES

uday

Adding more useful info. I see this in the server logs when I attempt to use neo4j+s from Neo4j Desktop.

io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:478) ~[netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) ~[netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at org.neo4j.bolt.transport.pipeline.AuthenticationTimeoutTracker.channelRead(AuthenticationTimeoutTracker.java:45) [neo4j-bolt-4.2.3.jar:4.2.3]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.55.Final.jar:4.1.55.Final]
    at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
    at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
    at sun.security.ssl.Alert.createSSLException(Alert.java:117) ~[?:?]
    at sun.security.ssl.TransportContext.fatal(TransportContext.java:336) ~[?:?]
    at sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:293) ~[?:?]
    at sun.security.ssl.TransportContext.dispatch(TransportContext.java:185) ~[?:?]
    at sun.security.ssl.SSLTransport.decode(SSLTransport.java:171) ~[?:?]
    at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:681) ~[?:?]
    at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:636) ~[?:?]
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:454) ~[?:?]
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:433) ~[?:?]
    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:637) ~[?:?]
    at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:282) ~[netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1387) ~[netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1282) ~[netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1329) ~[netty-all-4.1.55.Final.jar:4.1.55.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508) ~[netty-all-4.1.55.Final.jar:4.1.55.Final]
