04-16-2020 05:00 AM
Hi there, I'm launching Neo4j Enterprise on Google Cloud Platform. At the time of this writing, the current version of Neo4j is 4.0.2.
That being said, when I launch a cluster in GCP, the cluster is created successfully. Great.
Then, when I go to the IP address associated with the cluster to log in and ensure the cluster is operational, a host of issues immediately arise.
How has this affected us?
I've lost days debugging these issues. I don't have days to lose: I'm on a time-sensitive project.
How should it be different?
I expect Neo4j Enterprise to work, right out of the box. You deploy a cluster. Upon successful cluster creation, you go to the designated cluster entry point, auth in, and bam, you're in and everything works.
Instead, the current experience is after successful cluster creation, you cannot auth in, and you lose days trying to figure things out, leaving you frustrated enough to write in this forum. lol
Steps I've taken to resolve the issue:
Summary: it looks like the web socket issue isn't related to a valid SSL cert being present; it's more likely related to a security setting in neo4j.conf. I'm not sure which security setting it's related to. You can disable security and hit HTTP to get around the issue, but that's a security flaw, not a solution for an enterprise setting.
The client is unauthorized due to authentication failure
(even though auth=false and it previously just allowed me in with the same user / pw). Now I can no longer auth into the DB, FML again. I'm about to destroy this cluster and repeat this insane set of steps to try to resolve these issues. I'd sincerely appreciate help from anyone who knows how to solve any of these issues.
To the fine folks @ Neo4j:
Look, I'm no noob...I've built two graphs with billions of nodes each, operating at web scale, but I gotta say, this experience seems whack, even to a senior software engineer. lol
I expect more from Neo4j Enterprise. I expect it to work out of the box. I can't tell you how many times I've torn down this cluster and recreated a new one trying to resolve these little bugs and just get v4 to work. I, and I'm sure the community agrees, expect there to be no websocket issue. If there is a prerequisite step that your engineers are aware of that's needed to resolve the issue, then a patch should be released or the solution should be baked into the deployment script (e.g., if the websocket issue was caused by missing SSL certs, then when making a cluster, drag and drop your SSL cert here to configure your cluster).
Btw, v3.5 worked beautifully right out of the box. No issues. It's just 4.0 that has these issues. I'm losing days on this...I'm about to lose another day...
Can you answer these questions:
Thanks for being as patient with this topic as I've been with the issue
04-16-2020 05:33 AM
Let's try to tackle your three questions:
How do you solve the websocket connection issue for neo4j 4.0?
You are right, these issues are generally related to self-signed SSL certificates (SSCs). The browser usually throws up a scare warning about these, but the Neo4j driver under the covers also really doesn't like to accept them by default, for security reasons. That's what I think is happening: you get to the browser page, and then fail to log in with the websocket error because the driver that Browser is using won't accept self-signed certificates.
The solution to this one is going to be to get valid signed certificates, e.g. with LetsEncrypt. Instructions are here, but yes beware -- I haven't updated this blog post for 4.0 settings, so you might have to cross-check differences in the settings. https://medium.com/neo4j/getting-certificates-for-neo4j-with-letsencrypt-a8d05c415bbd
Are there additional steps required to make a GCP Cloud Deployment work? Why doesn't initial username / initial password work after sometime? Are you supposed to change it?
It should work out of the box. I'll take up the issue with our drivers team and browser team to see if we can get defaults published that will work with SSCs. The trouble here is that in a self-deployed cloud scenario, you have to start with SSCs.
The initial username and password do work, and you don't have to change them (we automatically generate a secure password for you) - I think you're not getting as far as the username/password check, because of the SSC issue.
RE: Follower vs leader issue: who am I supposed to be asking to do a write?
Short answer: to do a write you have to talk to the leader. Longer answer: Neo4j uses a smart client routing approach, and has a cluster architecture where different machines in the cluster adopt different "roles" with respect to the data. This means that your client has to know where to route writes, and that's always to the leader.
Really long answer if you want to know how the guts work: https://medium.com/neo4j/querying-neo4j-clusters-7d6fde75b5b4
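For anyone who wants to see what "always write to the leader" looks like from client code, here is a minimal sketch in Python. It assumes the official neo4j Python driver is installed (pip install neo4j); the host, credentials, Person label, and the helper names routing_uri, create_person and write_person are hypothetical and only there for illustration. The idea is that a neo4j:// URI asks the driver to fetch the cluster's routing table, and a write transaction is then sent to the leader, so the application does not pick a machine itself.

```python
def routing_uri(host: str, port: int = 7687) -> str:
    """Build a routing URI. The neo4j:// scheme tells the driver to discover
    cluster members, while bolt:// talks to a single machine only."""
    return f"neo4j://{host}:{port}"


def create_person(tx, name: str) -> None:
    """Unit of work for a write transaction (hypothetical example query)."""
    tx.run("CREATE (p:Person {name: $name})", name=name)


def write_person(host: str, user: str, password: str, name: str) -> None:
    """Run one write through the routing driver; the driver picks the leader."""
    # Imported lazily so the helpers above can be used without the driver.
    from neo4j import GraphDatabase  # third-party package, assumed installed

    driver = GraphDatabase.driver(routing_uri(host), auth=(user, password))
    try:
        with driver.session() as session:
            # 4.x drivers call this write_transaction; newer drivers renamed
            # it to execute_write. Use whichever one this driver provides.
            run_write = getattr(session, "execute_write", None) or session.write_transaction
            run_write(create_person, name)
    finally:
        driver.close()
```

Calling write_person("myhost", "neo4j", "<initial password>", "Alice") against any cluster member should end up on the leader. With a plain bolt:// URI pointed at a follower, the same write would be rejected, which is the follower vs leader error discussed above.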
04-16-2020 05:48 AM
Cross-post from another thread discussing the same thing: Websocket connection failed - possible certificate chain issue
04-16-2020 08:52 AM
I want to add some follow-up and fix instructions. Here's a config snippet with some relevant things:
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.client_auth=NONE
dbms.ssl.policy.https.enabled=true
dbms.ssl.policy.https.client_auth=NONE
dbms.connector.https.enabled=true
dbms.connector.bolt.tls_level=REQUIRED
Two things are needed to get this working with SSCs, that is, without getting signed certs.
To accomplish this, first make sure you've disabled client auth. Then visit https://myhost:7687 -- we don't really care what this page has (in fact the page will be broken because HTTPS isn't bolt) -- but it will prompt Chrome to ask you to accept the cert on this port. Once that's done, log in again over HTTPS with the SSC. This time the login should succeed, because Chrome now trusts the same cert on port 7687, so the bolt connection that Browser makes will work.
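If you want to double-check that a neo4j.conf actually carries the settings from the snippet above, a small script along these lines may help. This is only a sketch: the helper names parse_conf and missing_settings are made up for illustration, it treats the file as simple key=value lines with # comments, and it compares values literally rather than the way Neo4j itself would interpret them.

```python
# Settings taken from the config snippet in this post.
EXPECTED = {
    "dbms.ssl.policy.bolt.enabled": "true",
    "dbms.ssl.policy.bolt.client_auth": "NONE",
    "dbms.ssl.policy.https.enabled": "true",
    "dbms.ssl.policy.https.client_auth": "NONE",
    "dbms.connector.https.enabled": "true",
    "dbms.connector.bolt.tls_level": "REQUIRED",
}


def parse_conf(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments.
    If a key appears more than once, this sketch keeps the last value."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_settings(text: str, expected: dict = EXPECTED) -> dict:
    """Return the expected settings that are absent or have another value."""
    actual = parse_conf(text)
    return {k: v for k, v in expected.items() if actual.get(k) != v}


if __name__ == "__main__":
    import sys

    # Usage: python check_conf.py /path/to/neo4j.conf
    with open(sys.argv[1], encoding="utf-8") as handle:
        problems = missing_settings(handle.read())
    for key, value in problems.items():
        print(f"expected {key}={value}")
```

An empty result means all six settings are present with the values shown; anything printed is a line to add or correct before restarting the database.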
Hope this helps.
04-16-2020 02:46 PM
@david.allen you rock for answering this! I haven't tried the snippets you provided yet.
Instead, I created a new cluster, and I cannot auth into it with the initial username / password.
Screenshot of the successful deploy on GCP.
try to login at:
(note: this cluster is for demonstration purposes, so it's okay to share the creds in this circumstance)
you should see this unauthorized error:
PS: the web socket issue seems to have gone away with Chrome 81? Looks like Firefox still has it. I'll confirm on the next cluster I create whether the web socket issue went away with the Chrome update.
@david.allen can you go to the URL and confirm whether you're able to auth in with these creds? Maybe a bug with the current deploy script? If you can auth in, please let me know too.
Thank you. In the meantime, I'll spin up another cluster
(using latest image released today 4/16/2020: neo4j version is 4.0.3)
04-16-2020 03:07 PM
Sorry - for various security sensitivity reasons, I'm not willing to log into machines that folks stand up. Please have a look at the previous guidance I posted in detail -- and please do avoid posting login credentials (even if temporary).
04-16-2020 04:00 PM
Cool, you don't have to use my cluster to replicate the bug. Just launch a new GCP cluster right now and observe the results, following the steps I outlined above.
It looks like the Neo4j team deployed v4.0.3 to GCP today (4/16/20) at 12:31 PM. New clusters are not allowing developers to log in. To validate that this is an ongoing issue, I've created 2 more clusters besides the test cluster I shared. Every new cluster reports that the initial username / initial password is invalid.
Please take 5 mins to launch a new cluster on GCP, go to the URL provided, submit that cluster's initial username and password, and observe the results. Please let me know if you're able to replicate the issue on your end.
PS to other people searching for bug fixes:
Just to point out: this is not an issue in v3.5, nor did I observe this most recent issue with v4.0.2. This is occurring on the v4.0.3 image on GCP released today. So if anyone wants something that works, I cannot recommend 4.0, due to these numerous issues you're seeing discussed in this post, but I can say version 3.5 worked out of the box. I may drop hope for v4.0 and revert to v3.5.
What's a developer got to do to get horizontal scaling to work on a graph DB?
04-16-2020 08:21 PM
@david.allen, my teammate and fellow developer figured out the issue.
All new clusters are setting the username to neo4j and the password to neo4j instead of the initial cluster password. Hence if you use the initial password, you get an auth error. Yet if you use neo4j as the password, you're in the door.
This should be considered a security vulnerability. Please release a patch and fix.
This issue should only affect new clusters made on Neo4j at the time of this writing.
Now that we've resolved that issue, we're moving on to fix the other issues. More to come.
04-17-2020 03:22 PM
Thanks for hanging in there 😉 your linux username is permanently in our VMs...literally...and for legendary purposes too
Also, websocket issue went away with Chrome 81. Still exists for FF, IE, etc.
We can now successfully auth in and populate the cluster. One last question...is horizontal scaling enabled by default or is there a switch we need to flip?
Thanks again
04-20-2020 12:44 PM
@NawarA A patched 4.0.3 is now available, and you can log in using the initial password displayed.
Regarding your question: if you want to scale reads, you have to deploy more Read Replicas.
If you want to scale writes, adding more hardware to the leader can help; you cannot do that by horizontal scaling.