Neo4j HA Docker Cluster on ECS with CloudFormation Reference?

arniesaha
Node Clone

Hi Folks,

I've been developing a product with Neo4j over the last several months, and a single Neo4j Docker instance has been sufficient for my needs.

I'm soon moving to production and am attempting to deploy a Neo4j Docker container cluster on Amazon ECS with CloudFormation, but I have been running into several issues.

The reason I don't want to deploy the cluster AMI available on the AWS Marketplace is that it isn't available in my required region, ap-south-1.

I've referenced something that looks like this: https://github.com/arniesaha/neo4j-aws-ha-cluster

The Docker image is generated and deployed to ECR as expected, but I'm having issues deploying cloudformation.yml with that image: either the tasks under the service crash, or the deployment times out and rolls back.

Details of the issue here: https://github.com/getsocial-rnd/neo4j-aws-ha-cluster/issues/1

Also, I would like to deploy with the existing volume from my single development instance, along with the geospatial plugin.

Does anyone have a reference Docker HA cluster implementation of Neo4j available?

Would highly appreciate any assistance with this!

Thanks,
Arnab

28 REPLIES

It's OK if you'd prefer to use Docker images. However, if the issue is that you can't use the AMI because it's not available in your required region, it's possible to copy AMIs between regions so that it would be available to you.

https://scrapoxy.readthedocs.io/en/master/standard/providers/awsec2/copy_ami_to_region/
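A minimal sketch of a cross-region AMI copy using boto3's EC2 `copy_image` call, in case it helps. It assumes boto3 is installed, AWS credentials are configured, and the account is actually permitted to copy the source image (Marketplace images often are not copyable by subscribers). The helper names and the image name `neo4j-copy` are hypothetical.

```python
def build_copy_image_request(source_ami: str, source_region: str, name: str) -> dict:
    """Build the keyword arguments for the EC2 CopyImage call."""
    return {
        "SourceImageId": source_ami,
        "SourceRegion": source_region,
        "Name": name,  # hypothetical name for the copied image
    }


def copy_ami(source_ami: str, source_region: str, dest_region: str, name: str) -> str:
    """Copy an AMI into dest_region and return the new AMI ID.

    The client is created in the destination region; the source region is
    passed as a parameter of the request.
    """
    import boto3  # third-party AWS SDK, imported lazily so the helper above has no dependency

    ec2 = boto3.client("ec2", region_name=dest_region)
    response = ec2.copy_image(**build_copy_image_request(source_ami, source_region, name))
    return response["ImageId"]
```

For example, `copy_ami("ami-0ae3b1104eed0d04c", "ap-southeast-1", "ap-south-1", "neo4j-copy")` would request a copy into Mumbai, and would fail with a client error if the image owner has not allowed copying.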

I'm personally not familiar with this repo that you've forked as I haven't been using ECS. And it looks like the issue you're experiencing is buried pretty deep in the AWS specific bits. There is a reference docker HA cluster implementation of Neo4j that you can find running inside of kubernetes here: https://github.com/neo-technology/neo4j-google-k8s-marketplace/blob/3.5/user-guide/USER-GUIDE.md

But the stack this uses is quite a bit different than what you're doing here with AWS and ECR.

You have a couple of options that I can see, depending on what you need. If AWS + ECR is best for you, then working with the maintainer of the repo you've forked is probably best. If you just need a cluster up and running quickly, Neo4j provides an option in the marketplace of each of the 3 major clouds to get you started. Most are VM based, but above I've linked a Docker/Kubernetes based approach as well.

Hi David,

Thanks for the reply!

I'd prefer a Docker cluster on AWS, but even a VM approach is fine at the moment in the interest of time to go live. It's easier to manage sheep than pets 😉

But, I went ahead and gave this a try: https://aws.amazon.com/marketplace/pp/B07D441G55

Which has a cloudformation template and service catalog options.

I chose the CloudFormation method in an available region and tried to copy the available AMI to my required region. But I ran into this:

Basically, it doesn't let me copy it.

And I'm not that well versed with GCP and Kubernetes and would like to keep our implementation on AWS due to business requirements.

Any more ideas?

Thanks,
Arnab

Your best bet if you want to stick with Amazon is to use the marketplace entry:

https://aws.amazon.com/marketplace/pp/B07D441G55?qid=1550166698292&sr=0-2&ref_=srh_res_product_title

This should require very little setup on your part.

Unfortunately I can't provide support for the github repo you're using because I'm simply not familiar with it.

Hi David!

Sure, I understand. As I mentioned, I tried copying the official AMI to my required region, and Amazon isn't allowing me to do so, as in the screenshot.

Would it be possible to make the image available in the ap-south-1 region?

Regards,
Arnab

Give me the neo4j version number you're trying to run, or the AMI ID of what you tried to copy and I can get you one in ap-south-1.

arniesaha
Node Clone

Hi David,

AMI ID: ami-0ae3b1104eed0d04c (available in ap-southeast-1)
Either 3.5.1 or 3.4.9 should work.

Thanks again,
Arnab

AMI ami-0841505f29ee8c75f is neo4j 3.5.1 enterprise in ap-southeast-1, and should be available.

arniesaha
Node Clone

Thanks!

You mean ami-0841505f29ee8c75f should be available in ap-south-1 (Mumbai), right?

Because it still wasn't available when I checked just now.

Shoot. Sorry about that. It's really easy for me to get the zone designators (ap-south-1) and the geo designators (Singapore) mixed up. In my previous message I got the AMI ID right but had copied it to the wrong region.

Try AMI ami-0284a2c822c6c3b9e

arniesaha
Node Clone

No worries! Thanks

The new AMI ami-0284a2c822c6c3b9e still doesn't show up under ap-south-1. Perhaps it takes a while to update on the marketplace?

I'll check again in some time and report back.

Please post a screenshot of what you're seeing.

This is what I'm seeing.

arniesaha
Node Clone

This is what I see.

In your screenshot the visibility is set to private. Maybe you need to make it public for me to get access?

Thanks,
Arnab

Arg, I have clearly not had enough coffee yet today!

It's just been made public.

arniesaha
Node Clone

Haha, I can understand.

I can see it now!

Thanks!

arniesaha
Node Clone

So I ran into another issue. I tried to use the ap-south-1 AMI mapping with the CloudFormation template provided at https://aws.amazon.com/marketplace/pp/B07D441G55

Since ap-south-1 doesn't have a third availability zone like the other regions, the template fails on the third subnet, which in ap-southeast-1, for example, would be placed in ap-southeast-1c.

I tried some edits to the template, but it might take longer to get it all wired up with 2 AZs.

So, again in the interest of time, I wanted to launch the VMs in ap-southeast-1 to evaluate.

Everything went through fine, and I see the 3 EC2 nodes created.

I also see a private VPC and a couple of entries under Route 53.

The question is: how can I access the Neo4j browser? I don't see any load balancers created for the VPC or the domain entries.

Thanks,
Arnab

No load balancers are created or necessary.

You can access neo4j browser on port 7473 of the public IP of any of the three machines you created.

Yes -- we usually deploy the AMIs by default to regions with a minimum of 3 AZs so that we can round-robin the machines around the AZs for high availability.
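A quick way to see from outside which of the machines actually accepts connections on port 7473 is a plain TCP probe. This is only a sketch using the Python standard library; the function names are hypothetical, and a successful connection shows that the port is reachable, not that Neo4j is healthy.

```python
import socket

NEO4J_HTTPS_PORT = 7473  # browser port mentioned in the reply above


def is_port_open(host: str, port: int = NEO4J_HTTPS_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def reachable_nodes(hosts, port: int = NEO4J_HTTPS_PORT, timeout: float = 3.0):
    """Return the subset of hosts that accept connections on the given port."""
    return [host for host in hosts if is_port_open(host, port, timeout)]
```

Calling `reachable_nodes([...])` with the three public IPs of the instances lists the ones worth opening in a browser at `https://<ip>:7473`.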

arniesaha
Node Clone

Got it.

So I tried accessing port 7473 on one of the deployed machines, e.g. 54.255.154.98 (in the screenshot below).

The security rules are as shown below.

But the page doesn't seem to come up.

Is there anything I'm getting wrong with this?

Thanks,
Arnab

This should be working, but it's hard to tell why it isn't from this. Did you make other changes to the CloudFormation that could affect routing, for example with internet gateways, subnets, etc?

If you ssh into the machine, can you from that machine access the service (cypher-shell -u neo4j -p password -a localhost)?

What does systemctl status neo4j say on the machine, and are there errors in debug.log?

arniesaha
Node Clone

I didn't change anything in the template. In fact, I launched it directly from the marketplace listing itself.

The cypher-shell command gives the above error.

And this is the output of systemctl status neo4j:

In fact, I also tried launching it in another zone, but I'm running into similar behaviour. I will try once more, though.

If you have any ideas do let me know.

Thanks,
Arnab

FYI, here are the debug logs as well. I don't see any errors as such, but there are several warnings.

debug logs

Is there a minimum VM type I should be choosing? I deployed this on m4.large.

Could that be the issue?

EDIT: I did try running a large VM type, the behavior is still the same as reported above.

VM size here won't make a difference, this only really impacts performance based on your graph size.

The login screenshot shows that neo4j is running and accepting connections (although you used the wrong password). Presumably you can log in using the correct password? This means that the broken web page in the browser is more likely to be caused by some network setting in between.

Because we needed to get you a new image, you must have had to make changes to the cloudformation because as discussed above you didn't have 3 AZs in your region, correct? I would look back over those changes to see what else might have happened.

arniesaha
Node Clone

Oops, yes, I can log in to the shell with the correct password.

Well, in this case I actually ran it in the ap-southeast-1 (Singapore) region directly from the listing, without any change to the template: https://aws.amazon.com/marketplace/pp/B07D441G55

And I'm unable to access < public-ip >:7473 on any of the 3 EC2 instances. But I can ssh in and log in to the shell.

I'll have to make changes to the template for my required region, but I need to figure out exactly what changes would be needed. Any pointers on this would be great too.

But even getting the existing template to work in ap-southeast-1 would do for now.

I will try to reproduce this issue in ap-southeast-1 from the marketplace listing and investigate later today.

I've just tried this, and the deploy went fine from the marketplace in ap-southeast-1, and I was able to connect from the outside with my browser, so I can't seem to reproduce the issue.

Can you check your CloudFormation resources and see if there are any errors or warnings?

Another thing you might check is the /etc/neo4j/neo4j.conf file on any of the three machines, make sure it has an advertised_address that matches the external IP of the machine.
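The `advertised_address` check can be sketched as a small script. This is an illustration only: it assumes the config file uses plain `key=value` lines with `#` comments, it treats any key containing `advertised_address` as relevant, and the function names and sample values are hypothetical.

```python
def advertised_addresses(conf_text: str) -> dict:
    """Collect every uncommented setting whose key contains 'advertised_address'."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines that are not key=value
        key, value = line.split("=", 1)
        key = key.strip()
        if "advertised_address" in key:
            settings[key] = value.strip()
    return settings


def mismatched_addresses(conf_text: str, external_ip: str) -> dict:
    """Return advertised settings whose host part differs from external_ip."""
    mismatches = {}
    for key, value in advertised_addresses(conf_text).items():
        # Values may be "host" or "host:port"; IPv6 literals are not handled here.
        host = value.rsplit(":", 1)[0] if ":" in value else value
        if host != external_ip:
            mismatches[key] = value
    return mismatches
```

On one of the machines this could be run as `mismatched_addresses(open("/etc/neo4j/neo4j.conf").read(), "<public-ip>")`; an empty result means every advertised address found points at the external IP.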

arniesaha
Node Clone

Got it.

I have access too. But port 7473 isn't accessible on just any of the public IPs, only on the one listed in the Outputs tab.

In my case that was Node1's IP, and I had been trying to access via Node0's IP.

Works for now, thanks again for your assistance.

Will be trying to work on the template for my region, but I think I will go with ap-southeast-1 for the moment.

Arnab

arniesaha
Node Clone

FYI,

If anyone is interested in deploying this in the ap-south-1 (Mumbai) region, I've reworked the CloudFormation template to be able to do so with ami-0284a2c822c6c3b9e

https://s3.ap-south-1.amazonaws.com/cf-templates-1wqj6b0ycugib-ap-south-1/neovm-2az.yml

Could you describe quickly what were the major changes you made?

arniesaha
Node Clone

Sure.

The major change was to remove 'Subnet 1', as there are only two availability zones in this region. So I moved the resources under Subnet 1, i.e. one core server and two read replicas, into Subnet 0, ensuring all the other routing stays intact.

This achieves the same 3 core VMs, but with 2 subnets instead of 3. Performance-wise, I have yet to see whether there are any issues. But since this region is closest to my application's users, the latency should be the lowest.
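To illustrate the trade-off of going from 3 zones to 2, here is a small sketch of round-robin placement. It is not taken from the template: the node names are hypothetical, and the availability check assumes the cluster needs a majority of its core members to stay up.

```python
from collections import defaultdict


def place_round_robin(nodes, zones):
    """Assign nodes to zones in round-robin order; returns {zone: [nodes]}."""
    if not zones:
        raise ValueError("at least one zone is required")
    placement = defaultdict(list)
    for index, node in enumerate(nodes):
        placement[zones[index % len(zones)]].append(node)
    return dict(placement)


def survives_any_zone_loss(placement) -> bool:
    """True if a strict majority of nodes remains after losing any single zone."""
    total = sum(len(members) for members in placement.values())
    return all((total - len(members)) * 2 > total for members in placement.values())


cores = ["core-0", "core-1", "core-2"]  # hypothetical node names
three_az = place_round_robin(cores, ["ap-southeast-1a", "ap-southeast-1b", "ap-southeast-1c"])
two_az = place_round_robin(cores, ["ap-south-1a", "ap-south-1b"])
```

With three zones each core sits alone in its zone, so any single zone can fail and two of three cores remain. With two zones, one zone necessarily holds two cores, and losing that zone leaves only one.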