It’s been around 2 years since tombola started to investigate the use and potential benefits of container technology, specifically Docker. At times progress has been slow and frustrating, but that’s no shock with a new technology.
We’ve been running containers in a production environment for just over a year now and I wanted to look back at how this move has affected the business.
N.B. This isn’t going to be overly technical, just a general review.
The team started by converting two API services to run in containers. The work required not only rewriting these services from .NET to Node.js, but also creating a new solution to build and deploy them. We settled on a combination of TeamCity (which we already used) and Terraform (more on that choice later).
To run and manage our containers we chose to use Amazon’s Elastic Container Service (ECS). This provides an orchestration tool to manage a cluster of instances, and its companion service, Elastic Container Registry (ECR), provides a private container image repository. At the time it was the easiest solution that met our needs.
So, why Terraform? Two words: placement strategies. We needed to set placement strategies for the containers within the ECS cluster, and they had to be set when the application was deployed. At the time these strategies could be set through the AWS console but not through CloudFormation, which was what we originally wanted to use. Terraform did support them, so the choice was pretty simple. Setting the placement strategies was an important part of maintaining a highly available service.
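To illustrate why a placement strategy matters for availability, here is a minimal sketch of a "spread across availability zones" strategy. It is a toy model, not ECS’s actual scheduler and not our Terraform configuration. The instance IDs, zone names and the function names `spread_by_zone` and `surviving_tasks` are all hypothetical. In ECS itself the strategy types are `spread`, `binpack` and `random`.

```python
# Toy model of a "spread" placement strategy (illustrative only, not the ECS scheduler).

# Hypothetical cluster: instance id -> availability zone. These names are assumptions.
INSTANCES = {
    "i-a": "eu-west-1a",
    "i-b": "eu-west-1b",
    "i-c": "eu-west-1c",
}


def spread_by_zone(instances, task_count):
    """Place each task in the zone that currently runs the fewest tasks."""
    per_zone = {zone: 0 for zone in instances.values()}
    placements = []
    for _ in range(task_count):
        # Least-loaded zone first; ties are broken by zone name so the result is deterministic.
        zone = min(per_zone, key=lambda z: (per_zone[z], z))
        # Pick the first instance (by id) that lives in the chosen zone.
        instance = min(i for i, z in instances.items() if z == zone)
        placements.append((instance, zone))
        per_zone[zone] += 1
    return placements


def surviving_tasks(placements, failed_zone):
    """Count the tasks still running if one whole zone is lost."""
    return sum(1 for _, zone in placements if zone != failed_zone)


if __name__ == "__main__":
    placements = spread_by_zone(INSTANCES, 4)
    for instance, zone in placements:
        print(instance, zone)
    print("tasks left if eu-west-1a fails:", surviving_tasks(placements, "eu-west-1a"))
```

With four tasks spread over three zones, losing any single zone still leaves at least two tasks running. If all four had landed on one instance, losing that zone would take the service down.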
Since the migration of those 2 initial services the teams have added almost a dozen more. All of these services are deployed using the same pipeline mentioned earlier, which has simplified and sped up the process of getting new services live.
The move has simplified some of the server management in Operations. At least 8 of the services would previously have been built to run on their own stacks of 2-4 instances each, leaving us with at least 16 instances running (8 stacks of at least 2). Instead we have 3 in our main cluster, with room to spare. This saves us money, time and effort.
Since the services have gone live they have been rock solid. Even patching of the live cluster instances is non-disruptive.
Has it all been sunshine and rainbows? Not exactly. There have been technical challenges along the way, but nothing that hasn’t been overcome (eventually). You can’t really argue against the success of this work.
Where do we go from here?
There’s always work to be done and there’s no exception here. AWS recently released its managed Kubernetes service (EKS), which warrants investigating. Is it a better solution for us than ECS? We’ll have to find out.
Monitoring has always been a bit of a difficulty, but we will soon be implementing a new monitoring solution across all of our infrastructure that will handle container monitoring far better than our current solution.
Security is a constant battle and everyone using containers can step up here. Visiting DockerCon this year opened my eyes to just how much more work is required from a security point of view across the community. For example, seeing a major distro’s official Docker image being exploited quickly was a little scary. Container security definitely presents a new set of challenges.
So there is plenty more work to be done, and maybe next time I’ll be writing about a migration to Kubernetes. Who knows?