So Docker is a buzzword. If you’ve been in the industry long enough, you know how dangerous buzzwords can be: not because there’s no actual value in what they promote, but because the risk of investing time in something that is either not ready or not suitable for what you do is extremely high, especially when everybody keeps telling you it’s impossible to live without it anymore.
The other problem with buzzwords is that they are rarely accurate. When you dare to ask “can I do this?” the fanboys will reply something like “Are you deaf or stupid? Of course you can!”, but no one is really interested in exploring whether it’s a good idea or not, or whether it requires other tools. Not because they’re acting in bad faith, but because when buzzwords spread, you have a higher chance of meeting people who tried it than people who actually used it.
It’s because of these kinds of conversations that my first encounter with Docker was disastrous.
I was given three versions of what Docker was:
- The best possible developers’ companion when you work with multiple pieces of software like databases and such
- The best imaginable production deployment solution, one that takes care of everything your application needs
- A new virtualization paradigm that allows you to finally stop thinking about servers
SPOILER ALERT: the only one that is totally true is (1).
Not so sad truths
So, as I said, 2 out of 3 expectations failed miserably. And that’s what tricked me into saying “I hate this crap” when I first approached the topic: wrong expectations.
Part of these expectations were the result of the fanboys advertising their new toy, as I said. But the other part is definitely my fault.
Remember the excitement when you first realized how easy it was to deploy a new server on Amazon AWS? To manage the networking, the volumes, and so on? That excitement was etched into my mind forever.
Now, what happened with Docker is that when I first heard bits of information about it, I started imagining what it would be like to have an “Amazon” where you don’t need to manage servers at all. A system where the virtual server environment is actually a commodity of applications. A system that “knows about applications rather than servers” and takes care of availability, virtual networking, virtual volumes, and configuration.
Here’s the thing. I imagined Docker to be something that it’s not, and the more I tried to find a way to make it work the way I expected it to behave, the more I felt miserable and believed it to be faulty.
What’s really sad about this story is that if I reread now how Docker was described to me, I realize most of the problem was my hopes. I read what I wanted to read, and I understood what I wanted to understand.
What is Docker
This paragraph would have helped me back then. I’m not going to list every feature, because more than anything else, what you need to understand is the philosophy that lies within.
Docker is a virtualization platform where the virtualized OS is not actually running; it’s more of an environment that allows one application to run “as if” it were in a standalone context.
There’s no boot, no init, no drivers or virtual hardware. With that said, the OS is actually functional, meaning it mimics a real operating system, and all the commands, files, and utilities are where they have to be. Ideally, though, the sole purpose of that OS is making that one application run.
By doing so, the paradigm shifts: the OS environment becomes part of the application itself, so it will be “running” as long as the application runs, and stop when the application stops. The environment can be modified and updated as needed, but when running it has to be considered stateless, and any change you or the application make to the environment is meant to vanish when the container dies. That’s perfectly coherent with this paradigm, because an application very rarely modifies itself, and the OS environment is part of the application.
Wording: an application with its surrounding environment is called an image. A running application is a container.
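As a minimal sketch of that distinction (assuming you have Docker installed and can pull the official `alpine` image; the container name is made up for illustration):

```shell
# 'alpine' is an image: a filesystem plus metadata, sitting on disk.
docker pull alpine

# Each 'docker run' creates a container: a live instance of that image.
docker run --name demo alpine echo "hello from a container"

# Changes made inside the container never touch the image,
# and vanish once the container is removed.
docker rm demo
```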
And there’s literally nothing else to say about the intimate nature of Docker. But there definitely are a lot of questions to be answered about how it will help you, so read on.
Docker allows the container to communicate with the outer world by providing network virtualization so that each container can be provided with one or more subnets and IP addresses.
There are various network modes and we’re not going into the details now, so accept this as a hint to get you started.
The virtual network allows containers to talk to each other via the assigned IP addresses, so the application is clueless as to whether it’s running next to its buddies or a thousand miles away. Each container is perceived as a standalone machine. Moreover, Docker also allows communication between containers using host names that resolve naturally from the inside, like any other address.
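A sketch of that name-based communication (the network and container names here are made up for illustration, and the official `nginx` and `alpine` images are assumed):

```shell
# Create a user-defined virtual network.
docker network create demo-net

# Start a service container attached to it.
docker run -d --name web --network demo-net nginx

# A second container on the same network reaches it by name,
# as if "web" were a host name on a real LAN.
docker run --rm --network demo-net alpine ping -c 1 web
```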
In case it’s needed, a “bridge configuration” allows you to map a port of a container to a port of the host machine, exposing the service to what’s outside the virtual network.
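For example (a sketch assuming the official `nginx` image), the `-p` flag maps a host port to a container port:

```shell
# Map host port 8080 to the container's port 80,
# so the service is reachable from outside the virtual network.
docker run -d -p 8080:80 nginx

# From the host, or from any machine that can reach it:
curl http://localhost:8080
```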
Of course, storage is still a thing, and you can map a directory on the host server to a directory in the container OS. There’s no need for it to be an actual “mounted volume” like you would need with a classic virtual server, because the server is not actually a server, so you can skip all the crap an OS needs to do to achieve the same result.
This is a very relevant topic because, as we said earlier, you should never rely on what the container stores in its environment, as it’s ephemeral; if your application needs to store something, you will need to “mount” a volume.
Be careful: setting up a volume has one major effect on your deployment strategy. Stateless containers are -indeed- stateless, so they can be moved around from one server to another without any side effect, but when a volume is mounted, you are obviously binding that container to a location where the volume is actually accessible.
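A minimal sketch of such a mount (the host path is made up for illustration): the `-v` flag binds a host directory into the container, so data written there outlives any single container.

```shell
# Bind-mount the host directory /srv/app-data into the container at /data.
# Whatever the application writes there survives the container's death.
docker run --rm -v /srv/app-data:/data alpine \
  sh -c 'echo persisted > /data/state.txt'

# A brand new container sees the same file, because it lives on the host.
docker run --rm -v /srv/app-data:/data alpine cat /data/state.txt
```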
Build / Deploy
This is where things get tricky yet very familiar.
Images are not just big files with stuff in them. As Docker’s own wording suggests, they are more similar to Git repositories: images are actually the history of the changes you made to them (these changes are called layers).
When you want to make changes to an image and make them ready to deploy, you apply the changes, literally commit them, and push them to the main “repository” (which resides in a registry).
The servers running the containers will just need to pull the changes and restart the containers.
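The workflow above maps onto commands that mirror the Git vocabulary (the container, image, and registry names here are made up for illustration):

```shell
# Commit the changes made in a running container as a new image layer.
docker commit my-container myregistry.example.com/myapp:v2

# Push the new layers to the registry.
docker push myregistry.example.com/myapp:v2

# On each server: pull the changes and restart the container.
docker pull myregistry.example.com/myapp:v2
docker stop myapp && docker rm myapp
docker run -d --name myapp myregistry.example.com/myapp:v2
```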
This implies two major side effects in your life:
- You have a quick way to deploy applications with a familiar paradigm
- You are not just deploying the application, but also changes to the full OS environment that will be necessary to run it!
Which means that at least 90% of the problems the shiny new update of the software will cause, even the OS-related ones, can be fixed at your desk, without even bothering the infrastructure team.
Rather than making changes on a running container and committing them, the best way to build an image is to use a “Dockerfile”, which is no more than a recipe for how the image has to be built. You can stuff it with instructions like “what is the base image to start with”, or “copy this file”, or “launch this and that command”.
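A minimal sketch of such a recipe (the base image, file names, and tag are made up for illustration); the heredoc writes the Dockerfile, and `docker build` turns it into an image:

```shell
# A Dockerfile is just a recipe: base image, files to copy, commands to run.
cat > Dockerfile <<'EOF'
FROM alpine:3.19
COPY app.sh /usr/local/bin/app.sh
RUN chmod +x /usr/local/bin/app.sh
CMD ["/usr/local/bin/app.sh"]
EOF

# Build the image from the recipe and tag it.
docker build -t myapp:latest .
```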
And these are the basics
Not really, but close. There are plenty of other options you can use once you get familiar with the basics, like the ability to create a virtual subnet that spreads across multiple hosts, managing memory and CPU limitations, using container host names rather than IPs, or describing how multiple containers are related and need to be started to run a complete microservice system (Docker Compose).
But for what concerns the very basics, this is it. And this is where your expectations come into play.
What about orchestration? What about configuration management? Guys, there’s none.
Specifically, configuration management is what killed my libido in my first steps with Docker.
Since images need to contain software able to run in an agnostic environment, configuration shouldn’t live in them, unless it’s certainly not subject to change or works smoothly as a global default. No configuration in the image, no configuration in the container. And if you copy the configuration into a running container, well, it will be lost once the container restarts, because -again- it’s stateless.
So what’s the trick? The not-so-obvious way to provide configuration to your container? Well, there’s no definitive answer. And this is the fact I’ve been trying to communicate since the beginning of this article: Docker doesn’t care about your damn configuration, because it’s just a virtualization platform.
Every solution I can propose here has pros and cons; there’s no magic trick.
- Store the configuration on a volume that will be mounted when the container launches. It’s easy, it doesn’t require you to change your program and is compatible with distributed configuration managers, but now the service that’s running in the container is bound to the server it runs on. If the container has to move to another server, adjustments will be required and won’t happen automatically.
- Store the configuration keys in a startup script (Docker Compose can do that for you) that will expose them as environment variables within the container, which the application will need to pick up. Easy to do, effective and portable, but complex configuration files will simply be a no-go.
- Store all the configuration within a key/value store container that becomes an essential part of the infrastructure. Smart, effective, portable, but then this container becomes a single point of failure and its storage needs to live on a mounted volume. Plus, you will need to change your program to make it happen.
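The second option, for instance, can be sketched like this (the variable names, values, and image tag are made up for illustration):

```shell
# Pass configuration as environment variables at launch time;
# the application reads them instead of a config file baked into the image.
docker run -d \
  -e DB_HOST=db.internal \
  -e DB_PORT=5432 \
  myapp:latest
```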
Choose your flavor. Is there a right way to do it? Nah, not really.
Wait, where’s my orchestration again?
The news is that with Docker 1.12 you get built-in orchestration, and that’s awesome. With that said, no orchestration has been available out of the box so far.
Orchestration tools have been created to simplify your life. Rancher is the one I got familiar with and I definitely like it.
What you should expect from these tools is the ability to store launch scripts, create stacks, virtualize a network across multiple servers, and provide, where possible, a solution for your storage.
So… Do I still need servers?
The quick answer is: yes. Servers, or virtual servers if you will, aren’t going anywhere.
There are PaaS services out there that offer you the luxury of not thinking about servers, and I’m sure some of them are extremely good, but I must be honest and say that I tried some of them and it didn’t work well for me. I expect to change my mind in the future and will definitely tell you how that goes, but as of now, I still think the way to go is deploying your own (virtual or bare metal) servers.
What really changes is the way you work with them. I don’t have the experience yet to tell you the downsides, but I can definitely tell you some of the upsides.
- It simplifies the work of infrastructure teams, as they don’t have to deal with the specific environment the application needs to run
- It drastically improves the efficiency of the QA phase, as the testing environment is way more similar to the production environment
- It lets you manage redundancy, disaster recovery, updates, and availability with less stress
This is not irrelevant. At the beginning of my experimentation I was kind of suspicious. Not because I don’t trust virtualization in general, but my mind couldn’t help bringing me back to the old VMware days, when there was a thin line between spinning up another VM and sinking your server. Even after everything that happened since (say, Amazon), I still couldn’t stop thinking that running one microservice in each virtualized container wasn’t going to do performance any good.
Boy, was I wrong. Unlike classic virtualization, my understanding is that the magic Docker does by not really running a full OS makes the main process nearly as fast as if it were running natively on the host platform, because technically it is. I don’t have actual empirical data, but that was my feeling, and it has been confirmed by my readings.
Certainly there’s a price to pay, and it is related to everything that is not strictly connected to the process itself: first and foremost, virtual networking.
So if there’s a performance price to pay, it’s related to how you “accessorize” your deployment.
Am I liking Docker so far? Hell yes. Yet I cannot provide any specific details on whether I trust it or not, because I honestly haven’t experienced the thrill of migrating to Docker in production. Yet.
Do I like the concept? I’m not a skeptical guy in general, but I generally don’t trust tech when I haven’t had the time to understand how it works, at least in principle. Now that I (kind of) understand how it works, I find the whole thing extremely interesting and very powerful.
Is it the way to go? This question carries a lot of responsibility, but I take the risk: yes, it is the way to go.
Docker achieves two very important goals, and it became popular so quickly because of them, not because of generic excitement.
Is there more to be desired from Docker? Mmm, I don’t really know. At the beginning I was stunned by the absence of features I thought were essential, but then I realized that Docker is meant to be simple and achieve specific goals, and everything I needed could be achieved by accessorizing it.
Very last note
As I’m not a Docker expert, as the title implies, if I got anything wrong in this article, feel free to reply and correct my mistakes! Thanks.