TL;DR Before you start reading this, I want to make it clear that I absolutely
don't hate Docker or the application container idea in general, at all! I
really see containers becoming a new way of doing things in addition to the
existing technologies. In fact, I use containers myself more and more.
Currently I'm using Docker for local development because it's so easy to get
your environment up and running in just a few seconds. But of course, that is
"local" development. Things start to get interesting when you want to deploy
over multiple Docker hosts in a production environment.
At the "Pragmatic Docker Day" a lot of people who were using (some even in production) or
experimenting with Docker showed up. Other people were completely new to Docker
so there was a good mix.
During the Open Spaces in the afternoon we had a group of people who decided to
stay outside (the weather was really too nice to stay inside) and started
discussing the talks that were given in the morning sessions. This evolved into
a rather good discussion about everyone's personal view on the current state of
containers and what they might bring in the future. People chimed in and added
their opinions to the conversation.
That inspired me to write down the following items, which are a combination of
things that came up during the conversations and my own view on the current
state of Docker.
The Dockerfile
A lot of people are now using some configuration management tool and have
invested quite some time in their tool of choice to deploy and manage the state
of their infrastructure. Docker provides the Dockerfile to build/configure your
container images, and it feels a bit like a "dirty" hack compared to the nice
features configuration management tools provide.
Quite some people are using their config management tool to build their
container images. I, for instance, upload my Ansible playbooks into the image
(during build) and then run them. This allows me to reuse existing work that I
know works, and I can use it for both containers and non-containers.
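As a minimal sketch of that approach (the playbook name and image tag are made
up, and I'm assuming Ansible is installable from the distribution's
repositories), the build boils down to copying a playbook into the image and
running it against localhost:

    # Bake an image by running an existing Ansible playbook during the build
    cat > Dockerfile <<'EOF'
    FROM ubuntu:14.04
    RUN apt-get update && apt-get install -y ansible
    # Reuse the playbook you already run on non-container hosts
    COPY playbook.yml /tmp/playbook.yml
    RUN ansible-playbook -i "localhost," -c local /tmp/playbook.yml
    EOF
    docker build -t myapp .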
It would have been nice if Docker somehow provided a way to integrate the
existing configuration management tools a bit better; Vagrant does a better job
here with its built-in provisioners. As far as I know you also can't use
variables (think Puppet Hiera or Ansible Inventory) inside your Dockerfile,
something configuration management tools happen to be very good at.
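The usual workaround I've seen, and this is a sketch rather than anything
Docker provides, is to template the Dockerfile yourself before the build and
substitute the values you'd normally keep in Hiera or an inventory:

    # Dockerfile.tmpl contains a @REDIS_VERSION@ placeholder (names made up)
    REDIS_VERSION=2.8.19
    sed "s/@REDIS_VERSION@/${REDIS_VERSION}/g" Dockerfile.tmpl > Dockerfile
    docker build -t myredis:${REDIS_VERSION} .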
When building more complex Docker images you notice that a lot of Bash
scripting is used to prep the image and make it do what you want: passing
variables into configuration files, creating users, preparing storage,
configuring and starting services, etc. While Bash is not necessarily a bad
thing, it all feels like a workaround for things that are so simple when not
using containers.
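A typical example is an entrypoint script that injects environment variables
into a configuration file before starting the service; the application name and
paths here are made up, but every image seems to carry a variation of this:

    #!/bin/bash
    # entrypoint.sh: substitute runtime settings, then hand over to the service
    set -e
    : "${DB_HOST:?DB_HOST must be set}"           # fail early when unset
    sed -i "s/@DB_HOST@/${DB_HOST}/" /etc/myapp.conf
    exec /usr/sbin/myapp --config /etc/myapp.conf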
Dev vs Ops all over again?
The people I talked to agreed on the fact that Docker is rather developer
focused and that it allows them to build images containing a lot of stuff you
might have no control over. It abstracts away possible issues. The container
works, so all is well... right?
I believe that when you start building and using containers, the DevOps aspect
is more important than ever. If, for instance, a CVE is found in a
library/service that has been included in the container image, you'll need to
update it in your base image and then roll it out through your deployment
chain. To make this possible all stakeholders must know what is included in
which version of the Docker image. Needless to say this needs both ops and devs
working together. I don't think there's a need for the "separation of concerns"
that Docker likes to advocate. Haven't we learned that creating silos isn't the
best idea?
Everything in the way you used to work becomes different once you start using
containers. The fact that you can't ssh into something or let your
configuration management make some changes just feels awkward.
Networking
By default Docker creates a Linux bridge on the host with an interface for each
container that gets started. It then adjusts the iptables NAT table to pass
traffic entering a port on the host to the exposed port inside the container.
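You can see this on any Docker host; the port numbers are just an example:

    # Publish container port 80 on host port 8080
    docker run -d -p 8080:80 --name web nginx
    # Docker adds a DNAT rule to the DOCKER chain in the nat table
    iptables -t nat -L DOCKER -n
    # and a veth interface attached to the docker0 bridge
    brctl show docker0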
To have a more advanced network configuration you need to look at tools like
Weave, Flannel, etc., which require more research to see what fits your
specific use case best.
Recently I was wondering if it was possible to have multiple NICs inside a
container, because I wanted to test Ansible playbooks that configure multiple
NICs. Currently it's not possible, but there's a ticket open on GitHub
https://github.com/docker/docker/issues/1824 which doesn't give me much hope.
Once you go beyond playing with containers on your laptop and start using
multiple Docker hosts to scale your applications, you need a way to know where
the specific service you want to connect to is running and on what port. You
probably don't want to manually define ports per container on each host because
that will become tedious quite fast. This is where tools like Consul, etcd,
etc. come in. Again, some extra tooling/complexity.
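With Consul, for example, finding a service becomes an HTTP or DNS query
against the local agent; the service name redis is just an example:

    # Ask the local Consul agent where "redis" instances are running
    curl -s http://localhost:8500/v1/catalog/service/redis
    # The same information is available over Consul's DNS interface
    dig @localhost -p 8600 redis.service.consul SRV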
Storage
You will always have something that needs persistence, and when you do, you'll
need storage. Now, when using containers the Docker way, you're supposed to put
as much as possible inside the container image. But some things like log files,
configuration files, application-generated data, etc. are a moving target.
Docker provides volumes to pass storage from the host into a container:
basically you map a path on the host to a path inside the container. But this
poses some questions: how do I share this data when the container gets
restarted? How can I make sure this is secure? How do I manage all these
volumes? What is the best way to share them among different hosts?
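The basic mechanism looks like this (the paths are made up):

    # Map /srv/redis on the host to /data inside the container
    docker run -d -v /srv/redis:/data --name redis redis
    # Everything the container writes to /data lands in /srv/redis
    ls -l /srv/redis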
One way to consolidate your volumes is to use "data-only" containers: you run a
container with some volumes attached to it and then link to them from other
containers, so they all use a central place to store data. This works but has
some drawbacks imho.
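A hedged sketch of the pattern, with made-up container names:

    # Create a data-only container; it never even has to run
    docker create -v /var/lib/mysql --name dbdata busybox
    # Other containers mount its volumes with --volumes-from
    docker run -d --volumes-from dbdata --name db1 mysql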
Such a data-only container just needs to exist (it doesn't even need to be
running), and as long as it or a container that links to it exists, the volumes
are kept on the system. But if you accidentally delete the container holding
the volumes, or you delete the last container linking to them, you lose all
your data. With containers coming and going, it can become tricky to keep track
of this, and making mistakes at this level has serious consequences.
One of the "advantages" that Docker brings is the fact that you can pull images
from the Docker hub and from what I have read this is in most cases encouraged.
Now, everyone I know who runs a virtualization platform will never pull a
Virtual Appliance and run it without feeling dirty. when using a cloud
platform, chances are that you are using prebuild images to deploy new instances
from. This is analogue to the Docker images with that difference that people
who care about their infrastructure build their own images. Now most Linux
distributions provide an "official" Docker image. These are the so called
"trusted" images which I think is fine to use as a base image for everything
else. But when I search the Docker Hub for Redis I get 1546 results. Do you
trust all of them and would you use them in your environment?
What could go wrong with pulling an OpenVPN container, right?
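At least docker search shows which results are official or automated builds,
which helps a little when picking one:

    # Note the OFFICIAL and AUTOMATED columns in the output
    docker search redis | head -n 5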
Currently there's no user namespacing, which means that if a UID inside the
Docker container matches the UID of a user on the host, that user has access to
the host with the same permissions. This is one of the reasons why you should
not run processes as the root user inside containers (or outside them, for that
matter). But even then you need to be careful with what you're doing.
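A shared volume makes the problem visible; the path is made up and this assumes
the stock ubuntu:14.04 image:

    mkdir -p /tmp/shared
    # Root inside the container is root on the host's filesystem too
    docker run --rm -v /tmp/shared:/shared ubuntu:14.04 touch /shared/test
    ls -l /tmp/shared/test       # owned by root (UID 0) on the host
    # Running as an unprivileged UID at least limits the damage
    docker run --rm -u 1000 -v /tmp/shared:/shared ubuntu:14.04 id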
Containers, containers, containers..
When you run more and more stuff in containers, you'll end up with a few
hundred, a few thousand or even more containers. If you're lucky they all share
the same base image. And even if they do, you still need to update them with
fixes and security patches, which results in newer base images. At this point
all your existing containers should be rebuilt and redeployed. Welcome to
immutable infrastructure! So the "problem" just shifts up a layer, a layer
where the developers have more control over what gets added. What do you do
when the next OpenSSL bug pops up? Do you know which containers have which
OpenSSL version?
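Even answering that question takes extra tooling. A crude sketch that only
catches the openssl binary in running containers (it won't spot statically
linked or bundled copies of the library):

    # Report the OpenSSL version in every running container
    for c in $(docker ps -q); do
        printf '%s: ' "$c"
        docker exec "$c" openssl version 2>/dev/null || echo 'no openssl found'
    done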
Everyone seems to be building these mini OSes these days, like CoreOS, Project
Atomic, RancherOS, etc. The idea is that updating the base OS is a breeze
(reboot, A/B partitions, etc.) and all the services we need are running inside
containers.
That's all nice, but people with a sysadmin background will quickly start
asking questions like: can I do software RAID? Can I add my own monitoring on
this host? Can I integrate it with my storage setup? And so on.
What I wanted to point out is that when you decide to start using containers,
keep in mind that this means you'll need to change your mindset and be ready to
learn quite a few new ways of doing things.
While Docker is still young and has some shortcomings, I really enjoy working
with it on my laptop and using it for testing/CI purposes. It's also exciting
(and scary at the same time) to see how fast all of this evolves.
I've been writing this post on and off for some weeks, and recently some
announcements at DockerCon might address some of the above issues. Anyway, if
you've read until here, I want to thank you, and good luck with all your
containers!