Dockerfiles can be a contentious topic in the Docker world. Their simplicity makes them extremely easy to grasp, but that same simplicity can be a source of frustration for users familiar with the flexibility of typical build and configuration management tools. However, it has recently become possible to add significant flexibility to Dockerfiles with just a touch of trickery.

These tricks only work in the 1.8.X releases of Docker - the feature they rely on was sadly removed in later releases.

Dockerfiles

A Dockerfile is a file containing a list of instructions (defined by a simple DSL) for constructing a Docker image - the official documentation has a few examples, such as this PostgreSQL Dockerfile. You can think of them as shell scripts, with some special Docker-interpreted commands… and some limitations.
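
For the unfamiliar, here’s a minimal sketch of one (the package choice and the run.sh script are made up for illustration):

Dockerfile
FROM ubuntu:14.04.3
RUN apt-get update && apt-get install -y curl
COPY run.sh /run.sh
CMD ["/run.sh"]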

Reproducibility

The phrase 'reproducible build' is not precisely defined and can vary wildly in usage, from "look at the README" (shortly followed by "well, it works on my machine") to the Very Serious "the build is byte-for-byte identical every time". The Debian maintainers, who have been working on the latter for two years so far, will likely attest that this isn’t always easy [1].

Dockerfiles pragmatically push you a few steps along the right path by locking down the execution environment. Running docker build should do the same thing everywhere [2]… but tarballs will still contain timestamps, causing byte differences - it’s not a magic bullet!

Users familiar with flexible execution flows and build-time decisions may find the linear nature of Dockerfiles vexing, especially now that there’s no chance of change - Dockerfile syntax is frozen [3] in favour of allowing users to create their own build tools, Dockramp being an early example. My opinion on the limitations of Dockerfiles [4] is that they’re mostly futile - as soon as you go to the network, you’ve lost reproducibility and no limitations can help you.

FROM, ADD and RUN can all hit the network. Even if remote content never changed (hint: it does, even tagged images), builds may fail if your company is offline because of a) a router firmware upgrade bug or b) someone cutting through telephone exchange fiber (two real examples).

Trickery

If you accept that pragmatic Dockerfile flexibility is a good thing [5], it follows that flexibility increases are not necessarily bad. If you disagree, you may want to stop reading here!

The trick is that, as of Docker 1.8, running container names get inserted into /etc/hosts of all other containers… including ones created as part of docker build. I have a short post on this if you want to read more. This lets you expose services to your build [6] and use them to subvert Dockerfile limitations.
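
To see the mechanism in action, here’s a minimal sketch - the dservice name is made up, and the exact IP in the output will vary:

Terminal Session
$ docker run -d --name dservice alpine:3.2 sleep 1000
[...]
$ /bin/echo -e "FROM alpine:3.2\nRUN grep dservice /etc/hosts" | docker build -
[...]
172.17.0.2	dservice
[...]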

That’s it! Let’s see some practical examples - please read the security notes on each and be aware that they won’t work with automated builds on the Docker Hub. You’ll need a 1.8.X release of Docker (see the note at the top of this post).

The Basics: Running a Secret Server

One important behaviour of Dockerfiles is that files and commands are baked into the image layers forever once used, with many attempts to fix this in Dockerfile syntax (merging, transactions, private volumes, add-and-remove) not going anywhere. People wanting to use secrets must currently resort to squashing their image [7] or waiting for the secrets roadmap to bear fruit.
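
To see the problem in miniature, here’s a sketch - the intermediate layer id to use is printed as '---> <id>' in the build output:

Terminal Session
$ cd $(mktemp -d)
$ echo "mygoodpassword" > password
$ cat >Dockerfile <<'EOF'
FROM alpine:3.2
COPY password /password
RUN rm /password
EOF
$ docker build -t leaky .
[...]
$ docker run --rm leaky cat /password
cat: can't open '/password': No such file or directory
$ docker run --rm <id-of-the-COPY-layer> cat /password
mygoodpassword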

An alternative is to create a very simple 'secrets server' container and use it to retrieve secrets during the build when they’re needed, removing them after use. I’ve created an example image to get you started!

Terminal Session
$ cd $(mktemp -d)
$ echo "mygoodpassword" > password
$ cat >Dockerfile <<'EOF'
FROM alpine:3.2
RUN wget -O /getsecret http://dsecret/getsecret && chmod +x /getsecret
ENV SECRET /getsecret dsecret:4444
RUN echo "root:$($SECRET password)" | chpasswd root
EOF
$ docker run -d -v $(pwd):/srv/secrets --name dsecret aidanhs/secret-server
[...]
$ cat Dockerfile | docker build -t test -
[...]
Successfully built 109031a0ffc2
$ docker run --rm alpine:3.2 head -n1 /etc/shadow
root:::0:::::
$ docker run --rm test head -n1 /etc/shadow
root:$6$PJtQShaN$l/4tOrfO2p3EIpy7jH4bvyL7FXK9JA941Y7T5FFoUdp9Fl1rAFxS9T [...]

In short: we’ve updated the password of the root user without the password touching the disk of the container. You could write SSH keys for use within a step, or (looking at it from a higher level) run the container on a locked-down build server to expose sensitive information at build time without revealing everything to people who don’t need to know. You can read more at the GitHub Repository and bear in mind these don’t have to be secrets - you could serve variables to influence your Docker build!
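
As a sketch of the non-secret use: serve a hypothetical gitbranch file alongside the password and read it during the build (the file name and value are made up):

Terminal Session
$ echo "release-1.2" > gitbranch

Then, in the Dockerfile above, a line like:

RUN BRANCH=$($SECRET gitbranch) && echo "$BRANCH" > /buildbranch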

Security impact: starting this container lets anyone on your machine read files from the secrets directory.

Streamlining your experience: Using an SSH Agent

Exposing the SSH agent to the container is another common request for Docker, with no easy workaround if you want to use Dockerfiles. The secret server works OK for non-password-protected keys, but it’s still a pain to remember to delete files at the end of each step. No more! Make sure you have an SSH agent running (check $SSH_AUTH_SOCK) and then try the example below.

Terminal Session
$ cd $(mktemp -d)
$ cat >Dockerfile <<'EOF'
FROM alpine:3.2
RUN apk update && apk add socat openssh-client
ENV SSH_AUTH_SOCK /tmp/ssh/auth.sock
RUN dir=$(dirname $SSH_AUTH_SOCK) && mkdir -p $dir && chmod 777 $dir
RUN mkdir -p ~/.ssh && printf "Host *\n\
  StrictHostKeyChecking no\n\
  ProxyCommand setsid socat UNIX-LISTEN:$SSH_AUTH_SOCK,unlink-early,mode=777 TCP:dsshagent:5522 >/dev/null 2>&1 & \
    sleep 0.5 && socat - TCP:%%h:%%p\n\
" > ~/.ssh/config
RUN ssh user@myserver.com hostname > /otherhostname
EOF
$ docker run -d -v $(dirname $SSH_AUTH_SOCK):/s$(dirname $SSH_AUTH_SOCK) --name=dsshagent aidanhs/sshagent-socket
[...]
$ cat Dockerfile | docker build -t test -
[...]
Successfully built c5592e338837
$ docker run --rm test cat /otherhostname
myserver

You may need to add $SSH_AUTH_SOCK as an argument after aidanhs/sshagent-socket depending on your distro. And make sure you replace user@myserver.com with something you have passwordless access to on the host. As a tip, I found I needed to use fully qualified hostnames or IP addresses when SSHing from Alpine.

Of course, this trick doesn’t have to be done at build time - you may find it handy to be able to use your SSH keys when running a container interactively, which this supports just fine! It’s worth checking out the GitHub Repository for some known problems with this PoC implementation of the idea and to raise an issue if things don’t work for you.
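
As a quick sketch of interactive use (assuming the dsshagent container is still running and you’ve substituted a host you can access):

Terminal Session
$ docker run --rm -it test ssh user@myserver.com hostname
myserver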

Security impact: starting this container lets anyone on your machine use your decrypted keys to log into remote machines.

The Enabler: Running Docker inside a Dockerfile

Exposing the Docker daemon over TCP on an external port is an interesting thing to do and permits some nice tricks [8]. It makes sense that people would request access to it in Dockerfiles, though it doesn’t seem to be as avidly pursued as the previous features. Despite this, it’s much more interesting because it acts as an enabler for a bunch of other tricks. For example, it becomes easy to implement the BUILD instruction (running docker build inside docker build), nested builds (a way of selectively extracting parts of sub-builds to place in the target image) and image functions (a more powerful implementation of the previous).

One use-case I found particularly interesting is using a Dockerfile to orchestrate multi-container builds where the containers need to talk to each other, a previously woefully underserved requirement. There are also possibilities to use provisioning tools like Ansible (or anything with the ability to connect to Docker) from a reliable environment without polluting the target container.

As there are so many possibilities here I’ll only cover a few. I look forward to seeing what else people come up with! The prerequisites are an image with a Docker binary (you can use aidanhs/ubuntu-docker:14.04.3-1.8.0), the running Docker socket container and an appropriate ENV instruction in any Dockerfiles.
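
Concretely, each Dockerfile below starts with the same two lines - the image providing the Docker binary, and the ENV instruction pointing it at the socket container:

FROM aidanhs/ubuntu-docker:14.04.3-1.8.0
ENV DOCKER_HOST tcp://dsocket:2375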

Here’s how to start the Docker socket container:

Terminal Session
$ docker run -d -v /var/run/docker.sock:/docker.sock --name dsocket aidanhs/socket-socat
e68076d2c9b802e71904dc6b7399e29cfccade545fcc1476375fc77f25c5be33

Security impact: starting this container gives anyone on your machine root access.
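
As a quick sanity check that builds can reach the daemon (a sketch - the version output is elided):

Terminal Session
$ /bin/echo -e "FROM aidanhs/ubuntu-docker:14.04.3-1.8.0\nENV DOCKER_HOST tcp://dsocket:2375\nRUN docker version" | docker build -
[...]
Successfully built [...]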

Docker Build in Docker Build

This one is dead simple. Just create the following two files in a directory:

Dockerfile.within
FROM aidanhs/ubuntu-docker:14.04.3-1.8.0
ENV DOCKER_HOST tcp://dsocket:2375
COPY . /work
WORKDIR /work
RUN docker build -t inner1 -f Dockerfile2 .
RUN /bin/echo -e "FROM alpine:3.2\nRUN echo xyz > /X\n" | docker build -t inner2 -

Dockerfile2
FROM ubuntu:14.04.3
RUN echo abc > /X

And run docker build -f Dockerfile.within . - two images (inner1, inner2) for the price of one! As a bonus, caching will work exactly as you would hope.

Nested Builds

Confusingly, the phrase 'nested build' seems to be commonly used to refer to extracting files from built images rather than actually doing a build inside another. Taking the images we’ve built above, let’s create a third image. Just drop this Dockerfile in the same directory:

Dockerfile.nest
FROM aidanhs/ubuntu-docker:14.04.3-1.8.0
ENV DOCKER_HOST tcp://dsocket:2375
RUN CID=$(docker create inner1 X) && docker cp $CID:/X /v1 && docker rm $CID
RUN CID=$(docker create inner2 X) && docker cp $CID:/X /v2 && docker rm $CID
# RUN mkdir /inner3 && cp /v1 /v2 /inner3 && /bin/echo -e "FROM scratch\nCOPY v1 v2 /\n" > /inner3/Dockerfile && docker build -t inner3 /inner3

And run docker build --no-cache -f Dockerfile.nest . (a complete example could just add the contents of Dockerfile.within from the previous trick with Docker build in Docker build).

The image output from Dockerfile.nest contains build contents from two different images. The commented-out line demonstrates how to insert these build outputs into a third, more minimal image. Be warned: omitting --no-cache when running a Dockerfile that makes more exotic use of Docker (i.e. more than build) needs some caution! The general format above is safe (you just end up with redundant containers if docker cp fails), but using things like docker run requires a strong understanding of Dockerfile caching to avoid build inconsistencies.

In case you’re wondering, the X argument at the end of the create command is to make sure Docker doesn’t complain for images with no CMD or ENTRYPOINT - the path doesn’t need to exist.
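
To check the extracted files made it in (a sketch, assuming the inner1/inner2 builds from the previous trick succeeded):

Terminal Session
$ docker build --no-cache -t nest -f Dockerfile.nest .
[...]
$ docker run --rm nest cat /v1 /v2
abc
xyz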

Ansible Multi-Container Provisioning

Don’t worry if you don’t know Ansible, I’m not going to dive into the syntax here! And calm down if you’ve started getting worked up at an anticipated install of SSH: Ansible has a plugin for 1.9 that provisions Docker containers directly using docker exec. As of Ansible 2 (in beta) this will be a core module! Create the following Dockerfile in a fresh directory.

Dockerfile.ansible
FROM aidanhs/ubuntu-docker:14.04.3-1.8.0
ENV DOCKER_HOST tcp://dsocket:2375
RUN apt-get update && apt-get install -y git python-dev python-pip && \
pip install git+https://github.com/ansible/ansible.git@v2.0.0-0.3.beta1
RUN for c in c1 c2; do docker run -d --name $c centos:6 sleep infinity; done

RUN ansible all -c docker -i c1,c2, -m shell -a 'echo $(hostname) > /x'

After running docker build --no-cache -f Dockerfile.ansible ., you can run the following to see that the containers have created the files as instructed.

Terminal Session
$ for c in c1 c2; do docker exec $c cat /x; done
53fbc0ddfa54
562938bcc160

As with the previous trick, you need to be aware of the Dockerfile cache and handle scenarios where containers are still running from a previous failure. However, the cache here can be useful - if you split your playbook executions into layers, running the Dockerfile again will resume from the most recently failed playbook! A great timesaver when testing your playbooks all the way through: you can jump into the container and hack fixes in, rather than starting from scratch.
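
As a sketch of what splitting into layers might look like (base.yml and app.yml are hypothetical playbooks targeting 'hosts: all', copied into the build context):

COPY base.yml app.yml /work/
WORKDIR /work
RUN ansible-playbook -c docker -i c1,c2, base.yml
RUN ansible-playbook -c docker -i c1,c2, app.yml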

When trying this, be aware that there’s no reason you need Ansible - feel free to just make appropriate calls to docker exec yourself! On the other hand, if you love Ansible and can’t wait to start using it with Docker, I have some fixes for the use of sudo I need to contribute back.
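
For example, the ansible line in Dockerfile.ansible could be replaced with a plain loop (a sketch):

RUN for c in c1 c2; do docker exec $c sh -c 'echo $(hostname) > /x'; done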

Notes

Why Not a Custom Tool?

For better or worse the Dockerfile is the lingua franca of the Docker world: the only build tool supported by the daemon and the Docker Hub, and the first thing people are likely to look for when you say "I’ve built a Docker image". Rolling your own tool works, but it saddles you with a maintenance burden, the need to provide documentation for a new tool and looks of disapproval from people who are fortunate enough to be able to use Dockerfiles as-is.

All that was required by each trick above was the startup of a single other container, i.e. a single command to run, and a couple of setup lines in a Dockerfile - very friendly for beginners.

Everything above is insecure!

Yes it is. As has hopefully been communicated, you should never do these on a machine where you don’t trust the other users. That said, if you’re using Docker you hopefully generally trust the people you work with - giving someone unrestricted access to the Docker daemon on a machine is no worse than running all of the above at the same time.

I want to read more things like this!

I’m currently co-authoring a book on Docker where our focus is on practicality rather than conceptual purity - we want to help you get stuff done!


[1] Do you ever use the output of find without sorting? The order of output is not guaranteed!
[2] Watch out for Docker 1.9 - the build-time variables feature was merged three weeks ago, which effectively allows you to pass arguments to Dockerfiles.
[3] For those wondering, build-time variables were approved pre-freeze.
[4] As quoted over a year ago in a thread about alternative build tools! The previous post in that thread from Solomon is a good read on reproducibility in general.
[5] The official PostgreSQL example uses unpinned packages, so clearly the author liked this form of flexibility!
[6] You could do this before by exposing services on other hosts; it’s just a bit easier now.
[7] This has been a long-desired feature for Docker itself.
[8] The horrendous security issues aside.