By Jordan A. - DevOps Expert
During DockerCon 2021, which took place on May 27, 2021, we attended a number of very interesting talks. One of them caught our attention because it presented basic principles for writing Dockerfiles and thus building effective and efficient containers.
This talk, "Lessons Learned With Dockerfiles and Docker Builds", was given by Aaron Kalin, Technical Evangelist at Datadog. It offers seven lessons to remember, which I will detail and illustrate with concrete examples.

Lesson 1: Be mindful of the base image you use
Alpine images have been very popular in recent years due to their small size and limited number of vulnerabilities. This would seem to make them an ideal base for building your own Docker image.
Yes, but... Over time, Alpine images have lost their unanimous support among developers. One of the first issues is that Alpine uses musl rather than glibc (whereas the most popular distributions tend to use glibc). This means that binaries compiled on Alpine may not run on Ubuntu (and vice versa).
Furthermore, some packages that are essential for handling the dependencies of your code may be available on other distributions but not yet on Alpine.
Aaron Kalin suggests we use the "slim" versions of images instead. They are also small, sometimes close to the size of the Alpine images, as shown here:
$ docker image ls | grep python
python   3.9.1-slim-buster   8c84baace4b3   3 months ago    114MB
python   3.7.4-alpine3.9     32a1b98d0495   19 months ago   98.5MB

Lesson 2: Chain your RUN commands
Each RUN instruction in a Dockerfile creates a new layer. Chaining the commands that install your dependencies into a single RUN instruction therefore produces only one layer for all of them.
Aaron Kalin also recommends organizing the names of packages to be installed in alphabetical order with a single package per line (easier to maintain and reorganize).
For example, let's take a Dockerfile that installs all the packages in a single RUN instruction:
FROM ubuntu:bionic
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    git \
    nginx \
    python
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
The following history is obtained:
$ docker history docker_1_layer:latest
IMAGE          CREATED          CREATED BY                                      SIZE     COMMENT
6892a0a503de   18 seconds ago   /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon…    0B
374bdfdad2b2   18 seconds ago   /bin/sh -c #(nop) EXPOSE 80                     0B
3f7201caacaa   20 seconds ago   /bin/sh -c apt-get update && apt-get install…   189MB
81bcf752ac3d   8 days ago       /bin/sh -c #(nop) CMD ["/bin/bash"]             0B
<missing>      8 days ago       /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B
<missing>      8 days ago       /bin/sh -c [ -z "$(apt-get indextargets)" ]     0B
<missing>      8 days ago       /bin/sh -c set -xe && echo '#!/bin/sh' > /…     745B
<missing>      8 days ago       /bin/sh -c #(nop) ADD file:e05689b5b0d51a231…   63.1MB
Now, let's perform the same experiment with a Dockerfile that installs each package in its own RUN instruction:
FROM ubuntu:bionic
RUN apt-get update && apt-get install -y --no-install-recommends curl
RUN apt-get install -y git
RUN apt-get install -y nginx
RUN apt-get install -y python3
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
This gives us the following history:
IMAGE          CREATED          CREATED BY                                      SIZE     COMMENT
0d25db122b31   16 seconds ago   /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon…    0B
3cf4fb051b11   17 seconds ago   /bin/sh -c #(nop) EXPOSE 80                     0B
f736c0e7e9e6   18 seconds ago   /bin/sh -c apt-get install -y python3           29.4MB
c6c35fc73cad   28 seconds ago   /bin/sh -c apt-get install -y nginx             53.3MB
53e8b93b739a   39 seconds ago   /bin/sh -c apt-get install -y git               83.4MB
57e76bf1ae81   52 seconds ago   /bin/sh -c apt-get update && apt-get install…   48.9MB
81bcf752ac3d   8 days ago       /bin/sh -c #(nop) CMD ["/bin/bash"]             0B
<missing>      8 days ago       /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B
<missing>      8 days ago       /bin/sh -c [ -z "$(apt-get indextargets)" ]     0B
<missing>      8 days ago       /bin/sh -c set -xe && echo '#!/bin/sh' > /…     745B
<missing>      8 days ago       /bin/sh -c #(nop) ADD file:e05689b5b0d51a231…   63.1MB
We obtain two images of different sizes: the second one has more layers and is 26MB larger. Note that the second Dockerfile also omits --no-install-recommends on three of its four install commands and installs python3 rather than python, which may account for part of the difference.
$ docker image ls | grep docker
docker_4_layers   latest   0d25db122b31   About a minute ago   278MB
docker_1_layer    latest   6892a0a503de   4 minutes ago        252MB

Lesson 3: Clean up after installing packages
Let's return to our previous example:
FROM ubuntu:bionic
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    git \
    nginx \
    python
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Here, we did not perform any cleanup after installing the packages with apt. However, to further reduce the size of the image, and therefore the time needed to transfer and load it, you can add the following commands:
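rm -rf /var/lib/apt/lists/* && apt clean
These commands must run in the same RUN instruction as the installation, because files deleted in a later layer still take up space in the earlier one. This would give us: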
FROM ubuntu:bionic
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    git \
    nginx \
    python \
    && rm -rf /var/lib/apt/lists/* \
    && apt clean
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
This allows us to compare the size of the image without cleaning: docker_1_layer, and the image after cleaning: docker_1_layer_clean:
$ docker image ls | grep docker
docker_1_layer_clean   latest   494fb62a6e8c   16 seconds ago   216MB
docker_4_layers        latest   0d25db122b31   7 minutes ago    278MB
docker_1_layer         latest   6892a0a503de   10 minutes ago   252MB
The image built with the cleanup step is smaller than the one built without it: 216MB versus 252MB. We have therefore succeeded in reducing the size of our image.

Lesson 4: Install application dependencies separately, at the end of the Dockerfile
Since these dependencies are likely to change from time to time as your code evolves, it is best to place them at the bottom of the Dockerfile. When a layer changes, Docker invalidates the build cache for that layer and for every layer after it, so the later the dependencies are installed, the fewer layers have to be rebuilt.
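As a rough sketch of this ordering for a Python application: the base image tag is the slim one listed in Lesson 1, while the /app directory and the app.py entry point are assumptions made for illustration and do not come from the talk.

```dockerfile
# Base image: the slim tag listed in Lesson 1
FROM python:3.9.1-slim-buster

# Hypothetical application directory (an assumption for this sketch)
WORKDIR /app

# Copy only the dependency manifest first, so that the layer below
# is rebuilt only when requirements.txt itself changes
COPY requirements.txt .

# Install the application dependencies late in the file,
# without keeping pip's download cache in the layer
RUN pip install --no-cache-dir -r requirements.txt

# The application code changes most often, so it is copied last
COPY . .

# Hypothetical entry point (an assumption for this sketch)
CMD ["python", "app.py"]
```

With this layout, editing the application code should only invalidate the final COPY layer, while the dependency layer keeps being served from the build cache until requirements.txt changes.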
Don't forget to tell the installation tool not to keep a cache of what it downloads, for the same reason we cleaned up after apt.
Here is an example for installing Python libraries:
RUN pip install --no-cache-dir -r requirements.txt

Lesson 5: Don't forget to use .dockerignore
Aaron Kalin rightly reminds us to use the .dockerignore file wisely. It lets you exclude files and directories from the build context, and therefore from anything that is copied into the Docker image.
Among the files and directories that people often forget to exclude is .git.
If your code is versioned with Git, there is a hidden .git directory inside your working directory.
It would be a shame to copy it into your Docker image!
Other files that we tend to forget are the Dockerfiles themselves in our working directory.
So this is what our .dockerignore file would look like:
.git
Dockerfile*

Lessons 6 and 7 will be covered in the next article. They concern building images with multi-stage builds and the use of labels.
