Building container images is a routine task at work, and today's article covers best practices for building them.

Dockerfile: Docker's image build definition file.

I. Dockerfile

A Dockerfile is a configuration file that defines the automated image build process in Docker. It contains the commands and other operations to execute during the build, so it shows the production process of a given Docker image clearly and explicitly. Because it is just a small text file, it transfers quickly over the network and other media, which makes container migration and cluster deployment faster.

In general, a Dockerfile is a file named Dockerfile with no extension. It is essentially a text file, so it can be created and edited with any common text editor or IDE.

The content of a Dockerfile is simple and comes in two forms: comment lines and instruction lines. Dockerfile has its own instruction syntax, which describes the steps to execute during the image build. An instruction line consists of an instruction and its corresponding parameters.

A common process in software development makes a good analogy for the Dockerfile.

In a complete development, testing, and deployment workflow, the program's runtime environment is usually defined by the developers, because they are the most familiar with how the program runs and are best placed to build a suitable environment for it.

On this premise, to help testing and O&M set up the same runtime environment, it is common practice for developers to write an environment setup manual so that testers and O&M personnel understand the setup process.

A Dockerfile is like such a setup manual, because it records the process of building a container environment.

Better still, a Dockerfile can be executed automatically by the container engine: testers and operations staff neither need to understand the details of every piece of software in the environment nor carry out each build step by hand.

Using Dockerfile has many advantages over committing container changes and then migrating the image:

  • Dockerfiles are much smaller than image packages, making them easier to migrate and deploy quickly
  • The environment construction process is recorded in Dockerfile, and the order and logic of image construction can be seen intuitively
  • Building images using Dockerfile makes it easier to automate processes such as automatic deployment
  • When setup details change, it is easier to modify the Dockerfile than to rebuild by hand

In real development, images are rarely built by committing container changes; Dockerfile is used almost exclusively.

II. Environment setup and image building

A. Writing a Dockerfile

The following is the complete Dockerfile used to build the official Redis image provided by Docker.

FROM debian:stretch-slim

RUN groupadd -r redis && useradd -r -g redis redis

ENV GOSU_VERSION 1.10
RUN set -ex; \
    \
    fetchDeps=" \
        ca-certificates \
        dirmngr \
        gnupg \
        wget \
    "; \
    apt-get update; \
    apt-get install -y --no-install-recommends $fetchDeps; \
    rm -rf /var/lib/apt/lists/*; \
    \
    dpkgArch="$(dpkg --print-architecture | awk -F- '{ print $NF }')"; \
    wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch"; \
    wget -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch.asc"; \
    export GNUPGHOME="$(mktemp -d)"; \
    gpg --keyserver ha.pool.sks-keyservers.net --recv-keys B42F6819007F00F88E364FD4036A9C25BF357DD4; \
    gpg --batch --verify /usr/local/bin/gosu.asc /usr/local/bin/gosu; \
    gpgconf --kill all; \
    rm -r "$GNUPGHOME" /usr/local/bin/gosu.asc; \
    chmod +x /usr/local/bin/gosu; \
    gosu nobody true; \
    \
    apt-get purge -y --auto-remove $fetchDeps

ENV REDIS_VERSION 3.2.12
ENV REDIS_DOWNLOAD_URL http://download.redis.io/releases/redis-3.2.12.tar.gz
ENV REDIS_DOWNLOAD_SHA 98c4254ae1be4e452aa7884245471501c9aa657993e0318d88f048093e7f88fd

RUN set -ex; \
    \
    buildDeps=' \
        wget \
        \
        gcc \
        libc6-dev \
        make \
    '; \
    apt-get update; \
    apt-get install -y $buildDeps --no-install-recommends; \
    rm -rf /var/lib/apt/lists/*; \
    \
    wget -O redis.tar.gz "$REDIS_DOWNLOAD_URL"; \
    echo "$REDIS_DOWNLOAD_SHA *redis.tar.gz" | sha256sum -c -; \
    mkdir -p /usr/src/redis; \
    tar -xzf redis.tar.gz -C /usr/src/redis --strip-components=1; \
    rm redis.tar.gz; \
    \
# disable Redis protected mode [1] as it is unnecessary in context of Docker
# (ports are not automatically exposed when running inside Docker, but rather explicitly by specifying -p / -P)
# [1]: https://github.com/antirez/redis/commit/edd4d555df57dc84265fdfb4ef59a4678832f6da
    grep -q '^#define CONFIG_DEFAULT_PROTECTED_MODE 1$' /usr/src/redis/src/server.h; \
    sed -ri 's!^(#define CONFIG_DEFAULT_PROTECTED_MODE) 1$!\1 0!' /usr/src/redis/src/server.h; \
    grep -q '^#define CONFIG_DEFAULT_PROTECTED_MODE 0$' /usr/src/redis/src/server.h; \
# for future reference, we modify this directly in the source instead of just supplying a default configuration flag because apparently "if you specify any argument to redis-server, [it assumes] you are going to specify everything"
# see also https://github.com/docker-library/redis/issues/4#issuecomment-50780840
# (more exactly, this makes sure the default behavior of "save on SIGTERM" stays functional by default)
    \
    make -C /usr/src/redis -j "$(nproc)"; \
    make -C /usr/src/redis install; \
    \
    rm -r /usr/src/redis; \
    \
    apt-get purge -y --auto-remove $buildDeps

RUN mkdir /data && chown redis:redis /data
VOLUME /data
WORKDIR /data

COPY docker-entrypoint.sh /usr/local/bin/
ENTRYPOINT ["docker-entrypoint.sh"]

EXPOSE 6379
CMD ["redis-server"]

B. Dockerfile structure

When the build command is invoked, Docker parses the instructions in the Dockerfile one by one and performs a different action for each according to its meaning.

Dockerfile directives can be divided into five simple categories:

  • Base instructions – define the base and properties of the new image
  • Control instructions – the core part; they guide the build and describe the commands to execute while the image is built
  • Import instructions – import external files directly into the image being built
  • Execution instructions – specify the scripts or commands to run at startup in containers created from the image
  • Configuration instructions – configure the image's network, users, and so on

III. Common Dockerfile instructions

The following common Dockerfile instructions cover 90% of everyday needs.

A. FROM

Typically, instead of building an image from scratch, an existing image will be selected as the base for the new image.

In a Dockerfile, the base image is specified with the FROM instruction, and all subsequent instructions build on it. During the build, Docker first obtains the given base image and then performs the build operations on top of it.

The FROM directive supports three forms:

FROM <image> [AS <name>]
FROM <image>[:<tag>] [AS <name>]
FROM <image>[@<digest>] [AS <name>]

Selecting a base image is fundamental to building a new image, so the first instruction in a Dockerfile must be FROM: without a base image, nothing can be built. The FROM instruction can also appear more than once in a Dockerfile. Each additional FROM starts a new build stage (a multi-stage build), and later stages can selectively copy in results from earlier ones.
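As an illustrative sketch of multiple FROM instructions in one file (the image names and paths here are hypothetical, not from the article):

```dockerfile
# Stage 1: compile the program in a full toolchain image
FROM golang:1.21 AS builder
WORKDIR /src
COPY . .
RUN go build -o /out/app .

# Stage 2: start over from a minimal base and copy in only the artifact
FROM debian:stretch-slim
COPY --from=builder /out/app /usr/local/bin/app
CMD ["app"]
```

The final image contains only the second stage, so the compiler and sources never ship with it.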

B. RUN

Although the image is built by following instructions, the instructions are only guidance; most of the actual work is done by commands sent to the shell, and the RUN instruction is what sends those commands.

The commands to execute are written directly after RUN. At build time, Docker runs them and records their modifications to the file system, which become changes in the image.

RUN <command>
RUN ["executable", "param1", "param2"]

RUN supports line continuation: if a single line grows too long, it can be broken across lines with a backslash for readability.
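Both RUN forms, and the line continuation, can be sketched as follows (the package name is illustrative):

```dockerfile
# Shell form: executed via /bin/sh -c; backslashes continue long lines
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Exec form: arguments given as a JSON array, no shell involved
RUN ["/bin/bash", "-c", "echo hello"]
```

Note that the exec form does not perform variable expansion, since no shell processes the line.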

C. ENTRYPOINT and CMD

When a container is started from an image, process 1 of the container is launched according to a command defined by the image. That command is defined in the Dockerfile through ENTRYPOINT and CMD.

ENTRYPOINT ["executable", "param1", "param2"]
ENTRYPOINT command param1 param2

CMD ["executable","param1","param2"]
CMD ["param1","param2"]
CMD command param1 param2

The ENTRYPOINT instruction is similar to CMD: both give commands to execute, and either can be omitted from a Dockerfile.

If both ENTRYPOINT and CMD are given, the contents of CMD are used as arguments to the command defined by ENTRYPOINT; in the end, the command specified by ENTRYPOINT is what gets executed.
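This interaction can be illustrated with a small hypothetical image:

```dockerfile
FROM debian:stretch-slim
ENTRYPOINT ["ping", "-c", "3"]
CMD ["localhost"]
# docker run <image>              -> runs: ping -c 3 localhost
# docker run <image> example.com  -> runs: ping -c 3 example.com
```

Arguments passed after the image name on `docker run` replace CMD, while ENTRYPOINT stays fixed.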

D. EXPOSE

Since the image author best understands the logic of the application inside the image and which ports it needs in order to receive and handle requests, it makes more sense to define port exposure in the image itself.

With the EXPOSE instruction, you can specify the ports the image exposes.

EXPOSE <port> [<port>/<protocol>...]

With EXPOSE, the image's port-exposure definition is configured, so containers created from the image can be reached on those ports directly by other containers connected to them via the --link option.
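A short sketch of the syntax (the port numbers are illustrative):

```dockerfile
# Declare that the service listens on 6379/tcp (tcp is the default protocol)
EXPOSE 6379
# A protocol can be specified explicitly
EXPOSE 53/udp
```

EXPOSE is declarative metadata: it does not publish the ports to the host by itself; that still requires -p or -P when the container is run.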

E. VOLUME

In some applications, data needs to be persisted; for example, the folder where a database stores its data must be handled separately. Data volumes solve these problems.

Normally, data volumes are defined with the -v option when the container is created. Because users may not know the image well, they sometimes miss creating a data volume, which causes unnecessary trouble.

The person who created the image knows best how the program works, so it is best for the image author to define the data volumes. In a Dockerfile, the VOLUME instruction defines the data volumes automatically created for containers based on the image.

VOLUME ["/data"]

Directories listed in the VOLUME instruction are automatically created as data volumes when a container is created from the new image, with no need for the -v option.
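The official Redis Dockerfile shown earlier uses exactly this pattern for its data directory:

```dockerfile
RUN mkdir /data && chown redis:redis /data
VOLUME /data
WORKDIR /data
```

Every container started from the image then gets an anonymous volume mounted at /data, so database files survive outside the container's writable layer.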

F. COPY and ADD

When creating a new image, you may need to bring software configuration, program code, or executable scripts directly into the image's file system. With the COPY or ADD instruction, content can be copied from the host's file system into the image's.

COPY [--chown=<user>:<group>] <src>... <dest>
ADD [--chown=<user>:<group>] <src>... <dest>

COPY [--chown=<user>:<group>] ["<src>",... "<dest>"]
ADD [--chown=<user>:<group>] ["<src>",... "<dest>"]

COPY and ADD are written the same way. The difference is that ADD can use a network URL as the src source and automatically extracts source files it recognizes as archives, while COPY has neither capability.

While that may make COPY sound weaker, COPY is the simpler choice for scenarios where you do not want source files extracted or have no need to fetch from the network.
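A sketch contrasting the two (the file names and URL are illustrative):

```dockerfile
# COPY: a plain file copy from the build context
COPY app.jar /opt/app/app.jar

# ADD: can fetch from a URL (the downloaded file is NOT auto-extracted)
ADD https://example.com/config.tgz /tmp/config.tgz

# ADD: a local tar archive is auto-extracted into the target directory
ADD sources.tar.gz /usr/src/app/
```

Because ADD's extra behaviors can surprise readers, a common convention is to prefer COPY unless one of those behaviors is specifically needed.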

IV. Building the image

After writing the Dockerfile, use the docker build command to build the image.

docker build ./webapp

The argument to docker build is a directory path (a local path or URL) that serves as the build context. For example, when COPY or ADD copies files into the new image, their paths are resolved relative to this directory.

By default, docker build looks for a file named Dockerfile in that directory and uses it as the source of the build. If the Dockerfile lives elsewhere or has a different name, the -f option specifies its path separately.

docker build -t webapp:latest -f ./webapp/a.Dockerfile ./webapp

Pass the -t option at build time to name the newly generated image.

docker build -t webapp:latest ./webapp

A. Using variables in a build

When actually writing a Dockerfile, environment-setup instructions account for most of the content, and setting up the program's runtime environment inevitably involves variables such as dependency versions and compilation parameters.

These values can be written directly into the Dockerfile, but they tend to change frequently. Instead, the ARG instruction creates a build argument: its value is passed in by the build command at build time and can be used throughout the Dockerfile.

For example, to control a program's version through a build argument and install that version at build time, use an ARG-defined variable as a placeholder for the version string.

FROM debian:stretch-slim
ARG TOMCAT_MAJOR
ARG TOMCAT_VERSION
RUN wget -O tomcat.tar.gz "https://www.apache.org/dyn/closer.cgi?action=download&filename=tomcat/tomcat-$TOMCAT_MAJOR/v$TOMCAT_VERSION/bin/apache-tomcat-$TOMCAT_VERSION.tar.gz"

In the above example, the version number of Tomcat is defined as a parameter variable through the ARG directive. When the Tomcat package is downloaded, the version number in the download address is replaced with the variable. With this definition, it is easy to switch Tomcat versions and rebuild the image without making significant changes to the Dockerfile.

To build a Tomcat image from this Dockerfile, set the build arguments at build time with the --build-arg option of docker build.

docker build --build-arg TOMCAT_MAJOR=8 --build-arg TOMCAT_VERSION=8.0.53 -t tomcat:8.0 ./tomcat

B. Environment variables

Environment variables are also used to define parameters. Similar to the ARG directive, environment variables are defined using the ENV directive.

FROM debian:stretch-slim
ENV TOMCAT_MAJOR 8
ENV TOMCAT_VERSION 8.0.53
RUN wget -O tomcat.tar.gz "https://www.apache.org/dyn/closer.cgi?action=download&filename=tomcat/tomcat-$TOMCAT_MAJOR/v$TOMCAT_VERSION/bin/apache-tomcat-$TOMCAT_VERSION.tar.gz"

Environment variables are used in the same way as parameter variables to directly replace the contents of instruction parameters.

Unlike build arguments, which only affect the build, environment variables affect both the build and the containers created from the image. Setting them essentially defines operating-system environment variables, so the running container carries them too, and programs inside the container can read their values.

Another difference is that an environment variable's value is not passed on the build command line but written in the Dockerfile, so changing it means editing the Dockerfile. Even so, placing ENV definitions near the top of the Dockerfile where they are easy to find makes it quick to switch things in the image's environment.

Since environment variables remain in effect when the container runs, they can also be overridden at that point. Use the -e or --env option when creating the container to modify an environment variable's value or define a new one.

docker run -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql:5.7

This usage is very common in development. Because it allows runtime configuration, environment variables, and the ENV instruction that defines them, are the preferred way to work with variables.

Variables defined by both ENV and ARG are referenced in the form $NAME, so their definitions can collide. In that case, a variable defined by ENV always overrides one defined by ARG, regardless of the order in which they appear.
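A small sketch of this precedence rule (values illustrative):

```dockerfile
FROM debian:stretch-slim
ARG TOMCAT_VERSION=7.0.0
ENV TOMCAT_VERSION 8.0.53
# Even when built with --build-arg TOMCAT_VERSION=9.0.0, the line below
# sees 8.0.53: the ENV value overrides the ARG of the same name
RUN echo "version: $TOMCAT_VERSION"
```

To let a build argument flow into the final image, a common pattern is ENV TOMCAT_VERSION=$TOMCAT_VERSION, i.e. seed the environment variable from the argument rather than hard-coding it.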

C. Merging commands

In the Redis Dockerfile above, a great deal of code is packed into single RUN instructions.

In fact, for the resulting environment, there is little difference between the following two ways of writing it.

RUN apt-get update; \
    apt-get install -y --no-install-recommends $fetchDeps; \
    rm -rf /var/lib/apt/lists/*;

RUN apt-get update
RUN apt-get install -y --no-install-recommends $fetchDeps
RUN rm -rf /var/lib/apt/lists/*

What looks like one continuous image build is actually a series of small steps. Each time an instruction that changes the file system is executed, Docker starts a container from the result of the previous step, runs the instruction's commands inside it, and packages the result as an image layer; repeating this eventually forms the image.

Because of this, most images combine commands into a single instruction. This not only reduces the number of image layers but also reduces how many containers are created and discarded during the build, which speeds it up.

D. Build cache

During an image build, Docker also supports caching to improve build speed.

Since an image is a stack of layers produced by individual instructions, if a layer about to be built is judged identical to an existing one, the previous build result can be reused instead of re-executing the instruction. That is the principle of the build cache.

Based on this, when possible it is recommended to put build steps that rarely change near the top of the Dockerfile, making full use of the cache to speed up the build. It also follows that instructions should not be merged excessively: volatile and stable steps should be split into different instructions.
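A sketch of cache-friendly ordering (the base image and file names are illustrative):

```dockerfile
FROM node:16
WORKDIR /app
# Rarely-changing steps first: dependency manifests and install.
# As long as these files are unchanged, this layer comes from cache.
COPY package.json package-lock.json ./
RUN npm ci
# Frequently-changing application source last, so edits invalidate
# the cache only from this point onward
COPY . .
CMD ["node", "server.js"]
```

Reversing the order (copying all sources before installing dependencies) would force a full dependency reinstall on every source edit.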

When you do not want Docker to use the build cache, disable it with the --no-cache option.

docker build --no-cache ./webapp

V. Learning from examples

When writing Dockerfiles, reading and thinking about existing ones is essential.

Docker Hub, Docker's official central image repository, offers not only a rich set of images but also, for most of them, the Dockerfile itself, which makes it a great place to learn.


Translated from: zhuanlan.zhihu.com/p/57335983