Fan Zi joined NetEase in 2016 and has been responsible for several important mobile game projects, focusing on automation, containers, cloud, and related areas.

In the container field, the container image format proposed by Docker has become the de facto standard for container packaging and delivery. So how do you write an elegant Dockerfile? Docker's official documentation answers this question in an article titled "Best practices for writing Dockerfiles".

This article condenses that document, aiming to help you write a good Dockerfile in a short time. It is divided into three parts: first a reference Dockerfile template, then an explanation of why the template is organized this way, and finally some common problems encountered when writing Dockerfiles.

A simple Dockerfile reference template

There are nearly 20 Dockerfile instructions in Docker's official reference, but in practice we rarely use more than 10 of them. The following reference template therefore covers almost all common scenarios.

```dockerfile
FROM base_image:tag            # reference base image (required)
ARG arg_key[=default_value1]   # declare build-time variables
ENV env_key=value2             # declare environment variables

# Build the almost invariable parts first, such as the overall directory structure
COPY src dst
RUN command1 && command2 ...

WORKDIR /path/to/work/dir      # set the working directory

# Build the frequently changing parts last, such as the application itself
COPY src dst
RUN command5 && command6 ...

CMD ["--options"]              # default arguments of the default command at container start
```

Making the image lifecycle efficient

An important advantage of containers is fast iteration, so every stage of the image lifecycle should be kept as simple and efficient as possible.

1. Image build

  • Keep the context compact: on every build, the context is sent to the Docker daemon, so exclude everything irrelevant from it
  • Split into layered images: a complex image is usually split into a base image (shared by many applications and rarely changed) and an application image built `FROM` it, which reduces build steps
  • Leverage the build cache: on every build, the Docker daemon starts from the parent image already in its cache and compares the next instruction against all child images derived from that base image, checking whether any of them was built with exactly the same instruction; if not, the cache is invalidated. To improve the cache hit ratio, order the instructions in a Dockerfile from least to most frequently changed (as in the template above)
  • Reduce the number of layers: instructions such as RUN, COPY, and ADD each generate a layer during build. In older Docker versions, minimizing the number of layers was necessary for performance, so chaining multiple commands in one RUN with `&&` is a common technique (as in the template above)
  • Use multi-stage builds: a newer feature, covered later
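A minimal `.dockerignore` sketch for keeping the context compact (the entries are hypothetical examples, not part of the original template):

```
# .dockerignore — exclude VCS data, build artifacts, and local-only files
.git
*.log
tmp/
node_modules/
docs/
```

Files matched here are never sent to the daemon, so they also cannot invalidate the cache of COPY/ADD instructions.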

2. Image pull

Docker's official documentation describes in detail how images and containers are stored on the host: docs.docker.com/storage/sto… To put it simply:

  • Image layers are read-only and are shared by all containers using the same image. An image is organized as a stack of layers:
    • Each layer has its own ID
    • If different images contain a layer with the same ID, they share one copy
  • The container layer is writable and copy-on-write; it holds the changes a container makes at run time
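This layering can be inspected directly; a CLI sketch (requires a running Docker daemon, image name is arbitrary):

```shell
# Show the instructions and layers an image is built from
docker history ubuntu:18.04

# Show the layer digests and the storage driver in use
docker image inspect --format '{{.RootFS.Layers}}' ubuntu:18.04
docker info --format '{{.Driver}}'
```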

Based on how images are stored, we can also speed up image pulls:

  • Layered images: as with builds, building on a base image already stored locally reduces the amount of data to pull
  • Reuse identical layers: similar to the build cache, layers already stored locally do not need to be pulled again
  • Image pre-warming: pull images in advance, or during idle time
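A minimal pre-warming sketch (requires a Docker daemon; the registry and image names are hypothetical), e.g. run from cron during off-peak hours:

```shell
#!/bin/sh
# Pull the images each node will need, in parallel, before they are requested
for img in registry.example.com/base:1.0 registry.example.com/app:latest; do
    docker pull "$img" &
done
wait
```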

Q&A

1. Instructions in a Dockerfile are executed one by one and are independent of each other

```dockerfile
# Wrong: the cd only affects the RUN it belongs to;
# the second RUN still executes in the original WORKDIR
RUN cd /some/dir
RUN bash script.sh

# Right: either combine the commands in one RUN ...
RUN cd /some/dir && bash script.sh
# ... or use an absolute path
RUN bash /some/dir/script.sh
```

2. Beware of “excessive” caching

As mentioned above, each instruction in Dockerfile is executed one by one and is independent of each other. Most instructions will generate a layer in build and will be cached. This mechanism works fine for the most part, but sometimes it can cause problems:

```dockerfile
# Dockerfile1
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y nginx

# Dockerfile2: only the install line was changed
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y nginx curl
```

As above, suppose Dockerfile1 was used for a while and then changed into Dockerfile2 (only the install line differs). Because of caching (assuming the previous build's cache is still there), when Dockerfile2 is built the update line is not actually executed; its cached layer is reused instead. The nginx and curl installed at this point may therefore not be the latest versions.

The fix is to combine update and install in a single RUN (optionally pinning versions), so that changing the package list also invalidates the update cache:

```dockerfile
RUN apt-get update && apt-get install -y \
    curl \
    nginx=1.16.* \
 && rm -rf /var/lib/apt/lists/*
```

3. ARG and ENV

Both directives can be used to define variables, but there are a number of caveats to their use:

  • An ARG declared before FROM can only be used in FROM; to use it after FROM, it must be re-declared:

```dockerfile
ARG key=value        # declared before FROM: usable only in FROM
FROM xxx:${key}
ARG key              # re-declare (without a value) to use it after FROM
```
  • The scope of an ARG variable is the instructions after the ARG declaration in its build stage
  • An ENV variable is visible from the instruction that declares it through the rest of the build, is baked into the image, and is also available at container run time
  • ARG variables cannot be used in CMD and ENTRYPOINT, which run at container start; ENV variables there are expanded by the shell at run time (shell form), not by the builder
  • When an ARG and an ENV variable share the same name (regardless of declaration order), the ENV value overrides the ARG value
  • ENV generates an intermediate layer that is baked into the image and cannot be removed even with unset, for example:
```dockerfile
# ENV persists into the image; unset at build time does not remove it
FROM alpine
ENV ADMIN_USER="mark"
RUN echo $ADMIN_USER > ./mark
RUN unset ADMIN_USER      # only affects this RUN's shell, not the image
CMD sh
# docker run --rm test sh -c 'echo $ADMIN_USER'   -> prints "mark"

# Workaround: use a plain shell variable within a single RUN
FROM alpine
RUN export ADMIN_USER="mark" \
 && echo $ADMIN_USER > ./mark \
 && unset ADMIN_USER
CMD sh
# docker run --rm test sh -c 'echo $ADMIN_USER'   -> prints nothing
```
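As a usage sketch for ARG (requires a Docker daemon; the image and variable names are hypothetical), a declared ARG default can be overridden on the command line:

```shell
# Override the Dockerfile's default value at build time
docker build --build-arg arg_key=custom_value -t myimage .

# The ARG value exists only at build time; it does not survive into the
# running container unless it was copied into an ENV variable
```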

4. COPY and ADD

The two instructions are almost identical; when you simply want to copy files from the local context into the image, just use COPY.

Note the following rules when using COPY and ADD:

  • Mind file ownership: you can change the owner and group while copying, with `COPY/ADD [--chown=<user>:<group>] <src> <dest>`
  • If you are unsure how directories and trailing slashes affect these two instructions, the simplest rule is to add a trailing slash to every directory, e.g. `COPY <src_dir>/ <dest_dir>/`, because:
    • If `<src>` is a directory, only the files inside it are copied, with or without a trailing slash; to copy the directory itself, include it in `<dest>` (e.g. `COPY src_dir /dest/src_dir`)
    • If `<dest>` is a directory, it needs a trailing slash for files to be copied into it
  • `<src>` must be inside the context; you cannot use `../` to escape the context

Besides everything COPY can do, the ADD instruction also has the following features, which should be used sparingly unless really needed:

  • When `<src>` is a local tar file in a recognized compression format, it is automatically unpacked into `<dest>`
  • `<src>` can be a URL, in which case the file is fetched from the remote location

5. CMD and ENTRYPOINT

Here’s another pair of similar commands that need to be used with caution:

  • CMD, when used alone, specifies the command to be executed by default when the container is started
  • ENTRYPOINT, when used alone, can completely replace CMD
  • When ENTRYPOINT is used with CMD, CMD becomes the default parameter of ENTRYPOINT
  • Prefer the exec form for ENTRYPOINT/CMD, i.e. `ENTRYPOINT ["entry.app", "arg"]`, because the shell form (`ENTRYPOINT entry.app arg`) starts an additional shell process

The following table (reproduced from the official Docker documentation) lists the effects of the various combinations of CMD and ENTRYPOINT:

|                            | No ENTRYPOINT              | ENTRYPOINT exec_entry p1_p2 | ENTRYPOINT ["exec_entry", "p1_p2"]         |
| -------------------------- | -------------------------- | --------------------------- | ------------------------------------------ |
| No CMD                     | error, not allowed         | /bin/sh -c exec_entry p1_p2 | exec_entry p1_p2                           |
| CMD ["exec_cmd", "p1_cmd"] | exec_cmd p1_cmd            | /bin/sh -c exec_entry p1_p2 | exec_entry p1_p2 exec_cmd p1_cmd           |
| CMD exec_cmd p1_cmd        | /bin/sh -c exec_cmd p1_cmd | /bin/sh -c exec_entry p1_p2 | exec_entry p1_p2 /bin/sh -c exec_cmd p1_cmd |
In addition, arguments appended to the end of docker run replace CMD and become the actual arguments to ENTRYPOINT:

```dockerfile
# Image test_entrypoint
ENTRYPOINT ["/entry.app"]
CMD ["--help"]

# docker run test_entrypoint        -> executes /entry.app --help
# docker run test_entrypoint -a -t  -> executes /entry.app -a -t
```

6. multi-stage builds

Docker 17.05 and later support a new way of building: multi-stage builds. Unlike the traditional approach, a multi-stage build can use multiple FROM instructions to split the build into several stages:

  • By naming the stages, you can manage images for debug, test, production, and other environments in a single Dockerfile
  • With `COPY --from=stage_name` you can copy files from an intermediate stage into the target stage, producing a smaller final image

For example, the template above can be optimized with a multi-stage build. Assuming the final image only needs entry.app and its runtime environment, but not its compile environment, we can shrink the resulting image as follows:

```dockerfile
# Multi-stage build: name the first stage "builder";
# it contains the compile environment for entry.app
FROM base_image:tag AS builder
ARG arg_key[=default_value1]   # declare build-time variables
ENV env_key=value2             # declare environment variables
COPY src dst
RUN command1 && command2 ...
WORKDIR /path/to/work/dir      # set the working directory
COPY src dst
RUN compile_entry_app          # compile entry.app

# Final stage: only the runtime environment and the compiled entry.app
FROM runtime_image:tag
WORKDIR /path/to/work/dir
COPY --from=builder /path/to/work/dir/entry.app .
CMD ["--options"]              # default arguments of the default command at container start
```
