In this article, we build a Docker image, start a container from it, and along the way examine Docker file storage and Dockerfile optimization strategies in detail.

Before we get started, let's introduce a concept called the union file system. The union file system is the technical basis of Docker images. It records each modification to the file system as a single commit and stacks these commits layer by layer, and it can mount different directories into the same virtual file system. Layered storage and image inheritance are built on this feature.
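The layering idea can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how Docker or the kernel implements it: each layer is a plain dictionary, and a value of None stands in for a deletion marker. The function name and the file paths are hypothetical.

```python
def union_view(layers):
    """Merge layers from bottom to top into one virtual file system.

    Each layer maps path -> content. A value of None marks a file
    that was deleted in that layer (a stand-in for a whiteout).
    """
    merged = {}
    for layer in layers:  # lowest layer first
        for path, content in layer.items():
            if content is None:
                merged.pop(path, None)  # a deletion hides the lower file
            else:
                merged[path] = content  # upper layers override lower ones
    return merged


# Hypothetical layers, loosely modeled on the build described in this article
base = {"/etc/os-release": "debian", "/bin/bash": "bash"}
add_jdk = {"/usr/local/nim/jdk.tar.gz": "archive"}
unpack = {
    "/usr/local/nim/jdk/bin/java": "java",
    "/usr/local/nim/jdk.tar.gz": None,  # removed in this layer
}

view = union_view([base, add_jdk, unpack])
print(sorted(view))
```

The archive is missing from the merged view even though the middle layer still contains it, which is the behavior the rest of the article examines on disk.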

The official Docker diagram of the file system shows how the union file system joins the image layers with the container layer.

Docker supports a variety of union file systems and storage drivers, including aufs, devicemapper, overlay, and overlay2. The system used in this article is Debian 9.1 with Docker 17.06.2-ce, which uses overlay2 by default.

Now that you have a basic idea of the Docker file system, let's get hands-on and look more closely at how the layered file system is stored.

Image layer

This is a JDK 8 base image built on a Debian system image for the Yunxin private-deployment project. To make it easier to read and analyze, the Dockerfile has been simplified and keeps only the core content.

FROM hub.c.163.com/library/debian:stretch
MAINTAINER nim
# Download the JDK
ADD http://10.173.11.100/nim/jdk-8u202-linux-x64.tar.gz /usr/local/nim/
# Unpack the JDK and delete the archive
RUN tar -xzvf /usr/local/nim/jdk-8u202-linux-x64.tar.gz -C /usr/local/nim/ \
    && rm /usr/local/nim/jdk-8u202-linux-x64.tar.gz
# Set environment variables
ENV JAVA_HOME=/usr/local/nim/jdk1.8.0_202
ENV PATH=$JAVA_HOME/bin:$PATH
CMD ["/bin/bash"]

Build the image from this Dockerfile and check the result. The original base image is 100M, and the image after the build is 697M.

Image storage

Now let's look at how the built image is stored at the file level. First, we use docker history to inspect the image we just built. The base image occupies 100M, and the two new image layers occupy 194M and 403M.

Next, let's look at how this is stored in the file system. This environment uses overlay2, and Docker stores image layers by default under /var/lib/docker/overlay2/. The storage directory contains four directories. The 110M one corresponds to the base image, and two others are the ADD JDK layer (186M) and the layer that unpacks the JDK archive (389M).

The l directory contains symbolic links to all layers. The links use short names to avoid hitting the page size limit on the arguments passed when mounting.

Let's look at the contents of each layer. The base image layer contains the diff folder, which stores the content of the current layer, and the link file, which records the layer's short name.

Next, look at the layer generated by ADD JDK. Its diff folder holds the JDK archive. Compared with the base image layer, this layer has three additional entries: lower, merged, and work. The lower file records the IDs of the layers beneath it (here, the base image layer). The merged directory provides the unified view. The work directory is the working directory specified for the union mount; it is used internally and is not visible to the user.
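These entries map onto the options of an overlay mount: the lower layers become lowerdir, the layer's diff becomes upperdir, work becomes workdir, and merged is the mount point. As a rough sketch, the option string can be assembled as follows. The helper name and the short names AAAA and BBBB are hypothetical, and the paths are relative only for illustration.

```python
def overlay_mount_options(lower_dirs, upper_dir, work_dir):
    """Build the option string for an overlay mount.

    lower_dirs is ordered top-most layer first, which is the order
    the overlay file system expects in lowerdir.
    """
    return "lowerdir={},upperdir={},workdir={}".format(
        ":".join(lower_dirs), upper_dir, work_dir
    )


# Hypothetical short names, in the style of the links under the l directory
opts = overlay_mount_options(["l/AAAA", "l/BBBB"], "diff", "work")
print(opts)  # lowerdir=l/AAAA:l/BBBB,upperdir=diff,workdir=work
```

Because every lower layer appears in this one string, short link names keep it compact as the number of layers grows.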

The folder structure of the layer that unpacks the JDK is similar to that of the previous layer. In this layer the JDK archive occupies 0 bytes, which indicates that the archive has been deleted.

The size of an image equals the sum of all its layers, so a JDK archive removed in a later layer still takes up storage space. This is not what we want, and it is where Dockerfile optimization comes in. The optimized Dockerfile looks like this:
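The arithmetic behind this can be checked with the sizes reported earlier. The sketch below sums the three layers. The figure for a single-layer build is only an estimate, assuming the unpacked JDK still takes 403M when the archive is never stored in a layer of its own.

```python
# Layer sizes in MB, as reported by docker history in this article
layers = [
    ("base image", 100),
    ("ADD JDK archive", 194),
    ("RUN tar + rm", 403),
]

image_size = sum(size for _, size in layers)
print(image_size)  # 697

# Estimate only: if the archive is downloaded, unpacked, and deleted
# within one layer, the 194 MB archive layer no longer exists.
archive_size = dict(layers)["ADD JDK archive"]
estimated_optimized_size = image_size - archive_size
print(estimated_optimized_size)  # 503
```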

FROM hub.c.163.com/library/debian:stretch
MAINTAINER nim
# Download, unpack, and delete the JDK archive in a single layer
RUN mkdir -p /usr/local/nim \
    && curl -o /usr/local/nim/jdk-8u202-linux-x64.tar.gz http://10.173.11.100/nim/jdk-8u202-linux-x64.tar.gz \
    && tar -xzvf /usr/local/nim/jdk-8u202-linux-x64.tar.gz -C /usr/local/nim/ \
    && rm /usr/local/nim/jdk-8u202-linux-x64.tar.gz
# ENV is used here because variables exported inside RUN last only for that build step
ENV JAVA_HOME=/usr/local/nim/jdk1.8.0_202
ENV PATH=$JAVA_HOME/bin:$PATH
CMD ["/bin/bash"]
With this optimized Dockerfile in mind, here are the points that can be optimized for time and space when building Docker images:
  1. Merge RUN statements: combining build statements of the same type effectively reduces the number of image layers.
  2. Use the image build cache: put fixed content such as time synchronization and basic software installation early in the Dockerfile, so that rebuilds reuse the cache and save time.
  3. Clean up intermediate artifacts: make sure that software and archives needed only during installation are removed in the same layer. Otherwise they still occupy image space.
  4. Optimize build statements: for example, ADD unpacks local archives directly, doing the job of COPY + RUN tar.
  5. Optimize the base mirror source: universities and large IT companies in China run mirror sites, and choosing a stable, up-to-date one can effectively shorten build time.
For example, strategies 1 and 3 are applied in the Dockerfile above. ADD is replaced with curl, and the download, unpacking, and deletion are combined into one layer. This reduces the number of layers and cleans up the JDK archive used as an intermediate artifact.
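To compare two Dockerfiles quickly, one rough approach is to count the instructions that add file system content. The helper below is a hypothetical sketch, not a Docker tool. It assumes that only RUN, ADD, and COPY produce layers with content, and it handles backslash line continuations in the simplest possible way.

```python
LAYER_INSTRUCTIONS = {"RUN", "ADD", "COPY"}


def count_content_layers(dockerfile_text):
    """Count instructions that add file system content to an image."""
    count = 0
    continued = False  # True while inside a backslash-continued instruction
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not continued:
            keyword = line.split(None, 1)[0].upper()
            if keyword in LAYER_INSTRUCTIONS:
                count += 1
        continued = line.endswith("\\")
    return count


original = r"""
FROM hub.c.163.com/library/debian:stretch
ADD http://10.173.11.100/nim/jdk-8u202-linux-x64.tar.gz /usr/local/nim/
RUN tar -xzvf /usr/local/nim/jdk-8u202-linux-x64.tar.gz -C /usr/local/nim/ \
    && rm /usr/local/nim/jdk-8u202-linux-x64.tar.gz
"""

optimized = r"""
FROM hub.c.163.com/library/debian:stretch
RUN curl -o /usr/local/nim/jdk-8u202-linux-x64.tar.gz http://10.173.11.100/nim/jdk-8u202-linux-x64.tar.gz \
    && tar -xzvf /usr/local/nim/jdk-8u202-linux-x64.tar.gz -C /usr/local/nim/ \
    && rm /usr/local/nim/jdk-8u202-linux-x64.tar.gz
"""

print(count_content_layers(original))   # 2
print(count_content_layers(optimized))  # 1
```

A lower count is not a goal in itself. It only makes visible where an intermediate file could end up stored in a layer of its own.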

Is building an image really about having as few layers as possible? Not always, especially when early versions of the image are unstable or iterate frequently. Sensible layering can reduce build time, lower the chance of errors, and make Dockerfiles more readable. The image can be optimized further once a stable version has taken shape.

Image metadata

When analyzing image metadata, we focus on three directories:

/var/lib/docker/image/overlay2/imagedb/
/var/lib/docker/image/overlay2/layerdb/
/var/lib/docker/overlay2/
The first directory stores the basic metadata of images, the second stores the metadata of image layers, and the third is the layered storage directory discussed above, which holds the actual layer content. Let's look at how the metadata and the stored content are related in practice.

Docker stores the basic information of an image under /var/lib/docker/image/overlay2/imagedb/content/sha256/. Using the image ID shown by docker images, you can find the file in this directory whose name starts with that ID. The file stores, as JSON, the image's layered file system, build information, associated container, and more.
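The JSON file lists the image's layers as content digests (the diff_ids under rootfs), while the entries under layerdb are, as far as we understand, named by chain IDs derived from those digests. The sketch below follows the chain ID rule from the OCI image specification. The function name and the sample digests are made up for illustration.

```python
import hashlib


def chain_ids(diff_ids):
    """Derive chain IDs from an ordered list of layer diff IDs.

    The first chain ID equals the first diff ID. Each later one is the
    SHA-256 of "<previous chain ID> <current diff ID>".
    """
    chain = []
    for diff_id in diff_ids:
        if not chain:
            chain.append(diff_id)
        else:
            data = (chain[-1] + " " + diff_id).encode("utf-8")
            chain.append("sha256:" + hashlib.sha256(data).hexdigest())
    return chain


# Made-up digests standing in for rootfs.diff_ids from an image JSON file
sample_diff_ids = ["sha256:" + "a" * 64, "sha256:" + "b" * 64]
sample_chain = chain_ids(sample_diff_ids)
for chain_id in sample_chain:
    print(chain_id)
```

Under this rule only the bottom layer keeps the same identifier in both places, which is why a diff ID from the image JSON usually cannot be found as a directory name directly.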

The second directory, /var/lib/docker/image/overlay2/layerdb/sha256/, stores the layer metadata. Each layer's metadata directory holds cache-id, diff, size, and other information. The cache-id points to the layer's directory in layered storage, and diff ties the layer back to the image's basic metadata.
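Following cache-id from a metadata entry to its storage directory can be sketched as below. The helper name is hypothetical, and the demo runs against a throwaway directory tree that merely mimics the layout, with made-up identifiers, so it does not need a Docker host.

```python
import os
import tempfile


def layer_storage_dir(layerdb_root, chain_hex, storage_root):
    """Read cache-id from a layerdb entry and return the matching storage path."""
    cache_id_file = os.path.join(layerdb_root, "sha256", chain_hex, "cache-id")
    with open(cache_id_file) as f:
        cache_id = f.read().strip()
    return os.path.join(storage_root, cache_id)


# Demo on a temporary tree that imitates layerdb/sha256/<id>/cache-id
with tempfile.TemporaryDirectory() as tmp:
    entry = os.path.join(tmp, "layerdb", "sha256", "abc123")
    os.makedirs(entry)
    with open(os.path.join(entry, "cache-id"), "w") as f:
        f.write("deadbeef")

    storage_path = layer_storage_dir(
        os.path.join(tmp, "layerdb"), "abc123", "/var/lib/docker/overlay2"
    )
    print(storage_path)
```

On a real host, reading these directories normally requires root privileges.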

The container layer

Start a container from the image, mounting the host directory /opt/yunxin to /usr/local/yunxin in the container.

After the container is created, an init layer and a read-write layer are generated in the storage directory /var/lib/docker/overlay2/. Both use the same ID, with -init appended to the name of the init layer. The init layer mainly stores container environment information written when the container is initialized, such as the container host name, hosts information, and the DNS resolver file. The read-write layer handles the container's reads and writes. Processes in a Docker container have write permission only on the read-write layer and read-only access to files in all other layers.

Next, we enter the container, perform a series of operations, and then analyze how the read-write layer stores and handles files based on the results. The operations, their results, and the actual file storage in the read-write layer are as follows.

The merged folder in the read-write layer is the unified view of the union-mounted file system that is presented to the user.

Next, we start several container instances from the same image and query the space used by the Docker containers. Only the first container takes up space (154K), because of the files modified above. The newly started containers take up no extra space. When containers are created from the same image, they all share the content of the image layers, which effectively saves space. The read-write layer stores only changes, because Docker uses a copy-on-write policy when operating on image files. Look back at the two figures in the first section to get a better picture of Docker's file system.
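The sharing behavior can be illustrated with a small model. This is a simplified sketch of the copy-on-write idea, not Docker's implementation: the Container class and the file path are hypothetical, the image is a shared dictionary that is never modified, and each container has its own upper dictionary that plays the role of the read-write layer.

```python
class Container:
    """Toy container: a shared read-only image plus a private read-write layer."""

    def __init__(self, image_files):
        self.image = image_files  # shared by all containers, never modified
        self.upper = {}           # this container's read-write layer

    def read(self, path):
        # The read-write layer takes precedence over the image layers
        if path in self.upper:
            return self.upper[path]
        return self.image[path]

    def append(self, path, text):
        # Copy the file up on first modification, then change the copy
        if path not in self.upper:
            self.upper[path] = self.image[path]
        self.upper[path] += text


image = {"/etc/motd": "hello\n"}
first = Container(image)
second = Container(image)

first.append("/etc/motd", "changed\n")

print(repr(first.read("/etc/motd")))   # 'hello\nchanged\n'
print(repr(second.read("/etc/motd")))  # 'hello\n'
print(len(second.upper))               # 0
```

Only the container that modified the file holds a copy of it, which mirrors the observation that only the first container above uses extra space.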

Conclusion

Knowledge of the Docker image and container file systems provides the theoretical basis for image management and for operations and storage management of Yunxin's private-deployment products, but it is only the beginning of a deeper understanding of Docker. As time goes on and more Yunxin products, including IM, audio and video, VOD, and many related products, are offered for private deployment, we face more modules and images, more customers and requirements, and more complex networks and environments. Docker is the foundation on which Yunxin's private-deployment services are built, and only with a deeper understanding of its principles can we optimize the product and operate it well. We hope to provide users with more reliable Yunxin private-deployment services, and to share more Docker knowledge with you in future articles.



NetEase Yunxin Private Cloud
