This article is reprinted from Liu Yue’s technology blog: v3u.cn/a_id_173

When choosing a base image for a Python development environment, most people reach for Alpine. Why? Because it is tiny: only about 5 MB, compared with the Ubuntu images, which are closer to 100 MB. But we don’t choose a base image just to try out Python syntax. We also need to debug and to install various extensions, possibly with many third-party dependencies. Alpine is not a good choice in that situation, so let’s compare installing and building Python packages on Alpine and on Ubuntu.

First pull Alpine and Ubuntu images respectively:

docker pull ubuntu:18.04
docker pull alpine

After pulling, you can see that there is indeed a significant difference in size:

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
ubuntu              18.04               6526a1858e5d        2 weeks ago         64.2MB
alpine              latest              a24bb4013296        3 months ago        5.57MB

Ubuntu takes up 64MB, while Alpine takes up just 5.57 MB.

If your Python application needs to do scientific computing and plot the results, you’ll need the matplotlib and pandas libraries. Write Dockerfile.ubuntu (note that the python:3.7-slim image is Debian-based; like Ubuntu, it uses glibc):

FROM python:3.7-slim  
RUN pip install --no-cache-dir matplotlib pandas

Then build the image:

docker build -f Dockerfile.ubuntu -t 'ubuntu-mat' .

As you can see, the built image grew from about 60 MB to 263 MB.

liuyue:blog liuyue$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
ubuntu-mat          latest              401f0425ce63        About a minute ago   263MB

The image works without any problems.

Now let’s try Alpine and see whether it has any speed or size advantage over Ubuntu.

Write Dockerfile.alpine:

FROM python:3.7-alpine  
RUN pip install --no-cache-dir matplotlib pandas

Build the image:

docker build -f Dockerfile.alpine -t 'alpine-mat' .

During the build, we hit an error:

liuyue:blog liuyue$ docker build -f Dockerfile.alpine -t 'alpine-mat' .
Sending build context to Docker daemon  112.1kB
Step 1/2 : FROM python:3.7-alpine
3.7-alpine: Pulling from library/python
df20fa9351a1: Pull complete
36b3adc4ff6f: Pull complete
4db9de03f499: Pull complete
cd38a04a61f4: Pull complete
6bbb0c43b470: Pull complete
Digest: sha256:d1375bf0b889822c603622dc137b24fb7064e6c1863de8cc4262b61901ce4390
Status: Downloaded newer image for python:3.7-alpine
 ---> 078114edb6be
Step 2/2 : RUN pip install --no-cache-dir matplotlib pandas
 ---> Running in 6d3c44420e5c
Collecting matplotlib
  Downloading matplotlib-3.3.1.tar.gz (38.8 MB)
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-40p0g06u/matplotlib/setup.py'"'"'; __file__='"'"'/tmp/pip-install-40p0g06u/matplotlib/setup.py'"'"'; f=getattr(tokenize, '"'"'open'"'"', open)(__file__); code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"'); f.close(); exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-zk64hzam
         cwd: /tmp/pip-install-40p0g06u/matplotlib/

How did this happen? If you look closely at the Ubuntu-based build above, you will see that it downloads the third-party library as matplotlib-3.1.2-cp38-cp38-manylinux1_x86_64.whl, a precompiled binary package (a wheel). On Alpine, pip can only download the source archive (matplotlib-3.1.2.tar.gz). This is Alpine’s fatal problem: standard Linux wheels don’t work on Alpine Linux at all.
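To make the difference between the two downloads concrete, here is a minimal sketch that splits a wheel filename into its tags. The helper names `parse_wheel_filename` and `is_source_distribution` are made up for illustration, and the parser assumes the simple five-part layout without an optional build tag.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated components.

    Simplified for illustration: assumes no optional build tag is present.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    return WheelTags(*parts)


def is_source_distribution(filename: str) -> bool:
    """Source archives have to be built locally, including any C code."""
    return filename.endswith((".tar.gz", ".zip"))


wheel = parse_wheel_filename("matplotlib-3.1.2-cp38-cp38-manylinux1_x86_64.whl")
print(wheel.platform_tag)                                 # manylinux1_x86_64
print(wheel.platform_tag.startswith("manylinux"))         # True
print(is_source_distribution("matplotlib-3.1.2.tar.gz"))  # True
```

The `manylinux` platform tag is what marks the wheel as built for glibc-based distributions; the `.tar.gz` file carries no such tag because it contains only source.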

Most Linux distributions use the GNU implementation of the standard C library (glibc), which is required by almost everything written in C, including Python. Alpine Linux uses musl instead, and those binary wheels are compiled against glibc, so pip on Alpine cannot use them. Most Python packages now publish binary wheels on PyPI, which greatly speeds up installation. But if you’re using Alpine Linux, you have to compile the C code in every Python package you use.
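If you are not sure which C library a container's Python is running against, the standard library can give a hint. The sketch below uses `platform.libc_ver()`; the helper `manylinux_wheels_usable` is a hypothetical name, and treating "anything other than glibc" as incompatible is a deliberate simplification.

```python
import platform


def manylinux_wheels_usable(libc_name: str) -> bool:
    """manylinux wheels are built against glibc, so only glibc qualifies.

    Simplification: an empty or unknown name is treated as not usable.
    """
    return libc_name == "glibc"


# libc_ver() returns a (name, version) pair; on systems where glibc
# cannot be detected, the name is expected to be empty or different.
libc_name, libc_version = platform.libc_ver()
print(f"detected libc: {libc_name or 'unknown'} {libc_version}")
print("manylinux wheels usable:", manylinux_wheels_usable(libc_name))
```

Running this in both images with `docker run` is a quick way to see why pip picks a wheel in one and falls back to the source archive in the other.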

This means you have to work out each package’s system library dependencies yourself. Update Dockerfile.alpine:

FROM python:3.7-alpine  
RUN apk --update add gcc build-base freetype-dev libpng-dev openblas-dev  
RUN pip install --no-cache-dir matplotlib pandas

Rebuild:

docker build -f Dockerfile.alpine -t 'alpine-mat' .

The build takes a long time, about half an hour, because compiling and installing from source is much slower than installing a prebuilt package. Now look at the resulting image:

REPOSITORY                  TAG                   IMAGE ID            CREATED              SIZE  
alpine-mat                  latest                601f0425ce63        About a minute ago   873MB

At 873 MB, Alpine’s proud claim to being small and light is gone.
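To put the two final images side by side, a short sketch can parse the SIZE column from `docker images` and compute the ratio. The function name `parse_docker_size` is hypothetical, and only the kB, MB and GB suffixes are handled.

```python
def parse_docker_size(size: str) -> float:
    """Convert a size string such as '263MB' to megabytes."""
    units = {"kB": 1e-3, "MB": 1.0, "GB": 1e3}
    for suffix, factor in units.items():
        if size.endswith(suffix):
            return float(size[: -len(suffix)]) * factor
    raise ValueError(f"unrecognized size: {size}")


# Sizes reported by `docker images` for the two builds in this article.
sizes = {"ubuntu-mat": "263MB", "alpine-mat": "873MB"}

ratio = parse_docker_size(sizes["alpine-mat"]) / parse_docker_size(sizes["ubuntu-mat"])
print(f"alpine-mat is {ratio:.2f}x the size of ubuntu-mat")  # 3.32x
```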

While Alpine’s musl C library is technically mostly compatible with the glibc used by other Linux distributions, in practice the differences can cause problems. When these problems do occur, they are not easy to fix. Alpine’s default thread stack size is small, for example, which can cause Python to crash, and musl-related differences can also slow Python applications down.
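One possible mitigation for the small thread stack is to request a larger one from Python itself before starting threads. The following is only a sketch: the 8 MiB value and the helper names are assumptions, and whether this avoids a particular crash depends on the workload.

```python
import sys
import threading

# Assumed value for illustration; tune it for the actual workload.
STACK_SIZE_BYTES = 8 * 1024 * 1024


def recursion_depth(n: int) -> int:
    """Recurse n levels deep and report how deep we got."""
    return 0 if n == 0 else 1 + recursion_depth(n - 1)


def run_in_thread(depth: int) -> int:
    """Run the recursive function in a new thread and return its result."""
    result = []
    thread = threading.Thread(target=lambda: result.append(recursion_depth(depth)))
    thread.start()
    thread.join()
    return result[0]


sys.setrecursionlimit(10_000)
# stack_size() only affects threads created after this call.
threading.stack_size(STACK_SIZE_BYTES)
print(run_in_thread(2_000))  # 2000
```

`threading.stack_size()` called without arguments returns the currently configured value, which is handy for checking that the setting took effect.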

Conclusion: in a local environment, if you just want to “play around,” there’s nothing wrong with Alpine as a base image. But if you want to deploy a Python application to production, especially a distributed system that requires repeated builds, the more established Ubuntu is the more sensible choice.
