When you’re building a Docker image for your Python application, you’re building on top of an existing image, and there are many possible choices. There are OS images like Ubuntu, and there are the many different variants of the python base image.
Which one should you use? Which one is better? There are many choices, and it may not be obvious which is the best for your situation.
So to help you make a choice that fits your needs, in this article I’ll go through some of the relevant criteria, and suggest some reasonable defaults that will work for most people.
What do you want from a base image?
There are a number of common criteria for choosing a base image, though your particular situation might emphasize, add, or remove some of these:
- Stability: You want a build today to give you the same basic set of libraries, directory structure, and infrastructure as a build tomorrow, otherwise your application will randomly break.
- Security updates: You want the base image to be well-maintained, so that you get security updates for the base operating system in a timely manner.
- Up-to-date dependencies: Unless you’re building a very simple application, you will likely depend on operating system-installed libraries and applications (e.g. a compiler). You’d like them not to be too old.
- Extensive dependencies: For some applications less popular dependencies may be required—a base image with access to a large number of libraries makes this easier.
- Up-to-date Python: While this can be worked around by installing Python yourself, having an up-to-date Python available saves you some effort.
- Small images: All things being equal, it’s better to have a smaller Docker image than a bigger Docker image.
The need for stability suggests not using operating systems with limited support lifetime, like Fedora or non-LTS Ubuntu releases.
Why you shouldn’t use Alpine Linux
A common suggestion for people who want small images is to use Alpine Linux, but that can lead to longer build times, obscure bugs, and performance issues.
You can see the linked article for details, but I recommend against using Alpine.
Option #1: Ubuntu LTS, RedHat Universal Base Image, Debian
There are three major operating systems that roughly meet the above criteria (dates and release versions are accurate at time of writing; the passage of time may require slightly different choices).
- Ubuntu 20.04 was released in April 2020, and since it’s a Long Term Support release it will get security updates until 2025. It’s usable in Docker via the ubuntu:20.04 image.
- RedHat Enterprise Linux 8 was released in May 2019, and will have full updates until 2024 and maintenance updates until 2029. The RedHat Universal Base Image allows you to use it as a Docker base image.
- Debian 10 (“Buster”) was released in July 2019, and will be supported until 2024. It’s usable in Docker via the debian:10 image.
Previous versions of this article covered CentOS, but CentOS is no longer a long-term stable operating system.
None of these operating systems includes the latest version of Python, Python 3.9, so you’ll have to install it yourself.
Option #2: The Python Docker image
Another alternative is Docker’s own “official” python image, which is available with different versions of Python pre-installed (3.7, 3.8, 3.9, etc.), and has multiple variants:
- Alpine Linux, which as I explained above I don’t recommend using.
- Debian Buster, with many common packages installed. The image itself is large, but the theory is that these packages are installed via common image layers that other official Docker images will use, so overall disk usage will be low.
- Debian Buster slim variant. This lacks the common packages’ layers, and so the image itself is much smaller, but if you use many other Docker images based off Buster the overall disk usage will be somewhat higher.
For comparison purposes, the download size of python:3.9-slim-buster is 41MB, and python:3.9-alpine is 16MB. Their uncompressed on-disk sizes are 114MB and 44MB respectively.
So which should you use?
If you’re a RedHat shop, you’ll want to use their image.
Otherwise, as of January 2021 ubuntu:20.04 has the most up-to-date system packages. In practice, Debian’s packages won’t make much of a difference to most users, and the Debian-based official Python Docker images also give you the full range of Python releases. The base OS of Ubuntu 20.04 includes Python 3.8, but 3.9 is available via focal-updates packages so it is installable. However, as of early February 2021 it was on version 3.9.0, not 3.9.1, and getting pip installed was a pain.
The official Docker Python image in its slim variant (e.g. python:3.9-slim-buster) is a good base image for most use cases. It’s 41MB to download, 114MB when uncompressed to disk, it gives you the latest Python releases, it’s easy to use and it’s got all the benefits of Debian Buster.
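As a rough illustration of what building on the slim image might look like, here is a minimal Dockerfile sketch. The /app directory, requirements.txt, and main.py are hypothetical placeholders for your own project layout, not something prescribed by the image.

```dockerfile
# Start from the slim Debian Buster variant of the official Python image.
FROM python:3.9-slim-buster

# Hypothetical application directory inside the container.
WORKDIR /app

# Install dependencies first so this layer is cached between code changes.
# requirements.txt is assumed to exist in your project.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the (hypothetical) application code.
COPY . .

# main.py is a placeholder for your application's entry point.
CMD ["python", "main.py"]
```

Copying requirements.txt before the rest of the code is a common layering choice: dependency installation is only re-run when the requirements change.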
If you care about performance, you’ll want to use ubuntu:20.04. Having run some benchmarks comparing multiple Python builds, it turns out that switching to Ubuntu 20.04 can give you a 20% performance boost. As such, if Python performance matters to you, I would recommend using the ubuntu:20.04 image. It’s a bit more annoying to set up, and for some reason gets updates less often, but it will be faster.
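To give a sense of the extra setup involved, here is one possible sketch of a Dockerfile based on ubuntu:20.04 that uses the distribution’s default Python 3 packages. It assumes the system Python is acceptable for your application; the /venv and /app paths, requirements.txt, and main.py are hypothetical placeholders.

```dockerfile
FROM ubuntu:20.04

# Suppress interactive apt prompts during the build only.
ARG DEBIAN_FRONTEND=noninteractive

# Install the distribution's Python 3 and venv support,
# then clear the apt package lists to keep the layer smaller.
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-venv \
    && rm -rf /var/lib/apt/lists/*

# Hypothetical application directory.
WORKDIR /app

# Create a virtualenv (assumed location: /venv) and install dependencies into it.
COPY requirements.txt .
RUN python3 -m venv /venv \
    && /venv/bin/pip install --no-cache-dir -r requirements.txt

# Copy the (hypothetical) application code and run it with the venv's Python.
COPY . .
CMD ["/venv/bin/python", "main.py"]
```

Using a virtualenv here is a design choice rather than a requirement; it keeps your application’s packages separate from the ones Ubuntu’s own tools depend on.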