Docker Image Cleaner

A Python package (docker-image-cleaner) and associated Docker image (quay.io/jupyterhub/docker-image-cleaner) to clean up old docker images when a disk is running low on inodes or space.

The script was initially developed to help BinderHub installations free up disk space on nodes, which can otherwise run out of space and become unable to build new docker images.

Why?

Container images are one of the biggest consumers of disk space and inodes on kubernetes nodes. Kubernetes tries to make sure there is enough disk space on each node by garbage collecting unused container images and containers. Tuning this is important for binderhub installations, as many images are built and used only a couple of times. However, on most managed kubernetes installations (like GKE, EKS, etc.), we cannot tune these parameters!

This script approximates the relevant parts of the kubernetes container image garbage collection in a configurable way.

Requirements

Only kubernetes nodes using the docker runtime are supported. containerd or cri-o container backends are not supported.
The script expects to run in a kubernetes DaemonSet, with /var/lib/docker from the node mounted inside the container. This lets the script figure out how much disk space docker container images are actually using.
The DaemonSet should have a ServiceAccount attached that has permissions to talk to the kubernetes API and cordon / uncordon nodes. This makes sure new pods are not scheduled on to the node while image cleaning is happening, as it can take a while.
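
As a rough illustration of what cordoning involves, the sketch below builds the patch that marks a node unschedulable. It assumes the official `kubernetes` Python client is installed and that the pod runs with the ServiceAccount described above; the function names `cordon_patch` and `set_cordon` are hypothetical and are not taken from the package itself.

```python
def cordon_patch(unschedulable: bool) -> dict:
    """Build the node patch body: cordoning sets spec.unschedulable to True."""
    return {"spec": {"unschedulable": unschedulable}}


def set_cordon(node_name: str, unschedulable: bool) -> None:
    """Cordon (True) or uncordon (False) a node through the kubernetes API.

    Assumption: the third-party `kubernetes` client is available. It is
    imported lazily so that the rest of this sketch runs without it.
    """
    from kubernetes import client, config

    # Use the ServiceAccount credentials mounted into the pod.
    config.load_incluster_config()
    client.CoreV1Api().patch_node(node_name, cordon_patch(unschedulable))
```

Calling `set_cordon(node_name, True)` before cleaning and `set_cordon(node_name, False)` afterwards would require the ServiceAccount to be allowed to patch nodes.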

How does it work?

Compute how much space the /var/lib/docker directory (the path is set by the DOCKER_IMAGE_CLEANER_PATH_TO_CHECK environment variable) is taking up.
If the disk space used is greater than the garbage collection trigger threshold (specified by DOCKER_IMAGE_CLEANER_THRESHOLD_HIGH), garbage collection is triggered. If not, the script just waits another 5 minutes (set by DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS).
If garbage collection is triggered, the kubernetes node is first cordoned to prevent any new pods from being scheduled on it for the duration of the garbage collection.
Stopped containers are removed via docker container prune.
Dangling images are removed via docker image prune.
If no dangling images are found to prune, all unused images are pruned (docker image prune -a).
After the garbage collection is done, the kubernetes node is uncordoned again.
When done, we wait another 5 minutes (set by DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS), and repeat the whole process.
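
The steps above can be sketched as a single pass of the loop. This is a minimal sketch under stated assumptions, not the package's actual code: `used_percent` measures usage with `shutil.disk_usage`, which may differ from how the script computes it, and the cordon and prune actions are passed in as hypothetical callables so the control flow can be read (and tested) without docker or kubernetes.

```python
import shutil
from typing import Callable


def used_percent(path: str) -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def gc_pass(
    used: float,
    threshold_high: float,
    cordon: Callable[[], None],
    prune_containers: Callable[[], None],
    prune_dangling_images: Callable[[], int],
    prune_unused_images: Callable[[], int],
    uncordon: Callable[[], None],
) -> bool:
    """Run one check; return True if garbage collection was triggered."""
    if used <= threshold_high:
        return False  # below the threshold: the caller just sleeps and retries

    cordon()
    try:
        prune_containers()
        # Fall back to pruning all unused images only when no dangling
        # images were removed.
        if prune_dangling_images() == 0:
            prune_unused_images()
    finally:
        # Assumption of this sketch: always uncordon, even if pruning fails.
        uncordon()
    return True
```

A caller would invoke `gc_pass(used_percent(path), threshold, ...)` and then sleep for the configured interval before repeating. The `try`/`finally` is a design choice of this sketch, so that a failed prune does not leave the node cordoned.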

Configuration options

Currently, configuration is set through environment variables.

Env variable	Description	Default
`DOCKER_IMAGE_CLEANER_NODE_NAME`	The k8s node where the docker image cleaner is running, so it can be cordoned via the k8s api
`DOCKER_IMAGE_CLEANER_PATH_TO_CHECK`	Path to `/var/lib/docker` directory used by the docker daemon	`/var/lib/docker`
`DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS`	Amount of time (in seconds) to wait between checking if GC needs to be triggered	`300`
`DOCKER_IMAGE_CLEANER_DELAY_SECONDS`	Amount of time (in seconds) to wait between deleting container images, so we don't DOS the docker API	`1`
`DOCKER_IMAGE_CLEANER_THRESHOLD_TYPE`	Determine if GC should be triggered based on relative or absolute disk usage	`relative`
`DOCKER_IMAGE_CLEANER_THRESHOLD_HIGH`	% or absolute amount of disk space used (based on `DOCKER_IMAGE_CLEANER_THRESHOLD_TYPE`) at which we start deleting container images	`80`
`DOCKER_IMAGE_CLEANER_TIMEOUT_SECONDS`	Request timeout (in seconds) for docker API requests. Pruning images often takes minutes.	`300`
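
To make the table concrete, here is one way the variables and their defaults could be read in Python. This is an illustrative sketch only: the `CleanerConfig` class and `load_config` function are hypothetical names, and the package may parse or validate these values differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "DOCKER_IMAGE_CLEANER_"


@dataclass(frozen=True)
class CleanerConfig:
    node_name: Optional[str]  # no default listed in the table
    path_to_check: str
    interval_seconds: float
    delay_seconds: float
    threshold_type: str
    threshold_high: float
    timeout_seconds: float


def load_config(env: Mapping[str, str] = os.environ) -> CleanerConfig:
    """Read the settings from `env`, falling back to the documented defaults."""

    def get(name: str, default: str) -> str:
        return env.get(PREFIX + name, default)

    return CleanerConfig(
        node_name=env.get(PREFIX + "NODE_NAME"),
        path_to_check=get("PATH_TO_CHECK", "/var/lib/docker"),
        interval_seconds=float(get("INTERVAL_SECONDS", "300")),
        delay_seconds=float(get("DELAY_SECONDS", "1")),
        threshold_type=get("THRESHOLD_TYPE", "relative"),
        threshold_high=float(get("THRESHOLD_HIGH", "80")),
        timeout_seconds=float(get("TIMEOUT_SECONDS", "300")),
    )
```

Passing a plain dictionary instead of `os.environ`, for example `load_config({"DOCKER_IMAGE_CLEANER_THRESHOLD_HIGH": "90"})`, makes it easy to check how an override combines with the defaults.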
