A daemon that performs health checks on containers running in containerd. It allows you to register asynchronous health checks for a set of container tasks, and provides a restart mechanism in case of failure.
containerd is an abstraction over kernel features that provides a relatively high-level container interface with a limited feature set, and it typically performs the low-level tasks involved in running a container. Because it is small and focused, containerd does not support health checks the way higher-level components such as Docker and Kubernetes do. This project addresses that gap by providing a simple way to monitor containers running in containerd: you specify a health check for each container task, and the daemon applies restart logic when a check fails.
On Linux, you can install it using go get:
go get github.com/vouch-opensource/containerd-healthcheck
Alternatively, you can run it directly by using a Docker image:
docker run -v /run/containerd/containerd.sock:/run/containerd/containerd.sock vouchio/containerd-healthcheck
You can also download the binary manually from the GitHub releases page.
Usage of ./containerd-healthcheck:
-a, --addr string HTTP address for prometheus endpoint (default ":9434")
-c, --config string Path to configuration file (default "config.yml")
-v, --version Print app version
config.yml is the required configuration file for containerd-healthcheck. The daemon reads config.yml from the current working directory, or from the path given with the --config option.
Example
containerd:
  socket: /run/containerd/containerd.sock
  namespace: default
checks:
  - container_task: example-api
    http:
      url: 127.0.0.1:8080/health
      method: GET
      expected_body: "OK"
      expected_status: 200
    execution_period: 2
    initial_delay: 2
    threshold: 3
    timeout: 5
The containerd section specifies the information required to establish a connection with containerd:
socket: path of the containerd socket used to establish a connection between the client and containerd over gRPC
namespace: containerd namespace in which the container resources (tasks, images, snapshots) are located
The checks section is a list of health checks to be performed by containerd-healthcheck. Each health check is configured through the following fields:
container_task: task name of the container, usually the same as that of the container
execution_period: interval between check executions (in seconds)
initial_delay: time to wait before the first check execution (in seconds)
restart_delay: time to sleep after restarting a task (in seconds)
threshold: number of consecutive check failures required before a target is considered unhealthy and marked for restart
timeout: timeout used for the request (in seconds)
http.url: URL to be called by the check
http.method: HTTP method
http.expected_body: operates as a basic "body should contain <string>" test
http.expected_status: expected response status code
By default, the containerd-healthcheck daemon serves a Prometheus HTTP endpoint with the built-in metrics provided by the go-sundheit library, plus the total number of restarts per task running on containerd. Once the daemon is running, the metrics are available at 127.0.0.1:9434/metrics. A custom address can be set with the --addr argument.