This repository contains the configuration for my new homelab and some of my VPS hosts.
Technologies used:
- NixOS for the declarative OS and service configuration
- deploy-rs to deploy my NixOS configuration
- Terraform for the automated DNS and OVH configuration
- Backblaze for the cheap object storage
- Ansible for imperative configuration management with tasks, roles, etc.
- OKD, an OpenShift (Kubernetes) distribution without license requirements
- oVirt and oVirt Node on my single server to host my Kubernetes deployment
TODO: https://plantuml.com/nwdiag
- Create a VPS at $HOSTING_PROVIDER
- `nixos-infect` the VPS (see the bootstrap sketch below)
- Enter the shell with `nix run .#terraform-fhs` to get access to all the required variables and to be able to use the `terraform-provider-b2`.
- Get an API token from OVH and update secrets/ovh.yaml
- Get an API token from Cloudflare and update secrets/cloudflare.yaml
- Get an API token from Backblaze and update secrets/backblaze.yaml
- Create a Backblaze bucket and application key for that bucket for the Terraform state and update secrets/terraform-backend.yaml
All required secret keys (but not their values) are visible in the appropriate SOPS files under `secrets/`.
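A minimal bootstrap sketch, assuming a fresh VPS reachable at a placeholder IP and the upstream nixos-infect script (the NixOS channel is an assumption):

```bash
# Assumption: the new VPS is reachable as root@203.0.113.10 (placeholder IP).
# nixos-infect converts the provider's distro to NixOS in place.
ssh root@203.0.113.10 \
  'curl -L https://raw.githubusercontent.com/elitak/nixos-infect/master/nixos-infect \
     | NIX_CHANNEL=nixos-unstable bash -x'

# Back on the local machine, enter the FHS shell from this flake so the
# Terraform B2 provider and the required environment variables are available.
nix run .#terraform-fhs
```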
Initialize Terraform:

```bash
cd terraform && terraform init
```
Secrets are edited and consumed through SOPS:

```bash
$ sops -i secrets/cloudflare.yaml
# edit stuff in $EDITOR, then :wq — the file is encrypted inline

$ sops exec-env secrets/some-file bash
bash-4.4$
```
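For example, Terraform can be invoked with the state-backend credentials injected by SOPS; the variable names inside `secrets/terraform-backend.yaml` and the relative path are assumptions:

```bash
# Run from within the FHS shell so the B2 provider works.
cd terraform
sops exec-env ../secrets/terraform-backend.yaml 'terraform plan'
```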
The Terraform state is managed outside of the repository, in a B2 bucket.
Terraform needs to be run from the FHS provided by the flake's default package, because the Backblaze B2 provider extracts a binary embedded in its own binary and the paths need to be `patchelf`'d. `nix-shell` will drop you into the FHS with the required packages for the Terraform B2 plugin to work.
Documentation about capabilities: https://www.backblaze.com/b2/docs/application_keys.html
Retention settings for the dovecot email bucket: 30 days
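As a sketch, the 30-day retention corresponds to a B2 lifecycle rule like the one below; the bucket name is hypothetical and the exact CLI flag is an assumption (the buckets are normally managed through Terraform in this repo):

```bash
# Assumption: a bucket named "dovecot-mail" and the b2 CLI's update-bucket command.
# daysFromHidingToDeleting=30 deletes hidden (superseded) file versions after 30 days.
b2 update-bucket --lifecycleRules '[{
  "fileNamePrefix": "",
  "daysFromHidingToDeleting": 30,
  "daysFromUploadingToHiding": null
}]' dovecot-mail allPrivate
```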
Since I have a single server with 56 cores (vCPU) and 200 GB of RAM, I deploy my Kubernetes cluster using OpenShift's free OKD distribution on a single oVirt Node host.
In order to deploy OKD, the oVirt node has to be set up alongside a storage device. Afterwards, HostedEngine can be deployed in oVirt.
- Make sure to generate an SSH private key (with no password) for the root user.
- Then, as the root user on the oVirt node, SSH into the node itself (`ssh root@localhost`). This adds the oVirt node to the SSH `known_hosts` file.
- Create a RAID device (see the sketch after this list).
- Set up GlusterFS through Cockpit’s UI to use the RAID device.
- Add a brick for the Kubernetes cluster with at least 500GB of space.
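A minimal sketch of the RAID step, assuming two spare disks at hypothetical paths and a simple RAID 1 mirror (adjust the level and devices to the actual hardware):

```bash
# Assumption: /dev/sdb and /dev/sdc are the spare disks (placeholder names).
# Create the array that GlusterFS will later use as its backing device.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

# Persist the array configuration so it assembles on boot.
mdadm --detail --scan >> /etc/mdadm.conf
```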
HostedEngine needs to be deployed and should use the configured GlusterFS storage.
- Deploy a hyperconverged HostedEngine VM through Cockpit’s UI
- Add a new user for the OCP deployment:
  - Go to the Keycloak admin interface
  - Create a user named `kubernetes@ovirt`
  - Set this user’s password and save it
  - Update the password secret in the `install-config.yaml` configuration
- Connect as the `kubernetes@ovirt` user on the oVirt Portal
- Add the following permissions to the `kubernetes@ovirt` user under Administration > Users:
  - ClusterAdmin
  - DiskCreator
  - DiskOperator
  - TemplateCreator
  - TemplateOwner
  - UserTemplateBasedVm
The steps to deploy OKD are the following:
- Configure a DHCP server to allocate IP addresses for the nodes
- Configure DNS entries
- Install the `openshift-install` CLI
- Generate the install configuration and manifests
- Patch the generated manifests
- Create the cluster with the patched manifests
- Configuring OKD DNS entries

OKD requires the following DNS entries during the bootstrap phase and after:
  - `*.apps.{cluster}.baseDomain`: points to the HAProxy LoadBalancer IP
  - `api-internal.{cluster}.baseDomain`: points to a virtual IP for the API server
  - `api.{cluster}.baseDomain`: ditto
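A sketch of these records using dnsmasq, which can double as the DHCP server mentioned above; the cluster name `okd`, the domain `example.com`, and the IPs are hypothetical:

```bash
# address=/…/ matches the name and every subdomain, covering *.apps.okd.example.com.
cat <<'EOF' >> /etc/dnsmasq.d/okd.conf
# Wildcard apps record -> HAProxy LoadBalancer IP
address=/apps.okd.example.com/192.168.1.20
# API records -> API virtual IP
host-record=api.okd.example.com,192.168.1.21
host-record=api-internal.okd.example.com,192.168.1.21
EOF
```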
- Installing the `openshift-install` CLI

In order to deploy OKD, the `openshift-install` CLI needs to be fetched from the official repository and unpacked. The CLI is also available as a Nix derivation in the `flake.nix`; it is automatically available when using direnv.
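A sketch of the manual fetch for a Linux host; the release tag below is a placeholder and should be replaced with the latest tag from the OKD releases page:

```bash
# Assumption: VERSION is a placeholder OKD release tag.
VERSION="4.12.0-0.okd-2023-03-18-084815"
curl -LO "https://github.com/okd-project/okd/releases/download/${VERSION}/openshift-install-linux-${VERSION}.tar.gz"
tar -xzf "openshift-install-linux-${VERSION}.tar.gz" openshift-install
./openshift-install version
```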
- Generating the install configuration and manifests
  - Generate the base configuration with `openshift-install create install-config --dir .`
  - Fetch the latest Tigera manifests from here and add them to a folder named `calico`. The script below provides an automated way of creating the `kustomization.yaml` for the Calico/Tigera manifests:

```bash
$ mkdir -p calico
# Copy the block of code and run it through this
$ wl-paste | awk 'gsub(/manifests/, "calico", $4)' > script.sh
$ cat script.sh
$ bash script.sh
$ resources="$(find calico -type f -printf '%f\0' | sort -z | xargs -r0 printf '- ./%s\n')"
$ cat <<EOF >calico/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
commonAnnotations:
  qt.rs/installer-dir: manifests
resources:
$resources
EOF
```
  - Generate the OpenShift manifests with `openshift-install create manifests --dir .` This may consume the `install-config.yaml` file.
  - Generate a `kustomization.yaml` file for the manifests in `manifests` and `openshift` (a sketch follows below).
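One way to generate those files is to reuse the same resource-listing approach as the Calico script above; kustomization options beyond the plain resource list are omitted here:

```bash
# Sketch: write a kustomization.yaml listing every generated manifest.
for dir in manifests openshift; do
  resources="$(find "$dir" -type f ! -name 'kustomization.yaml' -printf '%f\0' | sort -z | xargs -r0 printf '- ./%s\n')"
  cat <<EOF >"$dir/kustomization.yaml"
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
$resources
EOF
done
```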
- Creating the cluster
  - Generate the final resources

```bash
$ mkdir -p bootstrap/install-dir
$ kustomize build --enable-alpha-plugins bootstrap | ./slice.py -o bootstrap/install-dir
```

Make sure the file `manifests/cluster-config.yaml` exists.
  - Begin the installation

Make sure to delete the file `install-config.yaml` in the installation directory, or to move it out of the `install-dir` folder. The hidden file `.openshift_install_state.json` MUST exist, otherwise the installer will not use ANY generated manifests.

The installation directory should look like this:

```
install-dir
├── .openshift_install_state.json
├── manifests
│   ├── 00-namespace-tigera-operator.yaml
│   ├── 01-cr-apiserver.yaml
│   ├── 01-crd-apiserver.yaml
│   ├── 01-crd-imageset.yaml
│   ├── 01-crd-installation.yaml
│   ├── 01-crd-tigerastatus.yaml
│   ├── 01-cr-installation.yaml
│   ├── 02-configmap-calico-resources.yaml
│   ├── 02-rolebinding-tigera-operator.yaml
│   ├── 02-role-tigera-operator.yaml
│   ├── 02-serviceaccount-tigera-operator.yaml
│   ├── 02-tigera-operator.yaml
│   ├── 04-openshift-machine-config-operator.yaml
│   ├── cluster-config.yaml
│   ├── cluster-dns-02-config.yml
│   ├── cluster-infrastructure-02-config.yml
│   ├── cluster-ingress-02-config.yml
│   ├── cluster-network-01-crd.yml
│   ├── cluster-network-02-config.yml
│   ├── cluster-proxy-01-config.yaml
│   ├── cluster-scheduler-02-config.yml
│   ├── configmap-root-ca.yaml
│   ├── crd.projectcalico.org_bgpconfigurations.yaml
│   ├── crd.projectcalico.org_bgppeers.yaml
│   ├── crd.projectcalico.org_blockaffinities.yaml
│   ├── crd.projectcalico.org_caliconodestatuses.yaml
│   ├── crd.projectcalico.org_clusterinformations.yaml
│   ├── crd.projectcalico.org_felixconfigurations.yaml
│   ├── crd.projectcalico.org_globalnetworkpolicies.yaml
│   ├── crd.projectcalico.org_globalnetworksets.yaml
│   ├── crd.projectcalico.org_hostendpoints.yaml
│   ├── crd.projectcalico.org_ipamblocks.yaml
│   ├── crd.projectcalico.org_ipamconfigs.yaml
│   ├── crd.projectcalico.org_ipamhandles.yaml
│   ├── crd.projectcalico.org_ippools.yaml
│   ├── crd.projectcalico.org_ipreservations.yaml
│   ├── crd.projectcalico.org_kubecontrollersconfigurations.yaml
│   ├── crd.projectcalico.org_networkpolicies.yaml
│   ├── crd.projectcalico.org_networksets.yaml
│   ├── cvo-overrides.yaml
│   ├── kube-cloud-config.yaml
│   ├── openshift-kubevirt-infra-namespace.yaml
│   ├── secret-machine-config-server-tls.yaml
│   └── secret-pull-secret.yaml
└── openshift
    ├── 99_openshift-cluster-api_master-machines-0.yaml
    ├── 99_openshift-cluster-api_master-machines-1.yaml
    ├── 99_openshift-cluster-api_master-machines-2.yaml
    ├── 99_openshift-cluster-api_worker-machineset-0.yaml
    ├── 99_openshift-machineconfig_99-master-ssh.yaml
    ├── 99_openshift-machineconfig_99-worker-ssh.yaml
    ├── 99_role-cloud-creds-secret-reader.yaml
    ├── openshift-install-manifests.yaml
    ├── secret-kubeadmin.yaml
    ├── secret-master-user-data.yaml
    ├── secret-ovirt-credentials.yaml
    └── secret-worker-user-data.yaml
```

Then start the installation:

```
$ openshift-install create cluster --dir install-dir --log-level=debug
DEBUG .....
INFO Consuming Install Config from target directory
```
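Once the installer is running, progress can also be followed with the standard wait-for targets (a sketch):

```bash
# Wait for the bootstrap phase, then for the full installation, to complete.
openshift-install wait-for bootstrap-complete --dir install-dir --log-level=debug
openshift-install wait-for install-complete --dir install-dir
```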
- Customizing HAProxy error code response pages
- Enabling HTTP Strict Transport Security per-route
- Creating a route through an Ingress object
- Installing a specific version of an Operator
Secrets in the Kubernetes manifests require the kustomize-sops plugin, which is automatically exposed in the flake shell.
To encrypt a secret: `sops -i -e k8s/something/overlays/prod/secrets/some-secret`, for instance.
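As a sketch, a kustomize-sops (ksops) generator tying an encrypted secret into an overlay could look like this; the file layout and the `viaduct.ai/v1` generator kind follow the upstream kustomize-sops README and are assumptions about this repo:

```bash
# Hypothetical overlay: k8s/something/overlays/prod/
cat <<'EOF' > k8s/something/overlays/prod/secret-generator.yaml
apiVersion: viaduct.ai/v1
kind: ksops
metadata:
  name: secret-generator
files:
  - ./secrets/some-secret
EOF

# The generator is then referenced from the overlay's kustomization.yaml:
#   generators:
#     - ./secret-generator.yaml
```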
- Technologies
  - Kustomize to scaffold, modify and apply patches on top of external resources
  - cert-manager to manage certificates
  - MetalLB to manage IP allocation in the cluster
  - Calico as the CNI
  - ExternalDNS to expose DNS records to Cloudflare
  - OKD provides the monitoring stack and a CSI driver with the oVirt deployment
- Kustomize

```bash
kustomize build --enable-alpha-plugins something/overlays/prod
kustomize build --enable-alpha-plugins something/overlays/prod | kubectl apply -f -
```
- Deployment (see the sketch after this list)
  - Deploy MetalLB
  - Deploy external-dns
  - Deploy cert-manager
  - Deploy Traefik (has a dependency on cert-manager and MetalLB)
  - Deploy the CSI driver
  - Deploy the remaining resources
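A minimal sketch of that order, assuming each component lives under `k8s/<component>/overlays/prod` (the directory layout is an assumption):

```bash
# Apply each component's prod overlay in dependency order.
for component in metallb external-dns cert-manager traefik csi; do
  kustomize build --enable-alpha-plugins "k8s/${component}/overlays/prod" | kubectl apply -f -
done
```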
- ExternalDNS

ExternalDNS automatically inserts CNAME entries pointing to `k8s.qt.rs` for each Ingress defined and annotated with `external-dns.alpha.kubernetes.io/target: k8s.qt.rs`.

While I could use a generic `*` CNAME entry that points to `k8s.qt.rs`, I prefer having unresolvable domains. Also, when my ISP properly supports IPv6, I will add `AAAA` records, and `CNAME` records for the IPv4 scenario. Domains that only allow access by administrators (myself) are gated behind an OAuth middleware in Traefik.
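For illustration, an annotated Ingress might look like the following sketch; the host name and backing service are hypothetical, while the annotation is the one mentioned above:

```bash
# Hypothetical Ingress: ExternalDNS will create grafana.qt.rs -> CNAME k8s.qt.rs.
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  annotations:
    external-dns.alpha.kubernetes.io/target: k8s.qt.rs
spec:
  rules:
    - host: grafana.qt.rs
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: 3000
EOF
```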
- CSI
  - Setup

Follow the democratic-csi documentation here: https://github.com/democratic-csi/democratic-csi

TL;DR:
  - Get an API key for a user with enough privileges (root, for instance)
  - Configure iSCSI in the TrueNAS interface
  - Set the iSCSI authentication method to CHAP or Mutual-CHAP
  - Set the username/password combination used in the previous step in the `node-stage-secret.yaml` file
  - Set the API key from the first step in the `driver-config-iscsi.yaml` file
  - Deploy

```bash
$ kubectl kustomize --enable-alpha-plugins ./overlays/prod | kubectl create -f- --save-config
```
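To verify the driver after deployment, a throwaway PVC can be created; the StorageClass name below is hypothetical and should match whatever the democratic-csi configuration defines:

```bash
# Assumption: a StorageClass named "truenas-iscsi" is provided by democratic-csi.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-smoke-test
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: truenas-iscsi
  resources:
    requests:
      storage: 1Gi
EOF

kubectl get pvc csi-smoke-test    # should eventually report Bound
kubectl delete pvc csi-smoke-test
```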
Using deploy-rs: `deploy .#mouse --auto-rollback=false`, for instance.
I host different services on my NixOS VMs.