ETCD CronJob Tools
Kubernetes CronJobs and Docker image for automated etcd snapshot backups and control-plane node config backups.
This README assumes you already know the role etcd plays in your Kubernetes cluster.
This repository provides a boiled-down, ready-to-use setup for etcd snapshot backups, defragmentation, and configuration backups of the control-plane nodes.
It is a complete, working example that you can adapt to your own environment.
The key changes you need to make are:
- Create a copy of each YAML manifest for each of your control-plane nodes;
- You need to set affinity to your control-plane nodes:

      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                      - k8s-cluster-ctrl-00

- The node config backup and the etcd backup must be deployed for each of your control-plane nodes. Backing up only one of them is not enough: when a node crashes, the likelihood of recovering it from another node's copies is poor. It might work, but it is not the best prospect, so if you are doing backups, make them for all nodes.
- Each control-plane node configured to run backups must mount a remote location to write the backups to. In my case I use NFS, but you can use any other method, as long as the pods are able to write to that location.
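
Because every control-plane node needs its own copy of the two backup manifests, the per-node copies can be generated from the `-00` files instead of being edited by hand. The sketch below assumes that every node-specific value (hostname, CronJob name, NFS sub-directory) contains the string `ctrl-00`, as in this repository, so a plain text substitution is enough. The function name `make_node_manifests` is hypothetical; review the generated files before applying them.

```shell
#!/bin/sh
# Sketch: derive per-node manifests from the ctrl-00 templates.
# Assumption: every node-specific value in a manifest contains "ctrl-00".

make_node_manifests() {
    # $1 = directory holding the *-00.yml manifests
    # remaining arguments = node suffixes (01, 02, ...)
    dir="$1"
    shift
    for idx in "$@"; do
        for tmpl in "$dir"/etcd-backup-cronjob-00.yml "$dir"/node-cfg-backup-cronjob-00.yml; do
            [ -f "$tmpl" ] || { echo "missing template: $tmpl" >&2; return 1; }
            # Strip the "-00.yml" suffix and append the new node suffix.
            out="${tmpl%-00.yml}-${idx}.yml"
            sed "s/ctrl-00/ctrl-${idx}/g" "$tmpl" > "$out"
            echo "wrote $out"
        done
    done
}

# Example: make_node_manifests . 01 02
```

The cleanup job is left out on purpose, since it is not pinned to a single node.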
The manifests reference images from my registry, which I will keep as up to date as I can. The repository also provides the complete recipe for building the container image in a GitLab pipeline, which you can adapt to whatever setup you have.
Components
Docker Image
Alpine-based image with etcdctl installed. Built via CI with a pinned ETCD_VERSION.
Default environment (can be overridden in CronJob spec):
| Variable | Default |
|---|---|
| `ETCDCTL_ENDPOINTS` | `https://127.0.0.1:2379` |
| `ETCDCTL_CACERT` | `/etc/kubernetes/pki/etcd/ca.crt` |
| `ETCDCTL_CERT` | `/etc/kubernetes/pki/etcd/healthcheck-client.crt` |
| `ETCDCTL_KEY` | `/etc/kubernetes/pki/etcd/healthcheck-client.key` |

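
`etcdctl` reads its connection flags from the matching `ETCDCTL_*` environment variables, which is why the image only needs these defaults to reach the local etcd member. As a rough illustration, a snapshot wrapper relying on them could look like the sketch below. The function name `etcd_snapshot` and the file naming scheme are assumptions made for this example, not necessarily what the image's own script does.

```shell
#!/bin/sh
# Sketch of a snapshot wrapper relying on the ETCDCTL_* defaults above.
# The ":=" expansions only assign when a variable is unset or empty,
# so values set in the CronJob spec take precedence.

etcd_snapshot() {
    # $1 = destination directory (for example the NFS backup path)
    dest_dir="$1"
    : "${ETCDCTL_ENDPOINTS:=https://127.0.0.1:2379}"
    : "${ETCDCTL_CACERT:=/etc/kubernetes/pki/etcd/ca.crt}"
    : "${ETCDCTL_CERT:=/etc/kubernetes/pki/etcd/healthcheck-client.crt}"
    : "${ETCDCTL_KEY:=/etc/kubernetes/pki/etcd/healthcheck-client.key}"
    export ETCDCTL_ENDPOINTS ETCDCTL_CACERT ETCDCTL_CERT ETCDCTL_KEY

    # Hypothetical naming scheme: UTC timestamp in the file name.
    snapshot="${dest_dir}/etcd-snapshot-$(date -u +%Y%m%dT%H%M%SZ).db"

    # Send etcdctl's own output to stderr so stdout carries only the path.
    etcdctl snapshot save "$snapshot" >&2 && echo "$snapshot"
}

# Example: etcd_snapshot /nfs_storage/backup/k8s-cluster/ctrl-00
```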
CronJobs
| File | Name | Schedule | What it does |
|---|---|---|---|
| `etcd-backup-cronjob-00.yml` | `etcd-backup-ctrl-00` | Every 6h | etcd snapshot → /nfs_storage/backup/k8s-cluster/ctrl-00/ |
| `node-cfg-backup-cronjob-00.yml` | `node-cfg-backup-ctrl-00` | Daily 01:00 | Tars /etc/kubernetes/{*.conf,manifests,pki} → same NFS path |
| `backup-cleanup-cronjob.yaml` | `backup-cleanup` | Every 6h | Deletes backups older than 2 days from /nfs_storage/backup/k8s-cluster/ |

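
The node config backup boils down to archiving the `*.conf` files plus the `manifests` and `pki` directories under `/etc/kubernetes`. A minimal sketch of that step is shown below; the function name `backup_node_cfg` and the archive naming are assumptions made for illustration, and the destination must be an absolute path.

```shell
#!/bin/sh
# Sketch: archive the kubeadm-managed files of one control-plane node.

backup_node_cfg() {
    # $1 = source directory (normally /etc/kubernetes)
    # $2 = destination directory, as an absolute path
    src="$1"
    dest="$2"
    archive="${dest}/node-cfg-$(date -u +%Y%m%dT%H%M%SZ).tar.gz"

    # Run in a subshell so the cd does not leak into the caller. Paths are
    # taken relative to $src, which keeps absolute paths out of the archive.
    (
        cd "$src" || exit 1
        tar -czf "$archive" ./*.conf manifests pki
    ) || return 1
    echo "$archive"
}

# Example: backup_node_cfg /etc/kubernetes /nfs_storage/backup/k8s-cluster/ctrl-00
```

Keep in mind that the archive contains private keys, so the backup location needs the same care as the node itself.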
All jobs run with hostNetwork: true and tolerate control-plane and memory-pressure taints. Timezone: Europe/Amsterdam.
Node affinity: `etcd-backup-ctrl-00` and `node-cfg-backup-ctrl-00` pin to `k8s-cluster-ctrl-00` via `kubernetes.io/hostname`. `backup-cleanup` runs on any control-plane node.
Backup retention: 2 days (enforced by the cleanup job).
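
A retention rule like this is usually a single `find` invocation. The sketch below is one way to express it, not necessarily the command used in `backup-cleanup-cronjob.yaml`; the function name `cleanup_backups` is hypothetical.

```shell
#!/bin/sh
# Sketch: delete backup files older than a retention window.

cleanup_backups() {
    # $1 = backup root, $2 = retention in days (defaults to 2)
    root="$1"
    days="${2:-2}"
    # -mmin counts minutes, so 2 days is exactly 2880 minutes.
    # -mtime +2 truncates ages to whole days and would only match
    # files that are at least 3 days old.
    find "$root" -type f -mmin "+$((days * 1440))" -print -delete
}

# Example: cleanup_backups /nfs_storage/backup/k8s-cluster/ 2
```

Since the job runs every 6 hours, a file can outlive the 2-day window by up to 6 hours before it is removed.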
CI/CD
Pipeline stages: lint → build → update-manifest
| Job | Trigger | Action |
|---|---|---|
| `yamllint` | push | Lints all *.yaml files |
| `etcd-backup-build` | manual | Builds and pushes etcd-backup:$ETCD_VERSION to the registry |
| `update-manifest` | manual (after build) | Updates image tag in etcd-backup-cronjob-00.yml and commits |

To release a new etcd version, update ETCD_VERSION in .gitlab-ci.yml, then run etcd-backup-build and update-manifest manually.
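
If you want to reproduce the tag update outside the pipeline, something along these lines should do. This is only a sketch: it assumes the manifest contains a line of the form `image: <registry path>/etcd-backup:<tag>`, and the function name `set_image_tag` is made up for this example. The actual `update-manifest` job may work differently.

```shell
#!/bin/sh
# Sketch: point the CronJob manifest at a newly built image tag.
# Assumption: the manifest has a line of the form
#   image: <registry path>/etcd-backup:<tag>

set_image_tag() {
    # $1 = manifest file, $2 = new tag (for example v3.6.5)
    manifest="$1"
    tag="$2"
    tmp="${manifest}.tmp"
    # On lines containing "image:", replace whatever follows "etcd-backup:"
    # up to the next whitespace or quote character.
    sed "/image:/ s|\(etcd-backup:\)[^[:space:]\"']*|\1${tag}|" "$manifest" > "$tmp" \
        && mv "$tmp" "$manifest"
}

# Example: set_image_tag etcd-backup-cronjob-00.yml v3.6.5
```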
Usage
Build image locally

    docker build --build-arg ETCD_VERSION=v3.6.5 -t etcd-backup:v3.6.5 .

Deploy CronJobs

    kubectl apply -f etcd-backup-cronjob-00.yml
    kubectl apply -f node-cfg-backup-cronjob-00.yml
    kubectl apply -f backup-cleanup-cronjob.yaml

Trigger a backup manually

    kubectl create job etcd-backup-manual --from=cronjob/etcd-backup-ctrl-00

Prerequisites
- NFS mount at `/nfs_storage/backup/k8s-cluster/` present on control-plane node(s)
- etcd PKI certs readable at `/etc/kubernetes/pki/etcd/` on the target node
- Registry credentials configured in GitLab CI (`CI_REGISTRY_USER`, `CI_REGISTRY_PASSWORD`)
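
The two node-level prerequisites can be checked on each control-plane node before deploying. The following is a minimal sketch with a hypothetical function name, `check_prereqs`. It only tests that the backup directory is writable and that the certificate files are readable; it does not verify that the directory is actually an NFS mount.

```shell
#!/bin/sh
# Sketch: verify the node-level prerequisites before deploying the CronJobs.

check_prereqs() {
    # $1 = backup directory, $2 = etcd PKI directory (both have defaults)
    backup_dir="${1:-/nfs_storage/backup/k8s-cluster/}"
    pki_dir="${2:-/etc/kubernetes/pki/etcd/}"
    rc=0

    if [ ! -d "$backup_dir" ] || [ ! -w "$backup_dir" ]; then
        echo "backup directory not writable: $backup_dir" >&2
        rc=1
    fi
    # These are the files referenced by the image's ETCDCTL_* defaults.
    for f in ca.crt healthcheck-client.crt healthcheck-client.key; do
        if [ ! -r "${pki_dir%/}/$f" ]; then
            echo "cannot read ${pki_dir%/}/$f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example: check_prereqs && echo "node is ready for backups"
```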


