Certificate Rotation
Purpose: Describes how operators rotate Kubernetes cluster certificates, SOPS Age keys, SSH deploy keys, and TLS certificates managed by cert-manager.
What Needs Rotation
| Asset | Managed By | Default Lifetime | Rotation Trigger |
|---|---|---|---|
| K8s component certs (API server, kubelet, etcd) | Kubespray | 1 year | Annual schedule or compromise |
| SOPS Age keys | openCenter CLI | No expiry | Personnel change, compromise, policy |
| SSH deploy keys | openCenter CLI | No expiry | Personnel change, compromise |
| TLS ingress certs | cert-manager + Let's Encrypt | 90 days | Automatic (cert-manager handles renewal) |
| etcd peer/client certs | Kubespray | 1 year | Annual schedule |
Prerequisites
- SSH access to control plane nodes (for Kubespray cert rotation)
- opencenter CLI with access to the cluster configuration
- kubectl access to the target cluster
- SOPS Age key file at ~/.config/sops/age/keys.txt
Rotate Kubernetes Certificates (Kubespray)
Kubespray generates certificates for all Kubernetes components. These expire after 1 year by default.
Check Certificate Expiry
# On a control plane node
sudo kubeadm certs check-expiration
# Or via openssl
for cert in /etc/kubernetes/ssl/*.crt; do
echo "$cert: $(openssl x509 -enddate -noout -in "$cert")"
done
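To turn the expiry listing into a pass/fail check, the loop can be wrapped in a function that prints only the certificates close to expiry. This is a minimal sketch: the function name expiring_certs and the 30-day default threshold are illustrative assumptions, not part of Kubespray or openCenter.

```shell
# Print certificates in a directory that expire within WARN_DAYS days.
# `openssl x509 -checkend N` exits non-zero when the certificate
# expires within N seconds. Files openssl cannot parse are also printed.
# expiring_certs and the 30-day default are illustrative assumptions.
expiring_certs() (
  dir="$1"
  warn_days="${2:-30}"
  for cert in "$dir"/*.crt; do
    [ -e "$cert" ] || continue   # skip if the glob matched nothing
    if ! openssl x509 -checkend "$((warn_days * 86400))" -noout -in "$cert" >/dev/null 2>&1; then
      echo "$cert"
    fi
  done
)
```

On a control plane node this could be called as expiring_certs /etc/kubernetes/ssl 30, run with sufficient privileges to read the certificate files. Empty output means nothing expires inside the window.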
Rotate Certificates
Run the Kubespray certificate renewal playbook:
cd <kubespray-directory>
ansible-playbook -i inventory/hosts.yaml \
cluster.yml \
--tags=renew_certs \
--become \
-e "{'cert_force_renew': true}"
This renews all certificates and restarts affected services. The renewal is rolling, one node at a time, so HA clusters (3+ control plane nodes) see no API downtime. Clusters with a single control plane node should expect a brief API interruption while the API server restarts.
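The three-node threshold follows from etcd quorum arithmetic: a cluster of N members needs floor(N/2) + 1 of them healthy, so it tolerates N minus that many members being down at once. The sketch below works through the numbers, assuming etcd runs on the control plane nodes; the function name etcd_fault_tolerance is hypothetical.

```shell
# Quorum arithmetic for an etcd cluster of N members:
#   quorum = floor(N/2) + 1, tolerated failures = N - quorum
# etcd_fault_tolerance is an illustrative helper, not a real tool.
etcd_fault_tolerance() {
  n="$1"
  quorum=$(( n / 2 + 1 ))
  echo "members=$n quorum=$quorum tolerated_failures=$(( n - quorum ))"
}

etcd_fault_tolerance 1   # members=1 quorum=1 tolerated_failures=0
etcd_fault_tolerance 3   # members=3 quorum=2 tolerated_failures=1
etcd_fault_tolerance 5   # members=5 quorum=3 tolerated_failures=2
```

With three members, restarting one node at a time leaves two healthy, which still meets the quorum of two.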
Verify After Rotation
# Check new expiry dates
sudo kubeadm certs check-expiration
# Verify API server is reachable
kubectl cluster-info
# Check all nodes rejoin
kubectl get nodes
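For scripted verification, the node listing can be filtered down to the nodes that have not rejoined. A minimal sketch, assuming the default kubectl get nodes column layout (NAME first, STATUS second); the function name not_ready_nodes is hypothetical.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print the
# names of nodes whose STATUS column does not start with "Ready".
# "Ready,SchedulingDisabled" counts as Ready here; adjust the pattern
# if cordoned nodes should also be flagged.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}
```

Usage: kubectl get nodes --no-headers | not_ready_nodes. Empty output means every node reports Ready.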
Rotate SOPS Age Keys
SOPS Age keys encrypt secrets in the GitOps repository. Rotate when team members leave or on a regular schedule.
Rotation Procedure
# 1. Backup current key
opencenter secrets keys backup
# 2. Rotate the local Age key and re-encrypt all SOPS files
opencenter secrets keys rotate --path .
# 3. Verify the rotation succeeded
opencenter secrets keys validate
The rotate command:
- Backs up the old key automatically
- Generates a new Age key pair
- Re-encrypts all SOPS-encrypted files under --path with the new key
- Updates .sops.yaml with the new public key
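As an independent check that re-encryption reached every file, you can search the repository for the old public key. SOPS records each Age recipient in the encrypted file's metadata, so a plain text search finds files still encrypted for it. This is a sketch, not a replacement for opencenter secrets keys validate; the function name files_with_recipient is hypothetical.

```shell
# List files under DIR that still mention the given Age public key.
# SOPS stores each Age recipient in the encrypted file's metadata
# (a "recipient" entry holding the age1... key), so a fixed-string
# search is sufficient. The .git directory is skipped.
files_with_recipient() {
  dir="$1"
  pubkey="$2"
  grep -rlF --exclude-dir=.git -- "$pubkey" "$dir" || true
}
```

Usage: files_with_recipient . <old-public-key>. After a completed rotation the output should be empty; during a dual-key period the old key is still expected to appear.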
Rotate Cluster-Specific Keys
For cluster encryption key rotation (dual-key transition):
# Start dual-key rotation (adds new key, keeps old key active)
opencenter secrets keys rotate --cluster <org>/<cluster> --type age
# After all nodes have the new key, complete the rotation (removes old key)
opencenter secrets keys rotate --cluster <org>/<cluster> --type age --complete
Commit and Push
cd <gitops-directory>
git add .sops.yaml secrets/ applications/
git commit -m "chore: rotate SOPS Age key"
git push
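FluxCD reconciles the re-encrypted secrets on its next sync. It can only decrypt them once the new Age key is in the cluster's SOPS secret, so deploy the key (see Distribute New Key to Cluster) before pushing; otherwise expect decryption errors until the key is in place.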
Distribute New Key to Cluster
Update the SOPS decryption secret in the cluster:
# Update the age key secret used by FluxCD for decryption
kubectl create secret generic sops-age \
--namespace=flux-system \
--from-file=age.agekey=<path-to-new-key> \
--dry-run=client -o yaml | kubectl apply -f -
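Before applying the secret, it can help to confirm that the key file you are about to deploy matches the public key listed in .sops.yaml. Key files written by age-keygen carry a "# public key:" comment above each secret key; the sketch below reads that comment. The function name age_public_keys is hypothetical, and the check assumes the key file was generated in that standard format.

```shell
# Print the public key(s) recorded in an Age key file.
# age-keygen writes a "# public key: age1..." comment above each
# secret key; this reads that comment rather than deriving the key.
age_public_keys() {
  sed -n 's/^# public key: //p' "$1"
}
```

Usage: grep -F "$(age_public_keys <path-to-new-key>)" .sops.yaml. If the comment is missing from the key file, age-keygen -y <path-to-new-key> derives the public key from the secret key instead.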
Rotate SSH Deploy Keys
SSH keys authenticate Git operations between FluxCD and the Git repository.
Generate New SSH Key
# Generate a new Ed25519 SSH key pair
opencenter secrets keys rotate --cluster <org>/<cluster> --type ssh
Or manually:
ssh-keygen -t ed25519 -C "<org>-<cluster>-<region>" -f ./new-deploy-key -N ""
Update Git Provider
- Add the new public key as a deploy key in your Git provider (GitHub, GitLab, Gitea)
- Grant read access (or read/write if FluxCD needs to push)
Update Cluster Secret
# Update the FluxCD SSH secret
kubectl create secret generic flux-system \
--namespace=flux-system \
--from-file=identity=./new-deploy-key \
--from-file=identity.pub=./new-deploy-key.pub \
--from-file=known_hosts=./known_hosts \
--dry-run=client -o yaml | kubectl apply -f -
# Restart source-controller to pick up new key
kubectl rollout restart deployment/source-controller -n flux-system
Remove Old Key
After verifying FluxCD reconciles with the new key:
# Verify FluxCD can fetch with new key
flux get sources git -n flux-system
# Remove old deploy key from Git provider
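When several deploy keys are registered, fingerprints are a reliable way to tell the old key from the new one before deleting anything. The sketch below prints a key's SHA256 fingerprint in the form many Git providers display next to each deploy key; the function name key_fingerprint is hypothetical.

```shell
# Print the SHA256 fingerprint of an SSH public key file.
# `ssh-keygen -l` prints "<bits> <fingerprint> <comment> (<type>)";
# the second field is the fingerprint.
key_fingerprint() {
  ssh-keygen -l -E sha256 -f "$1" | awk '{ print $2 }'
}
```

Usage: key_fingerprint ./new-deploy-key.pub. Keep the deploy key whose fingerprint matches this output and remove the other.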
TLS Certificate Rotation (cert-manager)
cert-manager handles the TLS certificate lifecycle automatically. By default it renews a certificate once two thirds of its lifetime has passed, so 90-day Let's Encrypt certificates renew about 30 days before expiry.
Check Certificate Status
# List all certificates and their status
kubectl get certificates -A
# Check a specific certificate
kubectl describe certificate <name> -n <namespace>
# Check certificate expiry dates
kubectl get certificates -A -o custom-columns=\
NAMESPACE:.metadata.namespace,\
NAME:.metadata.name,\
READY:.status.conditions[0].status,\
NOT_AFTER:.status.notAfter,\
RENEWAL:.status.renewalTime
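On clusters with many certificates, it is easier to list only the ones that are not Ready. A minimal sketch, assuming the default column layout of kubectl get certificates -A (NAMESPACE, NAME, READY, SECRET, AGE); the function name not_ready_certs is hypothetical.

```shell
# Read `kubectl get certificates -A --no-headers` output on stdin and
# print "namespace/name" for every Certificate whose READY column is
# not "True". Columns assumed: NAMESPACE NAME READY SECRET AGE.
not_ready_certs() {
  awk '$3 != "True" { print $1 "/" $2 }'
}
```

Usage: kubectl get certificates -A --no-headers | not_ready_certs. Each line of output names a Certificate worth inspecting with kubectl describe.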
Force Certificate Renewal
If a certificate needs immediate renewal:
# Delete the certificate secret to trigger re-issuance
kubectl delete secret <cert-secret-name> -n <namespace>
# cert-manager will detect the missing secret and re-issue.
# Until re-issuance completes, the secret does not exist for workloads that reference it.
# Monitor the Certificate resource
kubectl describe certificate <name> -n <namespace>
Rotate ClusterIssuer Credentials
When rotating the DNS validation credentials (e.g., AWS Route53 keys for DNS-01 challenges):
# 1. Update credentials in cluster configuration
opencenter cluster edit <cluster-name>
# 2. Re-encrypt secrets
opencenter secrets sync
# 3. Push changes — FluxCD will reconcile the new secret
git add . && git commit -m "chore: rotate cert-manager credentials" && git push
Zero-Downtime Approach
| Asset | Zero-Downtime Strategy |
|---|---|
| K8s certs | Kubespray performs rolling renewal; HA clusters maintain quorum |
| SOPS Age keys | Dual-key period: old key remains valid until --complete |
| SSH deploy keys | Add new key to Git provider before removing old key |
| TLS certs | cert-manager renews before expiry; no gap in coverage |
| etcd certs | Rolling restart via Kubespray; etcd maintains quorum |
Verification Checklist
After any rotation:
# 1. API server accessible
kubectl cluster-info
# 2. All nodes Ready
kubectl get nodes
# 3. FluxCD reconciling
flux get kustomizations -A --status-selector ready=false
# 4. No certificate errors in logs
kubectl logs -n cert-manager deploy/cert-manager --since=10m | grep -i error
# 5. SOPS decryption works
opencenter secrets keys validate
# 6. Secrets sync from Git
flux reconcile kustomization flux-system --with-source
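The checklist can be scripted with a small helper that runs each command, prints PASS or FAIL, and counts failures. This is a sketch under stated assumptions: run_check and failures are hypothetical names, and it only suits commands whose exit status reflects success, which is assumed here for opencenter secrets keys validate.

```shell
# Run a named check, print PASS or FAIL, and count failures.
# A check passes when its command exits with status 0.
# run_check and failures are illustrative names.
failures=0
run_check() {
  name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $name"
  else
    echo "FAIL  $name"
    failures=$((failures + 1))
  fi
}

# Example calls (commented out so the helper can be sourced safely):
# run_check "API server accessible" kubectl cluster-info
# run_check "SOPS decryption works" opencenter secrets keys validate
# [ "$failures" -eq 0 ] || echo "$failures check(s) failed"
```

Call run_check directly rather than inside a command substitution, since a subshell would discard the updated failure count. Checks that report problems through their output instead of their exit status, such as the flux get kustomizations listing, still need to be read by hand.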
Troubleshooting
- API server unreachable after cert rotation: Check that the kubeconfig references the correct CA. Re-export: kubectl config view --raw.
- FluxCD fails to fetch after SSH key rotation: Verify the new public key is added as a deploy key and known_hosts is correct for the Git host.
- SOPS decryption fails after Age key rotation: Ensure the new key is deployed to the cluster's sops-age secret in the flux-system namespace. Check kubectl logs -n flux-system deploy/kustomize-controller.
- cert-manager fails to renew: Check issuer status: kubectl describe clusterissuer letsencrypt-prod. Verify DNS credentials and rate limits.