Certificate Rotation

Purpose: This guide documents, for operators, how to rotate Kubernetes cluster certificates, SOPS Age keys, SSH deploy keys, and TLS certificates managed by cert-manager.

What Needs Rotation

| Asset | Managed By | Default Lifetime | Rotation Trigger |
|---|---|---|---|
| K8s component certs (API server, kubelet, etcd) | Kubespray | 1 year | Annual schedule or compromise |
| SOPS Age keys | openCenter CLI | No expiry | Personnel change, compromise, policy |
| SSH deploy keys | openCenter CLI | No expiry | Personnel change, compromise |
| TLS ingress certs | cert-manager + Let's Encrypt | 90 days | Automatic (cert-manager handles renewal) |
| etcd peer/client certs | Kubespray | 1 year | Annual schedule |

Prerequisites

  • SSH access to control plane nodes (for Kubespray cert rotation)
  • opencenter CLI with access to the cluster configuration
  • kubectl access to the target cluster
  • SOPS Age key file at ~/.config/sops/age/keys.txt

Rotate Kubernetes Certificates (Kubespray)

Kubespray generates certificates for all Kubernetes components. These expire after 1 year by default.

Check Certificate Expiry

# On a control plane node
sudo kubeadm certs check-expiration

# Or via openssl
for cert in /etc/kubernetes/ssl/*.crt; do
  echo "$cert: $(openssl x509 -enddate -noout -in "$cert")"
done
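
The loop above prints every expiry date, which still has to be read by eye. A minimal sketch of an automated check follows, assuming the certificates live in /etc/kubernetes/ssl as in the loop above. The function name certs_expiring_within is a hypothetical helper, not part of Kubespray or kubeadm.

```shell
# Print every *.crt in DIR that expires within DAYS days.
# Returns 1 if at least one such certificate is found, 0 otherwise.
certs_expiring_within() {
  local dir="$1" days="$2" cert found=0
  local seconds=$(( days * 86400 ))
  for cert in "$dir"/*.crt; do
    [ -e "$cert" ] || continue   # the glob matched nothing
    # -checkend exits non-zero when the certificate expires within
    # $seconds (or when the file cannot be read as a certificate)
    if ! openssl x509 -checkend "$seconds" -noout -in "$cert" >/dev/null 2>&1; then
      echo "EXPIRING: $cert ($(openssl x509 -enddate -noout -in "$cert" 2>/dev/null))"
      found=1
    fi
  done
  return "$found"
}
```

On a control plane node, certs_expiring_within /etc/kubernetes/ssl 30 lists the certificates due within the next 30 days; the 30-day window is only an example threshold.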

Rotate Certificates

Run the Kubespray certificate renewal playbook:

cd <kubespray-directory>

ansible-playbook -i inventory/hosts.yaml \
  cluster.yml \
  --tags=renew_certs \
  --become \
  -e "{'cert_force_renew': true}"

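This renews all certificates and restarts the affected services. Kubespray works through the nodes one at a time, so HA clusters (3+ control plane nodes) keep the API available throughout. On clusters with fewer control plane nodes, expect a brief API interruption while services restart.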

Verify After Rotation

# Check new expiry dates
sudo kubeadm certs check-expiration

# Verify API server is reachable
kubectl cluster-info

# Check that all nodes have rejoined and report Ready
kubectl get nodes
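
For scripted runs, the node check above can be reduced to a pass/fail result. The sketch below is one way to do it. not_ready_nodes is a hypothetical helper name, and it assumes the default column layout of kubectl get nodes --no-headers (node name first, status second).

```shell
# Read `kubectl get nodes --no-headers` output on stdin.
# Print the name of every node whose STATUS is not exactly "Ready"
# and exit non-zero if any such node exists.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1; bad = 1 } END { exit bad }'
}
```

Piping kubectl get nodes --no-headers into not_ready_nodes prints nothing and exits 0 when every node is Ready. A cordoned node shows as Ready,SchedulingDisabled and is therefore reported as well.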

Rotate SOPS Age Keys

SOPS Age keys encrypt secrets in the GitOps repository. Rotate them when a team member with key access leaves, when a key may have been compromised, or on the schedule your policy sets.

Rotation Procedure

# 1. Backup current key
opencenter secrets keys backup

# 2. Rotate the local Age key and re-encrypt all SOPS files
opencenter secrets keys rotate --path .

# 3. Verify the rotation succeeded
opencenter secrets keys validate

The rotate command:

  1. Backs up the old key automatically
  2. Generates a new Age key pair
  3. Re-encrypts all SOPS-encrypted files under --path with the new key
  4. Updates .sops.yaml with the new public key
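
After the rotation, every encrypted file should list the new public key as a recipient. The sketch below is one way to spot files the re-encryption missed. It assumes YAML-format SOPS files, which record recipients under a top-level sops: key as "recipient: age1...". files_missing_recipient is a hypothetical helper, not an opencenter command.

```shell
# List SOPS-encrypted YAML files under DIR that do not name PUBKEY
# as an Age recipient. Prints nothing when all files are covered.
files_missing_recipient() {
  local dir="$1" pubkey="$2" f
  # Files with a top-level "sops:" key are treated as SOPS-encrypted.
  { grep -rl --include='*.yaml' --include='*.yml' '^sops:' "$dir" 2>/dev/null || true; } |
    while IFS= read -r f; do
      grep -qF "recipient: $pubkey" "$f" || printf '%s\n' "$f"
    done
}
```

With the new public key taken from .sops.yaml, running files_missing_recipient . <new-public-key> from the repository root should print nothing.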

Rotate Cluster-Specific Keys

For cluster encryption key rotation (dual-key transition):

# Start dual-key rotation (adds new key, keeps old key active)
opencenter secrets keys rotate --cluster <org>/<cluster> --type age

# After all nodes have the new key, complete the rotation (removes old key)
opencenter secrets keys rotate --cluster <org>/<cluster> --type age --complete

Commit and Push

cd <gitops-directory>
git add .sops.yaml secrets/ applications/
git commit -m "chore: rotate SOPS Age key"
git push

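FluxCD reconciles the re-encrypted secrets on its next sync, but it can decrypt them only once the new Age key is in the cluster's SOPS secret. Deploy the key as described under Distribute New Key to Cluster right after pushing; until the repository and the cluster key match, kustomize-controller reports decryption errors.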

Distribute New Key to Cluster

Update the SOPS decryption secret in the cluster:

# Update the age key secret used by FluxCD for decryption
kubectl create secret generic sops-age \
--namespace=flux-system \
--from-file=age.agekey=<path-to-new-key> \
--dry-run=client -o yaml | kubectl apply -f -
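
To confirm that the cluster holds the same key that re-encrypted the repository, it is safer to compare public keys than private key material. A minimal sketch, assuming the key file keeps the "# public key:" comment line that age-keygen writes; age_public_key and same_age_public_key are hypothetical helper names.

```shell
# Print the public key recorded in an Age identity file read from stdin.
# Prints nothing if the "# public key:" comment is absent.
age_public_key() {
  sed -n 's/^# public key: //p'
}

# Succeed when two identity files record the same, non-empty public key.
same_age_public_key() {
  local a b
  a=$(age_public_key < "$1")
  b=$(age_public_key < "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For the cluster side, kubectl get secret sops-age -n flux-system -o jsonpath='{.data.age\.agekey}' | base64 -d | age_public_key prints the public half of the deployed key, which should match the key listed in .sops.yaml.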

Rotate SSH Deploy Keys

SSH keys authenticate Git operations between FluxCD and the Git repository.

Generate New SSH Key

# Generate a new Ed25519 SSH key pair
opencenter secrets keys rotate --cluster <org>/<cluster> --type ssh

Or manually:

ssh-keygen -t ed25519 -C "<org>-<cluster>-<region>" -f ./new-deploy-key -N ""

Update Git Provider

  1. Add the new public key as a deploy key in your Git provider (GitHub, GitLab, Gitea)
  2. Grant read access (or read/write if FluxCD needs to push)

Update Cluster Secret

# Update the FluxCD SSH secret
kubectl create secret generic flux-system \
--namespace=flux-system \
--from-file=identity=./new-deploy-key \
--from-file=identity.pub=./new-deploy-key.pub \
--from-file=known_hosts=./known_hosts \
--dry-run=client -o yaml | kubectl apply -f -

# Restart source-controller to pick up new key
kubectl rollout restart deployment/source-controller -n flux-system
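
The secret above reads ./known_hosts, which has to contain the Git host's SSH host key. One common way to produce it is ssh-keyscan <git-host> > ./known_hosts, followed by checking the fingerprint against the one your Git provider publishes. The sketch below checks that the file actually names the host before it is uploaded. known_hosts_has_host is a hypothetical helper, and it only understands unhashed entries (the ssh-keyscan default).

```shell
# Succeed if FILE contains an unhashed known_hosts entry for HOST.
known_hosts_has_host() {
  local file="$1" host="$2"
  awk -v h="$host" '
    /^[[:space:]]*(#|$)/ { next }   # skip comments and blank lines
    {
      # the first field is a comma-separated list of host names/addresses
      n = split($1, names, ",")
      for (i = 1; i <= n; i++) if (names[i] == h) found = 1
    }
    END { exit !found }
  ' "$file"
}
```

For example, known_hosts_has_host ./known_hosts github.com returns 0 only when an entry for github.com is present.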

Remove Old Key

After verifying FluxCD reconciles with the new key:

# Verify FluxCD can fetch with new key
flux get sources git -n flux-system

# Remove old deploy key from Git provider

TLS Certificate Rotation (cert-manager)

cert-manager handles the TLS certificate lifecycle automatically. By default it renews a certificate once two thirds of its lifetime has elapsed, so 90-day Let's Encrypt certificates renew 30 days before expiry.

Check Certificate Status

# List all certificates and their status
kubectl get certificates -A

# Check a specific certificate
kubectl describe certificate <name> -n <namespace>

# Check certificate expiry dates
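# (the expression is quoted so the shell does not treat the brackets as a glob,
# and READY selects the Ready condition by type instead of by position)
kubectl get certificates -A -o custom-columns="\
NAMESPACE:.metadata.namespace,\
NAME:.metadata.name,\
READY:.status.conditions[?(@.type==\"Ready\")].status,\
NOT_AFTER:.status.notAfter,\
RENEWAL:.status.renewalTime"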
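
The NOT_AFTER column is an RFC 3339 timestamp, which is awkward to compare against a threshold. The sketch below converts it to a number of days remaining. days_until is a hypothetical helper, and it assumes GNU date (the -d option is not available in every date implementation).

```shell
# Print the whole days between NOW and an RFC 3339 timestamp.
# $1: timestamp, e.g. the value of .status.notAfter
# $2: optional "now" in epoch seconds (defaults to the current time)
days_until() {
  local target now
  target=$(date -u -d "$1" +%s) || return 1
  now="${2:-$(date -u +%s)}"
  # integer division: partial days are truncated toward zero
  echo $(( (target - now) / 86400 ))
}
```

A typical call would be days_until "$(kubectl get certificate <name> -n <namespace> -o jsonpath='{.status.notAfter}')".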

Force Certificate Renewal

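If a certificate needs immediate renewal, delete its secret so that cert-manager re-issues it. If the cmctl CLI is installed, cmctl renew <name> -n <namespace> triggers re-issuance without deleting the secret.

# Delete the certificate secret to trigger re-issuance
# (until re-issuance completes, anything reading this secret has no certificate to serve)
kubectl delete secret <cert-secret-name> -n <namespace>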

# cert-manager will detect the missing secret and re-issue
# Monitor the Certificate resource
kubectl describe certificate <name> -n <namespace>
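
Rather than re-running describe until the status changes, kubectl wait can block until the Certificate reports the Ready condition. A minimal sketch; wait_for_certificate and its 5m default timeout are illustrative choices.

```shell
# Block until the named Certificate is Ready, or fail after TIMEOUT.
# $1: Certificate name, $2: namespace, $3: optional timeout (default 5m)
wait_for_certificate() {
  local name="$1" namespace="$2" timeout="${3:-5m}"
  kubectl wait --for=condition=Ready "certificate/$name" \
    --namespace "$namespace" --timeout="$timeout"
}
```

For example, wait_for_certificate <name> <namespace> 10m returns non-zero if the certificate is still not Ready after ten minutes, which is the point at which to inspect the issuer and cert-manager logs.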

Rotate ClusterIssuer Credentials

When rotating the DNS validation credentials (e.g., AWS Route53 keys for DNS-01 challenges):

# 1. Update credentials in cluster configuration
opencenter cluster edit <cluster-name>

# 2. Re-encrypt secrets
opencenter secrets sync

# 3. Push changes — FluxCD will reconcile the new secret
git add . && git commit -m "chore: rotate cert-manager credentials" && git push

Zero-Downtime Approach

| Asset | Zero-Downtime Strategy |
|---|---|
| K8s certs | Kubespray performs rolling renewal; HA clusters maintain quorum |
| SOPS Age keys | Dual-key period: old key remains valid until --complete |
| SSH deploy keys | Add new key to Git provider before removing old key |
| TLS certs | cert-manager renews before expiry; no gap in coverage |
| etcd certs | Rolling restart via Kubespray; etcd maintains quorum |

Verification Checklist

After any rotation:

# 1. API server accessible
kubectl cluster-info

# 2. All nodes Ready
kubectl get nodes

# 3. FluxCD reconciling (lists only kustomizations that are NOT ready; expect no rows)
flux get kustomizations -A --status-selector ready=false

# 4. No certificate errors in logs
kubectl logs -n cert-manager deploy/cert-manager --since=10m | grep -i error

# 5. SOPS decryption works
opencenter secrets keys validate

# 6. Secrets sync from Git
flux reconcile kustomization flux-system --with-source
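
The checklist can also be run as a script that reports each step. The sketch below is one possible wrapper. run_check is a hypothetical helper that treats a zero exit status as a pass, so it only confirms that a command ran successfully; steps whose output has to be read (2, 3, and 4) still need a look.

```shell
# Run a command quietly and report PASS or FAIL with a description.
# usage: run_check "description" command [args...]
run_check() {
  local desc="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc"
    return 1
  fi
}
```

For example, run_check "API server accessible" kubectl cluster-info followed by run_check "SOPS decryption works" opencenter secrets keys validate gives a one-line result per step.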

Troubleshooting

  • API server unreachable after cert rotation: Check that the kubeconfig references the correct CA. Print the active configuration, including its embedded certificate data, with kubectl config view --raw.
  • FluxCD fails to fetch after SSH key rotation: Verify that the new public key is added as a deploy key and that known_hosts is correct for the Git host.
  • SOPS decryption fails after Age key rotation: Ensure the new key is deployed to the cluster's sops-age secret in the flux-system namespace. Check kubectl logs -n flux-system deploy/kustomize-controller.
  • cert-manager fails to renew: Check issuer status with kubectl describe clusterissuer letsencrypt-prod. Verify DNS credentials and rate limits.