Uninstall / Decommission
Purpose: Shows operators how to safely remove openCenter components and fully decommission a cluster, including infrastructure teardown and Git cleanup.
Prerequisites
- Cluster admin access (kubectl configured)
- SSH access to nodes (for drain/cleanup)
- Terraform/OpenTofu state accessible (for infrastructure destruction)
- Git access to the customer GitOps repository
Decommission Checklist
Complete this checklist in order. Every step is destructive, and none can be undone once the infrastructure is torn down.
- Notify stakeholders and schedule maintenance window
- Back up cluster state (Velero full backup)
- Export any data from PersistentVolumes that must be retained
- Remove FluxCD (stops reconciliation)
- Remove platform services
- Cordon and drain nodes
- Remove Kubernetes
- Destroy infrastructure (VMs, networks, load balancers)
- Clean up Git repositories
- Revoke secrets and credentials
Step 1 — Backup Before Decommission
# Create a final backup
velero backup create final-backup-$(date +%Y%m%d) \
  --include-namespaces '*' \
  --wait
# Verify backup completed
velero backup describe final-backup-$(date +%Y%m%d)
# Export the backup metadata
opencenter cluster validate --generate-debug-config
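The commands above evaluate `$(date +%Y%m%d)` twice, so a run that crosses midnight would describe a backup name that was never created. A minimal sketch of one way to avoid that, assuming `velero backup describe` prints a `Phase:` line; the variable `BACKUP_NAME` and the helper `backup_completed` are hypothetical names, not part of Velero or openCenter:

```shell
# Compute the name once so create and describe always refer to the same backup.
BACKUP_NAME="final-backup-$(date +%Y%m%d)"

# Hypothetical helper: reads `velero backup describe` output on stdin and
# succeeds only if it reports the Completed phase.
backup_completed() {
  grep -Eq '^Phase:[[:space:]]+Completed[[:space:]]*$'
}

# Usage (requires velero; commented out so the block has no side effects):
# velero backup create "$BACKUP_NAME" --include-namespaces '*' --wait
# velero backup describe "$BACKUP_NAME" | backup_completed \
#   || echo "backup $BACKUP_NAME did not complete; stop here" >&2
```

Stopping the decommission when the phase is anything other than `Completed` (for example `PartiallyFailed`) keeps later destructive steps from running without a usable backup.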
Step 2 — Remove FluxCD
Suspending FluxCD prevents it from recreating resources you delete.
# Suspend all Kustomizations
flux suspend kustomization --all -n flux-system
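# Delete FluxCD Kustomizations. Deleting a Kustomization that has prune enabled
# and is not suspended also deletes the resources it manages. kubectl delete has
# no --prune flag: keep the Kustomizations suspended (as above) or set
# spec.prune: false on each one before deleting to avoid cascading deletes.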
kubectl delete kustomizations.kustomize.toolkit.fluxcd.io --all -n flux-system
# Remove FluxCD controllers
flux uninstall --silent
# Verify
kubectl get namespace flux-system
# Should show Terminating or NotFound
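Before deleting the Kustomizations, it can help to list which ones would take their managed resources with them. The sketch below assumes the kustomize-controller behaviour that managed objects are garbage-collected on deletion only when `spec.prune` is true and the Kustomization is not suspended; `cascading_kustomizations` is a hypothetical helper name:

```shell
# Reads "NAME SUSPEND PRUNE" rows on stdin and prints the names whose deletion
# would also delete managed resources (prune enabled and not suspended).
cascading_kustomizations() {
  awk '$3 == "true" && $2 != "true" { print $1 }'
}

# Usage (requires kubectl; commented out so the block has no side effects):
# kubectl get kustomizations.kustomize.toolkit.fluxcd.io -n flux-system --no-headers \
#   -o custom-columns=NAME:.metadata.name,SUSPEND:.spec.suspend,PRUNE:.spec.prune \
#   | cascading_kustomizations
```

An empty result after `flux suspend kustomization --all` suggests the delete will not cascade; any name printed deserves a second look first.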
Step 3 — Remove Platform Services
Remove services in reverse dependency order:
# Remove monitoring stack
kubectl delete namespace monitoring --timeout=120s
# Remove Keycloak
kubectl delete namespace keycloak --timeout=120s
# Remove cert-manager
kubectl delete namespace cert-manager --timeout=120s
# Remove Kyverno
kubectl delete namespace kyverno --timeout=120s
# Remove network services
kubectl delete namespace metallb-system --timeout=120s 2>/dev/null || true
# Remove remaining platform namespaces
for ns in $(kubectl get namespaces -l opencenter.io/managed=true -o name); do
  kubectl delete "$ns" --timeout=120s
done
# Remove CRDs installed by platform services
# (-r is a GNU xargs option: do not run kubectl delete if nothing matches)
kubectl get crds -o name | grep -E '(fluxcd|cert-manager|kyverno|metallb)' | xargs -r kubectl delete
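The CRD pattern is a substring match, so it is worth previewing what it selects before deleting anything. A hedged sketch of a filter that can be used for both the preview and the delete; `platform_crds` is a hypothetical helper name, and `-r` is a GNU xargs option:

```shell
# Reads CRD names (one per line) on stdin and prints those matching the
# platform-service pattern. Exits 0 even when nothing matches.
platform_crds() {
  grep -E '(fluxcd|cert-manager|kyverno|metallb)' || true
}

# Usage (requires kubectl; commented out so the block has no side effects):
# kubectl get crds -o name | platform_crds                            # preview
# kubectl get crds -o name | platform_crds | xargs -r kubectl delete  # delete
```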
Step 4 — Drain Nodes
# Cordon all worker nodes
kubectl get nodes -l '!node-role.kubernetes.io/control-plane' -o name | \
  xargs -I{} kubectl cordon {}
# Drain workers (evicts pods)
kubectl get nodes -l '!node-role.kubernetes.io/control-plane' -o name | \
  xargs -I{} kubectl drain {} --ignore-daemonsets --delete-emptydir-data --timeout=300s
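A loop can make it easier to see which node failed to drain and to stop before Step 5. A minimal sketch using the same selector and drain flags as above; `drain_workers` and the `KUBECTL` override variable are hypothetical, introduced only so the function can be exercised with a stub:

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Cordon and drain every non-control-plane node, one at a time.
# Returns non-zero if any cordon or drain fails, naming nodes that fail to drain.
drain_workers() {
  failed=0
  for node in $("$KUBECTL" get nodes -l '!node-role.kubernetes.io/control-plane' -o name); do
    "$KUBECTL" cordon "$node" || failed=1
    if ! "$KUBECTL" drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=300s; then
      echo "drain failed: $node" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Usage: drain_workers || echo "resolve failed drains before removing Kubernetes" >&2
```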
Step 5 — Remove Kubernetes
On each node (or via Ansible from the bastion):
# Reset kubeadm
kubeadm reset -f
# Remove Kubernetes packages
apt-get remove -y kubeadm kubelet kubectl 2>/dev/null || \
yum remove -y kubeadm kubelet kubectl 2>/dev/null || true
# Clean up directories
rm -rf /etc/kubernetes /var/lib/kubelet /var/lib/etcd /etc/cni /opt/cni
rm -rf /var/lib/calico /var/run/calico
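# Flush iptables rules (this clears every rule in the filter, nat, and mangle
# tables, not only the ones Kubernetes created)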
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
Alternatively, run Kubespray's reset playbook (from the bastion):
cd ~/prod-cluster-gitops/infrastructure/clusters/prod-cluster
ansible-playbook -i inventory/hosts.yaml \
  kubespray/reset.yml \
  --become --become-user=root
Step 6 — Destroy Infrastructure
For OpenStack (OpenTofu-managed):
cd ~/prod-cluster-gitops/infrastructure/clusters/prod-cluster
# Review what will be destroyed
tofu plan -destroy
# Destroy (type 'yes' when prompted)
tofu destroy
This removes:
- VMs (control plane, workers, bastion)
- Networks, subnets, routers
- Security groups
- Floating IPs
- Load balancers (Octavia)
- Volumes
For VMware (pre-provisioned VMs), decommission VMs through vCenter or your VM lifecycle tool.
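For the OpenTofu path, note that `tofu destroy` computes a fresh plan when it runs, which may differ from what `tofu plan -destroy` showed earlier. One option is to save the destroy plan to a file and apply exactly that file. A sketch, where `plan_destroy`, `apply_destroy`, the `TOFU` override variable, and the `destroy.tfplan` file name are all illustrative assumptions:

```shell
TOFU="${TOFU:-tofu}"

# Write the destroy plan to a file so it can be reviewed before anything is removed.
plan_destroy() {
  "$TOFU" plan -destroy -out="${1:-destroy.tfplan}"
}

# Apply a previously saved plan file.
# Applying a saved plan does not prompt for confirmation.
apply_destroy() {
  "$TOFU" apply "${1:-destroy.tfplan}"
}

# Usage (from the cluster directory; commented out so the block has no side effects):
# plan_destroy && "$TOFU" show destroy.tfplan
# apply_destroy
```

Because applying a saved plan skips the interactive prompt, the review of the plan file is the only confirmation step in this variant.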
Step 7 — Clean Up Git Repositories
# Option A: Delete the customer GitOps repository entirely
# (if it was cluster-specific)
gh repo delete myorg/prod-cluster-gitops --yes
# Option B: Remove the cluster overlay from a shared repo
cd ~/shared-gitops
git checkout -b decommission/prod-cluster
rm -rf applications/overlays/prod-cluster
rm -rf infrastructure/clusters/prod-cluster
git add . && git commit -m "chore: decommission prod-cluster"
git push -u origin decommission/prod-cluster
# Open PR and merge
Step 8 — Revoke Secrets and Credentials
# Revoke the SOPS Age key (mark as compromised/retired)
# The key at ~/.config/opencenter/clusters/secrets/<org>/<cluster>/age/keys/ can be deleted
# Revoke OpenStack application credentials
openstack application credential delete prod-cluster-cred
# Revoke SSH keys
# Remove the cluster SSH key from authorized_keys on any shared infrastructure
# Revoke Git deploy keys
# Remove from repository settings (GitHub/GitLab)
# Remove local cluster configuration
opencenter cluster delete prod-cluster
Verification
After decommission:
# Verify VMs are gone (OpenStack)
openstack server list | grep prod-cluster
# Should return empty
# Verify networks are gone
openstack network list | grep prod-cluster
# Should return empty
# Verify DNS records removed
dig +short '*.prod-cluster.example.com'
# Should return empty output (run dig without +short to see "status: NXDOMAIN")
# Verify local config cleaned
opencenter cluster list | grep prod-cluster
# Should not appear
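The `grep` checks above can be wrapped so that a leftover resource produces a non-zero exit status, which is convenient when the verification runs from a script. A minimal sketch; `expect_absent` is a hypothetical helper name:

```shell
# Reads a listing on stdin. Fails and prints the matching lines if PATTERN
# still appears; otherwise prints a one-line confirmation.
expect_absent() {
  label=$1
  pattern=$2
  if matches=$(grep -- "$pattern"); then
    printf 'LEFTOVER %s:\n%s\n' "$label" "$matches" >&2
    return 1
  fi
  printf 'ok: no %s matching %s\n' "$label" "$pattern"
}

# Usage (requires the respective CLIs; commented out so the block has no side effects):
# openstack server list   | expect_absent servers  prod-cluster
# openstack network list  | expect_absent networks prod-cluster
# opencenter cluster list | expect_absent "local config" prod-cluster
```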