Managed Kafka (Streaming Blueprint)

Purpose: Explains the Streaming blueprint for platform engineers and application developers: how Apache Kafka is operated as a managed platform service with a GitOps lifecycle, built-in observability, and security by default.

Overview

openCenter Managed Kafka delivers Apache Kafka as a platform service using the Strimzi operator. Kafka clusters, topics, users, and ACLs are declared in Git and reconciled by FluxCD. An upgrade is a version bump in YAML; the operator then performs the rolling restart and health checks automatically.

Status: Limited Availability (first paying customers)

Architecture

Deployment Flow

  1. Declare — Kafka CR, topics, users defined in cluster overlay Git repository
  2. Reconcile — FluxCD applies manifests; Strimzi operators provision resources
  3. Secure — TLS certificates issued by cert-manager; secrets SOPS-encrypted in Git
  4. Monitor — Prometheus scrapes JMX metrics; Grafana dashboards deploy automatically
  5. Operate — Rolling upgrades, partition reassignment, backup via standard day-2 operations
  6. Scale — Add brokers via CR spec change; Cruise Control handles partition rebalancing
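
As a rough illustration of step 6, the sketch below shows what a Cruise Control rebalance request might look like after the broker node pool has grown by one replica. The cluster name `my-cluster`, the `kafka` namespace, and broker ID `6` are assumptions for illustration; the real ID of a newly added broker can be read from the node pool's `status.nodeIds`. The `apiVersion` shown depends on the installed Strimzi release.

```yaml
# Hypothetical example: move partitions onto a newly added broker.
# Assumes a cluster named "my-cluster" with Cruise Control enabled in the Kafka CR.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  name: add-broker-rebalance
  namespace: kafka                      # assumed namespace
  labels:
    strimzi.io/cluster: my-cluster      # ties the request to the Kafka cluster
  annotations:
    # Optional: let the operator apply the proposal without a manual approval step.
    strimzi.io/rebalance-auto-approval: "true"
spec:
  mode: add-brokers                     # only move replicas onto the listed brokers
  brokers: [6]                          # assumed ID of the new broker
```

Without the auto-approval annotation, the generated proposal waits for an explicit `strimzi.io/rebalance=approve` annotation, which may suit teams that want a review step before data moves.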

What You Get

  • Multi-broker Kafka clusters with KRaft mode (no ZooKeeper dependency)
  • Node pools for workload isolation (broker vs. controller roles)
  • Rolling upgrades with automatic health checks
  • Rack awareness for fault-domain distribution
  • Kafka Connect clusters with OCI-based plugin management
  • MirrorMaker 2 for cross-cluster replication
  • HTTP Bridge (REST API for Kafka operations)
  • MQTT Bridge (one-way MQTT 3.1.1 ingestion to Kafka topics)
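
A minimal sketch of how the first few items above (KRaft, node pools, rack awareness) could be declared with Strimzi resources. Names, sizes, the `kafka` namespace, and the Kafka version are placeholders, not values taken from the blueprint; the version must be one that the installed operator supports, and the `apiVersion` and annotations vary by Strimzi release.

```yaml
# Hypothetical example: dedicated controller and broker pools for one KRaft cluster.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: controller
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - controller                 # KRaft metadata quorum only
  storage:
    type: persistent-claim
    size: 20Gi
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: broker
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3                    # scale out by raising this value
  roles:
    - broker
  storage:
    type: persistent-claim
    size: 100Gi
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  namespace: kafka
  annotations:
    # Needed on older Strimzi releases; newer releases run KRaft with node pools by default.
    strimzi.io/kraft: enabled
    strimzi.io/node-pools: enabled
spec:
  kafka:
    version: 3.9.0               # placeholder; an upgrade is a change to this field
    listeners:
      - name: tls
        port: 9093
        type: internal
        tls: true
        authentication:
          type: scram-sha-512
    authorization:
      type: simple               # enables ACLs managed through KafkaUser
    rack:
      topologyKey: topology.kubernetes.io/zone   # spread replicas across zones
    config:
      default.replication.factor: 3
      min.insync.replicas: 2
  entityOperator:
    topicOperator: {}
    userOperator: {}
  cruiseControl: {}
```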

What Is Not Included

  • Serverless Kafka (no consumption-based model)
  • Native autoscaling (manual scaling via CR changes)
  • Built-in schema registry (planned Q4 2026 as add-on; external deployment possible now)
  • Unlimited connector support (curated connector catalog)
  • Managed runtime for Kafka Streams applications

Four Pillars

GitOps-Managed Lifecycle

  • All Kafka resources (clusters, topics, users, ACLs) declared in Git
  • FluxCD reconciles changes with SOPS decryption at apply time
  • Drift detection flags manual modifications
  • Rollback = revert Git commit
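
To make the GitOps flow concrete, here is a hedged sketch of a Flux Kustomization that decrypts SOPS secrets at apply time, followed by a topic declared in the same repository. The repository path, the `sops-age` Secret, and the topic name `orders` are assumptions for illustration only.

```yaml
# Hypothetical example: Flux applies the Kafka overlay and decrypts SOPS-encrypted files.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: kafka
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/example/kafka   # assumed overlay path
  prune: true                      # resources removed from Git are removed from the cluster
  sourceRef:
    kind: GitRepository
    name: flux-system
  decryption:
    provider: sops
    secretRef:
      name: sops-age               # assumed Secret holding the decryption key
---
# A topic managed by the Strimzi Topic Operator.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: orders
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 6
  replicas: 3
  config:
    retention.ms: 604800000        # 7 days
    min.insync.replicas: 2
```

Reverting the commit that changed either resource causes Flux to reconcile the cluster back to the previous declaration on its next interval.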

Observability From Boot

  • Prometheus exporters (JMX, Kafka Exporter) deployed with every cluster
  • Pre-built Grafana dashboards for broker health, topic throughput, consumer lag
  • Alerting rules for: under-replicated partitions, ISR shrink, disk usage, no active controller
  • Loki integration for operator and broker logs
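
As an illustration of the alerting rules listed above, the following sketch shows one possible rule for under-replicated partitions using the Prometheus Operator's `PrometheusRule` resource. The metric name follows the naming used in Strimzi's example JMX exporter configuration and is an assumption here; the actual name depends on the exporter rules deployed with the cluster.

```yaml
# Hypothetical example: alert when any partition is under-replicated for 5 minutes.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kafka-alerts
  namespace: kafka                 # assumed namespace
spec:
  groups:
    - name: kafka
      rules:
        - alert: KafkaUnderReplicatedPartitions
          # Metric name assumed from Strimzi's example metrics configuration.
          expr: sum(kafka_server_replicamanager_underreplicatedpartitions) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: Kafka has under-replicated partitions
```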

Security by Default

  • Inter-broker TLS (cert-manager issued certificates)
  • Client authentication: SASL/SCRAM, mTLS, OAuth2/OIDC (Keycloak)
  • Topic-level ACLs via KafkaUser CR
  • Per-user quotas (produce/consume byte rates, connection limits)
  • NetworkPolicies restricting access to Kafka namespace
  • Secrets SOPS-encrypted in Git, decrypted at reconciliation time
  • Cosign signatures for operator images; SBOM in SPDX-JSON format
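
A minimal sketch of a KafkaUser that combines SCRAM authentication, topic-level ACLs, and byte-rate quotas. The user, topic, and consumer group names are hypothetical, and the quota values are arbitrary examples rather than recommended settings.

```yaml
# Hypothetical example: a consumer identity limited to one topic and one group.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: orders-service
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: scram-sha-512            # the User Operator generates the password Secret
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: orders
          patternType: literal
        operations:
          - Read
          - Describe
      - resource:
          type: group
          name: orders-service
          patternType: literal
        operations:
          - Read
  quotas:
    producerByteRate: 1048576      # 1 MiB/s
    consumerByteRate: 2097152      # 2 MiB/s
```

The cluster must have an authorizer enabled (for example `authorization.type: simple` in the Kafka CR) for the ACLs to take effect.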

Operational Resilience

  • Topic configuration backup
  • Partition reassignment tooling (Cruise Control)
  • Tested recovery runbooks for common failure modes
  • Velero integration for disaster recovery of operator state
  • Change windows and maintenance workflow support

Strimzi Ecosystem Components

| Component | Purpose |
| --- | --- |
| Strimzi Kafka Operator | Cluster, Topic, User, and Entity operators |
| Kafka Access Operator | Service Binding-style Secrets for application connectivity |
| Kafka Bridge | HTTP 1.1 REST API for produce/consume |
| MQTT Bridge | One-way MQTT 3.1.1 ingestion to Kafka topics |
| Drain Cleaner | Admission webhook for safe node draining during maintenance |
| Kafka OAuth | OAuth2/OIDC authentication with Keycloak integration |
| Quotas Plugin | Aggregate broker quotas, storage-aware throttling |
| Config Provider | Kubernetes Secret/ConfigMap integration for Kafka configuration |

Capability Matrix

| Capability | Support Level | Notes |
| --- | --- | --- |
| Kafka broker provisioning | Full | Multi-broker, KRaft, node pools |
| High availability | Full | Rack awareness, min.insync.replicas |
| Encryption in transit | Full | TLS for inter-broker and client connections |
| Authentication | Full | SASL/SCRAM, mTLS, OAuth2/OIDC |
| Authorization (ACLs) | Full | KafkaUser CR with User Operator |
| Kafka Connect | Full | OCI plugin management, connector lifecycle |
| Cross-cluster replication | Full | MirrorMaker 2 |
| Observability | Full | Prometheus + Grafana + Alertmanager + Loki |
| Infrastructure as Code | Full | CRDs in Git, FluxCD reconciliation |
| Air-gap deployment | Full | All images mirrorable, signed packages |
| Operational runbooks | Full | Documented day-2 procedures |
| Topic/user management | Full | KafkaTopic and KafkaUser CRDs |
| HTTP Bridge | Full | REST API for non-native clients |
| MQTT Bridge | Full | Device ingestion via MQTT 3.1.1 |
| Cluster topology options | ⚠️ Partial | Single-node, production, stretch (multi-AZ) |
| Autoscaling | ⚠️ Partial | Manual scaling; Cruise Control for rebalancing |
| SLA eligibility | ⚠️ Partial | Limited availability; GA SLA pending |
| Incident response | ⚠️ Partial | Runbooks available; 24/7 coverage in progress |
| Schema Registry | Not included | Planned Q4 2026; external deployment possible |

Target Audience

  • Platform engineers — deploy and operate Kafka clusters
  • Application developers — consume Kafka via topics, users, and ACLs declared in Git
  • Data engineers — use Kafka Connect and MirrorMaker 2 for data integration

Further Reading