
Schema Registry (Planned)

Planned — Q4 2026

Schema Registry is on the committed roadmap for Q4 2026 as a Kafka add-on. It requires the Streaming service (Managed Kafka) to be operational.

Purpose: Explains the planned Schema Registry service for platform engineers and data engineers: scope, compatibility modes, integration with Kafka, and deployment model.

Overview

Schema Registry provides schema versioning and compatibility enforcement for Kafka topics. It prevents breaking schema changes from propagating through streaming pipelines by validating each new schema version against the configured compatibility rules before a producer can register and use it.

The registry is deployed as an add-on to the existing Kafka infrastructure in the data-kafka namespace.

Planned Scope

| Capability | Description |
| --- | --- |
| Schema registration | Register and version schemas for Kafka topics |
| Compatibility enforcement | Validate new schema versions against compatibility rules |
| Format support | Avro, Protobuf, JSON Schema |
| Subject strategies | TopicName, RecordName, TopicRecordName |
| REST API | Standard Schema Registry API for client integration |
| GitOps managed | Registry deployment and configuration via FluxCD |
| Monitoring | Prometheus metrics for schema registrations, compatibility checks |
| TLS | cert-manager issued certificates |
| Authentication | Integrated with Keycloak (OAuth2/OIDC) |
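
To make the subject strategies and the REST API rows concrete, here is a minimal sketch of how a client might derive a subject name and build a registration request. It assumes the planned service follows the widely used Confluent-compatible conventions (the `-key`/`-value` suffix, the `/subjects/{subject}/versions` path, and the `application/vnd.schemaregistry.v1+json` media type); the source does not confirm any of these. The function names and the registry URL in the usage note are hypothetical.

```python
import json
from urllib.request import Request

# Assumption: Confluent-compatible media type; not stated in the source.
CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def subject_name(strategy: str, topic: str, record_name: str = "",
                 is_key: bool = False) -> str:
    """Derive the registry subject for a message under a given strategy."""
    if strategy == "TopicName":
        # One subject per topic and per key/value position.
        return f"{topic}-{'key' if is_key else 'value'}"
    if not record_name:
        raise ValueError(f"{strategy} requires a record name")
    if strategy == "RecordName":
        # One subject per record type, shared across topics.
        return record_name
    if strategy == "TopicRecordName":
        # One subject per (topic, record type) pair.
        return f"{topic}-{record_name}"
    raise ValueError(f"unknown subject strategy: {strategy}")


def build_register_request(base_url: str, subject: str, schema: dict,
                           schema_type: str = "AVRO") -> Request:
    """Build (but do not send) a request that registers a new schema version."""
    body = json.dumps({
        "schema": json.dumps(schema),  # the schema travels as an escaped JSON string
        "schemaType": schema_type,
    }).encode("utf-8")
    return Request(
        url=f"{base_url}/subjects/{subject}/versions",
        data=body,
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )
```

For example, `subject_name("TopicName", "orders")` yields `orders-value`, and `build_register_request("http://registry.example:8081", "orders-value", {"type": "string"})` returns a request object that can be passed to `urllib.request.urlopen` once a registry exists. The request is only built here, never sent, and authentication against Keycloak is left out of the sketch.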

Compatibility Modes

| Mode | Rule | Use Case |
| --- | --- | --- |
| BACKWARD | New schema can read data written by previous schema | Default; safe for consumer upgrades |
| FORWARD | Previous schema can read data written by new schema | Safe for producer upgrades |
| FULL | Both backward and forward compatible | Strictest; safe for independent upgrades |
| NONE | No compatibility checking | Development only |
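
The rules in the table can be illustrated with a deliberately simplified model. The sketch below treats a schema as a flat mapping from field name to a type name and a "has default" flag, which is an assumption made for illustration only. Real Avro, Protobuf, and JSON Schema compatibility checks also cover type promotion, nesting, unions, and more. The names `can_read` and `is_compatible` are hypothetical and are not part of any registry API.

```python
from typing import Dict, Tuple

# Simplified schema model (assumption): field name -> (type name, has_default)
Schema = Dict[str, Tuple[str, bool]]


def can_read(reader: Schema, writer: Schema) -> bool:
    """Return True if data written with `writer` can be decoded using `reader`."""
    for name, (field_type, has_default) in reader.items():
        if name in writer:
            if writer[name][0] != field_type:
                return False  # type changed; this sketch has no promotion rules
        elif not has_default:
            return False  # reader needs a field the writer never wrote
    return True  # extra writer fields are simply ignored by the reader


def is_compatible(mode: str, new: Schema, previous: Schema) -> bool:
    """Apply one compatibility mode to a (new, previous) schema pair."""
    if mode == "NONE":
        return True
    if mode == "BACKWARD":
        return can_read(reader=new, writer=previous)
    if mode == "FORWARD":
        return can_read(reader=previous, writer=new)
    if mode == "FULL":
        return can_read(new, previous) and can_read(previous, new)
    raise ValueError(f"unknown compatibility mode: {mode}")
```

Under this model, adding a field with a default passes FULL, while adding a field without a default passes FORWARD but fails BACKWARD, because a consumer on the new schema could not fill that field when reading older data.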

Kafka Integration

Schema Registry integrates with the Kafka ecosystem at multiple points:

  • Producers: serialize messages using registered schemas; the registry validates compatibility before accepting a new schema version
  • Consumers: deserialize messages using the schema ID embedded in the message header
  • Kafka Connect: converters use the registry for source/sink schema resolution
  • CDC (Debezium): registers database table schemas automatically on capture
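
The producer and consumer bullets rely on every message carrying a schema ID. The source does not specify whether that ID travels in a Kafka record header or in a prefix of the message value. The sketch below assumes the latter, using the Confluent-style framing of one magic byte (0) followed by a 4-byte big-endian schema ID and then the serialized payload. The function names `frame` and `unframe` are hypothetical.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0
# Assumed layout: 1-byte magic + 4-byte big-endian unsigned schema ID (5 bytes total).
_PREFIX = struct.Struct(">bI")


def frame(schema_id: int, payload: bytes) -> bytes:
    """Producer side: prepend the schema ID to an already serialized payload."""
    return _PREFIX.pack(MAGIC_BYTE, schema_id) + payload


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Consumer side: split a framed message into (schema ID, payload)."""
    if len(message) < _PREFIX.size:
        raise ValueError("message too short to contain a schema ID prefix")
    magic, schema_id = _PREFIX.unpack_from(message)
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[_PREFIX.size:]
```

A consumer would call `unframe` on the raw message value, look up the returned schema ID in the registry (typically caching the result), and then decode the remaining bytes with that schema.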

Deployment Model

  • Deployed in data-kafka namespace alongside Kafka operator
  • Backed by Kafka topic for schema storage (internal topic _schemas)
  • HA mode with multiple replicas behind a Service
  • TLS for client and inter-instance communication
  • Access control integrated with Keycloak

Dependencies

| Dependency | Required | Notes |
| --- | --- | --- |
| Managed Kafka | Yes | Backend storage (internal Kafka topic) + integration target |
| cert-manager | Yes | TLS certificates |
| Keycloak | Recommended | Authentication for registry API |
| kube-prometheus-stack | Yes | Metrics and alerting |
| FluxCD | Yes | GitOps deployment lifecycle |

What Is Not Included

  • Data catalog / metadata management (separate concern; OpenMetadata on long-term roadmap)
  • Schema generation from database DDL (CDC handles this via Debezium)
  • Schema migration tooling (application responsibility)
  • Multi-cluster schema federation (single registry per Kafka cluster)

Further Reading