Data Flow: Observability
Purpose: Explains, for platform engineers, how telemetry data (metrics, logs, traces) flows from workloads to Grafana.
Flow Summary
Components
| Component | Namespace | Signal | Role |
|---|---|---|---|
| OpenTelemetry Collector | observability | All | Receives OTLP data from apps, forwards to Kafka |
| Kafka | observability | All | Event streaming buffer between collectors and backends |
| Prometheus | observability | Metrics | Stores time-series data, evaluates alerting rules |
| Alertmanager | observability | Alerts | Receives alerts fired by Prometheus; deduplicates, groups, and routes notifications |
| kube-state-metrics | observability | Metrics | Exposes Kubernetes object state as Prometheus metrics |
| Promtail | observability | Logs | DaemonSet that tails container log files |
| Loki | observability | Logs | Log aggregation with label-based indexing |
| Tempo | observability | Traces | Distributed trace storage |
| Grafana | observability | All | Unified query and visualization across all signals |
Sequence
Metrics
- Applications expose a `/metrics` endpoint (Prometheus format) or emit OTLP metrics.
- OpenTelemetry Collector scrapes or receives metrics.
- Collector exports to Kafka topics.
- Prometheus consumes from Kafka and stores time-series.
- Prometheus evaluates the alerting rules defined in `PrometheusRule` resources and sends firing alerts to Alertmanager, which routes notifications.
- Grafana queries Prometheus via PromQL.
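To make the first step concrete, the sketch below renders a counter in the Prometheus text exposition format, which is what a scrape of the metrics endpoint returns. It is an illustration only: the function names `render_counter` and `escape_label_value` and the metric `http_requests_total` are hypothetical, and a real application would normally use a Prometheus client library or the OpenTelemetry SDK rather than formatting text by hand.

```python
from typing import Dict, Iterable, Tuple, Union

Number = Union[int, float]


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(
    name: str,
    help_text: str,
    samples: Iterable[Tuple[Dict[str, str], Number]],
) -> str:
    """Render one counter family as Prometheus text exposition lines.

    `samples` is a sequence of (labels, value) pairs. Labels are sorted by
    name so the output is deterministic.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric and labels, for illustration only.
    print(
        render_counter(
            "http_requests_total",
            "Total HTTP requests.",
            [({"method": "GET", "code": "200"}, 3)],
        ),
        end="",
    )
```

Each sample line carries the metric name, an optional label set, and the current value; the scraper adds the timestamp at scrape time unless one is supplied.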
Logs
- Containers write to stdout/stderr.
- Promtail (DaemonSet) tails log files, attaches Kubernetes labels.
- Promtail pushes log streams to Kafka.
- Loki consumes from Kafka and indexes logs.
- Grafana queries Loki via LogQL.
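The label-attachment step can be sketched as follows. This is a simplified illustration, not how Promtail is implemented: it assumes the kubelet's pod log layout `/var/log/pods/<namespace>_<pod>_<uid>/<container>/<n>.log` and derives labels from the path alone, whereas Promtail typically obtains labels through Kubernetes service discovery and relabeling. The names `labels_from_path` and `to_log_entry`, and the shape of the returned entry, are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumed kubelet layout: /var/log/pods/<namespace>_<pod>_<uid>/<container>/<n>.log
POD_LOG_PATH = re.compile(
    r"^/var/log/pods/"
    r"(?P<namespace>[^_/]+)_(?P<pod>[^_/]+)_(?P<uid>[^/]+)/"
    r"(?P<container>[^/]+)/\d+\.log$"
)


def labels_from_path(path: str) -> Optional[Dict[str, str]]:
    """Derive namespace, pod, and container labels from a pod log path."""
    match = POD_LOG_PATH.match(path)
    if match is None:
        return None
    return {
        "namespace": match.group("namespace"),
        "pod": match.group("pod"),
        "container": match.group("container"),
    }


def to_log_entry(path: str, line: str, timestamp_ns: int) -> Optional[dict]:
    """Pair one raw log line with its labels (hypothetical entry shape)."""
    labels = labels_from_path(path)
    if labels is None:
        return None
    return {
        "labels": labels,
        "timestamp_ns": timestamp_ns,
        "line": line.rstrip("\n"),
    }


if __name__ == "__main__":
    # Hypothetical pod path and log line.
    example = "/var/log/pods/shop_cart-7d9f_1234-abcd/cart/0.log"
    print(to_log_entry(example, "order accepted\n", 1700000000000000000))
```

Keeping the label set small (namespace, pod, container) matters because Loki indexes by labels rather than by log content, so each distinct label combination becomes its own stream.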
Traces
- Applications instrumented with OpenTelemetry SDK export spans via OTLP.
- OpenTelemetry Collector receives, batches, and exports to Kafka.
- Tempo consumes from Kafka and stores traces.
- Grafana queries Tempo via TraceQL.
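The "receives, batches, and exports" step can be illustrated with a minimal size-based batcher. This is a sketch under stated assumptions, not the Collector's implementation: `SpanBatcher`, its `export` callback, and the span dictionaries are hypothetical, and the callback stands in for a Kafka producer. The Collector's real batch processor also flushes on a timeout, which is omitted here for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SpanBatcher:
    """Buffer spans and hand them to `export` in fixed-size batches."""

    export: Callable[[List[dict]], None]  # stand-in for a Kafka producer call
    max_batch_size: int = 3
    _buffer: List[dict] = field(default_factory=list)

    def add(self, span: dict) -> None:
        """Buffer one span and flush once the batch is full."""
        self._buffer.append(span)
        if len(self._buffer) >= self.max_batch_size:
            self.flush()

    def flush(self) -> None:
        """Export whatever is buffered; a no-op when the buffer is empty."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self.export(batch)


if __name__ == "__main__":
    sent: List[List[dict]] = []
    batcher = SpanBatcher(export=sent.append, max_batch_size=2)
    for span_id in range(5):
        batcher.add({"span_id": span_id})  # hypothetical span payload
    batcher.flush()  # drain the remainder, as on shutdown
    print([len(batch) for batch in sent])
```

Batching trades a little latency for fewer, larger writes to Kafka; the explicit `flush` call models draining the buffer on shutdown so that trailing spans are not lost.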
Related
- Logical Diagram — full cluster architecture