OpenTelemetry provides the libraries, agents, and other components that you need to capture telemetry from your services so that you can better observe, manage, and debug them. Specifically, OpenTelemetry captures metrics, distributed traces, resource metadata, and logs (logging support is currently incubating) from your backend and client applications, then sends this data to backends such as Prometheus, Jaeger, and Zipkin for processing. OpenTelemetry is composed of the following:
- One API and SDK per language, which include the interfaces and implementations that define and create distributed traces and metrics, manage sampling and context propagation, etc.
- Language-specific integrations for popular web frameworks, storage clients, RPC libraries, and similar components that, when enabled, automatically capture relevant traces and metrics and handle context propagation
- Automatic instrumentation agents that can collect telemetry from some applications without requiring code changes
- Language-specific exporters that allow SDKs to send captured traces and metrics to any supported backend
- The OpenTelemetry Collector, which can collect data from OpenTelemetry SDKs and other sources, and then export this telemetry to any supported backend
Most OpenTelemetry components are already in beta and are proceeding to GA release candidates.
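To make the division of labor between the SDK and an exporter more concrete, here is a minimal toy sketch in plain Python. It is not the OpenTelemetry API: the `Span`, `Tracer`, and `InMemoryExporter` names are hypothetical stand-ins invented for illustration. The sketch assumes 16-byte trace IDs and 8-byte span IDs, and shows only the core idea that a child span inherits its parent's trace ID (context propagation) and that finished spans are handed to an exporter.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One timed operation within a trace (hypothetical toy type)."""
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    start: float = field(default_factory=time.time)
    end: Optional[float] = None


class InMemoryExporter:
    """Stand-in for an exporter; a real one would send spans to a backend."""

    def __init__(self):
        self.spans = []

    def export(self, span):
        self.spans.append(span)


class Tracer:
    """Stand-in for the SDK: creates spans and exports them when they end."""

    def __init__(self, exporter):
        self.exporter = exporter

    def start_span(self, name, parent=None):
        # A child reuses the parent's trace ID; a root span starts a new trace.
        trace_id = parent.trace_id if parent else secrets.token_hex(16)
        parent_id = parent.span_id if parent else None
        return Span(name, trace_id, secrets.token_hex(8), parent_id)

    def end_span(self, span):
        span.end = time.time()
        self.exporter.export(span)


exporter = InMemoryExporter()
tracer = Tracer(exporter)

root = tracer.start_span("GET /checkout")
child = tracer.start_span("SELECT orders", parent=root)
tracer.end_span(child)
tracer.end_span(root)

for s in exporter.spans:
    print(s.name, s.trace_id, s.parent_id)
```

Swapping `InMemoryExporter` for a class that writes to a different destination would not require touching the instrumentation code, which is the reason the exporter is kept separate from the tracer in this sketch.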
What is Observability?
In software, observability typically refers to telemetry produced by services and is often divided into three major verticals:
- Tracing, aka distributed tracing, provides insight into the full lifecycles, aka traces, of requests to the system, allowing you to pinpoint failures and performance issues.
- Metrics provide quantitative information about processes running inside the system, including counters, gauges, and histograms.
- Logging provides insight into application-specific messages emitted by processes.
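The counters, gauges, and histograms named in the metrics item above differ mainly in how they aggregate values. The following is a small illustrative sketch in plain Python, not the OpenTelemetry metrics API; the class names, method names, and bucket boundaries are assumptions chosen for the example.

```python
import bisect


class Counter:
    """Monotonically increasing total, e.g. requests served."""

    def __init__(self):
        self.value = 0

    def add(self, amount=1):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Point-in-time value that may rise or fall, e.g. queue depth."""

    def __init__(self):
        self.value = 0

    def set(self, value):
        self.value = value


class Histogram:
    """Counts observations per bucket, e.g. request latency in ms."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        # One bucket per upper bound (inclusive), plus a final overflow bucket.
        self.counts = [0] * (len(self.bounds) + 1)

    def record(self, value):
        self.counts[bisect.bisect_left(self.bounds, value)] += 1


requests = Counter()
queue_depth = Gauge()
latency_ms = Histogram([10, 100, 1000])

for ms in (4, 35, 120, 8, 2500):
    requests.add()
    latency_ms.record(ms)

queue_depth.set(3)
queue_depth.set(1)  # a gauge keeps only the latest value

print(requests.value, queue_depth.value, latency_ms.counts)
# prints: 5 1 [2, 1, 1, 1]
```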
These verticals are tightly interconnected. Metrics can be used to pinpoint, for example, a subset of misbehaving traces. Logs associated with those traces can then help to find the root cause of the behavior. New metrics can then be configured, based on this discovery, to catch the issue earlier next time. Other verticals exist (continuous profiling, production debugging, etc.); however, traces, metrics, and logs are the three most widely adopted across the industry.
Logging support in OpenTelemetry is still incubating, and we aim to incorporate it over time.
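The way the three verticals feed into one another can be sketched with a few hand-written records. Everything here is hypothetical: the field names, the sample data, and the 1000 ms threshold are assumptions for illustration, and the only idea taken from the text is that a metric narrows the search to a subset of traces and the logs tied to those traces point toward a root cause.

```python
from collections import defaultdict

# Hypothetical telemetry records; field names are assumptions for this sketch.
spans = [
    {"trace_id": "a1", "name": "GET /cart", "duration_ms": 40},
    {"trace_id": "b2", "name": "GET /cart", "duration_ms": 2300},
    {"trace_id": "c3", "name": "GET /cart", "duration_ms": 55},
]
logs = [
    {"trace_id": "a1", "message": "cart loaded"},
    {"trace_id": "b2", "message": "cache miss, falling back to database"},
    {"trace_id": "b2", "message": "database query timed out, retrying"},
]

SLOW_MS = 1000  # assumed latency threshold

# Step 1: a latency measurement flags the misbehaving subset of traces.
slow_trace_ids = [s["trace_id"] for s in spans if s["duration_ms"] > SLOW_MS]

# Step 2: logs that carry the same trace ID hint at the root cause.
logs_by_trace = defaultdict(list)
for entry in logs:
    logs_by_trace[entry["trace_id"]].append(entry["message"])

for trace_id in slow_trace_ids:
    print(trace_id, logs_by_trace[trace_id])
```

The shared trace ID is what makes the join possible in this sketch; without a common identifier across signals, the slow request and its log lines would have to be matched by timestamp or guesswork.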
Where can I read the OpenTelemetry specification?
The spec is available in the open-telemetry/specification repo on GitHub.
I want to help influence the future direction of cloud-native telemetry. What should I do?
Excellent! We list the best ways to get involved on our community GitHub page, including mailing lists, our Gitter channels, the community calendar, and the monthly community meeting.