Tracing the History of Distributed Tracing & OTel

May 26, 2022
4 min read
Ken Hamric
Founder
Tracetest

A history of OpenTelemetry Tracing. From Dapper in 2010 to Zipkin, Jaeger, OpenTracing & OpenCensus, see how the industry coalesced around OpenTelemetry.


Are you sitting comfortably? Then let’s begin our spooky tale, a tale of a time when no engineer could accurately know what had failed in their process, or pinpoint when things went wrong.

## What is Tracing?

To understand tracing, we need to look back into the history of cloud computing and work through some common terminology. Modern services are often implemented as complex, large-scale distributed systems. Their components can be developed and maintained by different teams, run in different locations, be written in different programming languages, and span many machines. Tracing helps us understand system behavior and detect potential issues.

At the lowest level, tracing starts with **Software Observability**: the ability to infer the internal state of a system from its external outputs. Since this is far from an exact science, we add **telemetry** to the process, capturing and measuring data to more accurately read and record what individual components are up to.

Once a system has telemetry in place, you can start observing what is happening at the system level through **Distributed Tracing**, a process of tracking a single service transaction along its journey through multiple services and components. Distributed tracing very literally means you create a trace to see what has happened at each step. In cloud-native computing, where we’re often working with distributed systems and microservice architecture, distributed traces are an essential part of day-to-day debugging and monitoring.
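To make the idea concrete, here is a minimal sketch of what a trace looks like as data. It uses only the Python standard library and is not the API of Dapper, Zipkin, Jaeger, or OpenTelemetry. The `Span`, `Tracer`, `handle_checkout`, and `render_trace` names, along with the service names, are hypothetical and exist only for illustration. The key assumption is that every step of a request records a span carrying the same trace ID plus the ID of its parent span, which is what lets a backend stitch the steps back together.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """One timed step of a request (illustrative, not a real library type)."""
    trace_id: str             # shared by every span in the same request
    span_id: str              # unique to this step
    parent_id: Optional[str]  # None for the root span
    service: str
    name: str
    start: float = field(default_factory=time.perf_counter)
    end: Optional[float] = None

    def finish(self) -> None:
        self.end = time.perf_counter()


class Tracer:
    """Collects spans in memory, standing in for a tracing backend."""

    def __init__(self) -> None:
        self.spans: List[Span] = []

    def start_span(self, service: str, name: str,
                   parent: Optional[Span] = None) -> Span:
        # A child span inherits the trace ID; a root span starts a new trace.
        trace_id = parent.trace_id if parent else secrets.token_hex(16)
        parent_id = parent.span_id if parent else None
        span = Span(trace_id, secrets.token_hex(8), parent_id, service, name)
        self.spans.append(span)
        return span


def handle_checkout(tracer: Tracer) -> Span:
    """Simulate one transaction that passes through several services."""
    root = tracer.start_span("api-gateway", "POST /checkout")
    auth = tracer.start_span("auth-service", "verify-token", parent=root)
    auth.finish()
    pay = tracer.start_span("payment-service", "charge-card", parent=root)
    db = tracer.start_span("payment-db", "INSERT payment", parent=pay)
    db.finish()
    pay.finish()
    root.finish()
    return root


def render_trace(tracer: Tracer, trace_id: str) -> List[str]:
    """Rebuild the call tree of one trace from its parent/child links."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in tracer.spans:
        if span.trace_id == trace_id:
            children.setdefault(span.parent_id, []).append(span)

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in children.get(parent_id, []):
            lines.append("  " * depth + f"{span.service}: {span.name}")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


if __name__ == "__main__":
    tracer = Tracer()
    root = handle_checkout(tracer)
    print("\n".join(render_trace(tracer, root.trace_id)))
```

In a real deployment the services run in separate processes, so the trace ID and parent span ID have to travel with each request (typically in request headers) rather than being passed as a Python object. The sketch skips that step to keep the parent/child structure easy to see.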

## Let's Trace the History of Distributed Tracing & OpenTelemetry

[Figure: The history of OpenTelemetry tracing]

#### 2010: Dapper

Way back in 2010, Google published a paper: [Dapper, a Large-Scale Distributed Systems Tracing Infrastructure](https://research.google/pubs/pub36356/). Google had used the project internally for two years before publishing and reported that: “*Dapper’s foremost measure of success has been its usefulness to developer and operations teams*.”

*“Dapper began as a self-contained tracing tool but evolved into a monitoring platform which has enabled the creation of many different services and tools, some of which were not anticipated by its designers.”*

#### 2012: Zipkin

Zipkin, inspired by Dapper, was developed by Twitter and first released as an open source project back in 2012. It’s a distributed tracing system that collects and looks up the timing data needed to troubleshoot latency problems.

#### 2014: Kubernetes

It would be wrong to leave Kubernetes out of this timeline – while it’s not a tracing tool, Kubernetes fundamentally changed computing as we know it. Kubernetes accelerated the development of cloud native projects that enable the distributed systems and microservice architectures that make tracing such an important activity. 

#### 2015: Jaeger

Three years after Zipkin, in 2015, Uber began building Jaeger, a distributed tracing system used to monitor, profile, and troubleshoot microservices. Uber released it as open source in 2017.

The project was accepted as the Cloud Native Computing Foundation’s ([CNCF](http://cncf.io/)) 12th hosted project and moved to the graduated project level (the highest one available) in 2019. 

#### 2015: OpenTracing

[OpenTracing](https://www.cncf.io/blog/2016/10/11/opentracing-joins-the-cloud-native-computing-foundation/) was accepted by CNCF as its third hosted project in October 2016 (yes, it joined before Jaeger!). It focused on making loosely coupled microservices easier to manage with consistent, expressive, vendor-neutral APIs for distributed tracing and context propagation.

#### 2017: OpenCensus

[OpenCensus](https://opencensus.io/), another project from Google, was a set of libraries for various languages that allowed you to collect application metrics and distributed traces, then transfer the data to a backend of your choice in real time. The data could then be analyzed to check application health and debug problems.

#### 2019: OpenTelemetry

OpenCensus and OpenTracing merged to form [OpenTelemetry](https://opentelemetry.io/) in 2019. OpenTelemetry provides a single, well-supported integration surface for end-to-end distributed tracing telemetry. In 2021, it [released v1.0.0 of its specification](https://medium.com/opentelemetry/opentelemetry-specification-v1-0-0-tracing-edition-72dd08936978), offering stability guarantees for the tracing portion of clients.

OpenTelemetry is now a fast-growing incubating project at CNCF, with more than 340 forks of the project on GitHub and almost 230 contributors. 

Meanwhile the [OpenTracing](https://opentracing.io/) project has been archived by the Cloud Native Computing Foundation. The maintainers [said at the time](https://medium.com/opentracing/opentracing-has-been-archived-fb2848cfef67): 

*“Archiving OpenTracing was always the project maintainers intention following the merger of OpenTracing & OpenCensus into OpenTelemetry. As OpenTelemetry has reached incubation, OpenTracing is proposed as an archived project as the previous iteration of OpenTelemetry, which should help avoid any end user confusion.”*

As far back as 2010, engineers saw the benefit of basing tests on the rich information provided by a complete, well-instrumented distributed trace. Up until now, however, this value has been hard to leverage. Because we’re a practical bunch, we’re working on something special: a tool that will enable you to increase your test coverage and let non-technical folks write their own specifications. And we want to help you run it as part of your CI/CD GitOps pipeline, verifying system interoperability before pushing code to production. Basically, we want you to get more from your investment in tracing.

That’s why we’re working on [Tracetest](https://tracetest.kubeshop.io/).
Tracetest unlocks this value, letting you use your OpenTelemetry trace data to build both integration tests and complex end-to-end tests, easily.

We’re keen to hear your feedback as we continue to build Tracetest, so let us know what you think. You can [get started here](https://github.com/kubeshop/tracetest) and chat to us anytime on our [Slack channel](https://dub.sh/tracetest-community).