Understanding system failures traditionally starts with looking at a single component in isolation. In a distributed services architecture, however, this approach does not provide enough information: an end-user request can traverse dozens of components, so a new approach is needed.
In this talk we’ll look at distributed tracing, which summarizes and contextualizes all sides of the story in a well-scoped, shared timeline. We’ll also look at distributed tracing tools such as Zipkin, which highlight the relationships between components, from the very top of the stack to the deepest aspects of the system.
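To make the idea of a shared timeline concrete, here is a minimal sketch of how spans reported by different components might be stitched into one view of a single request. Everything in it is an assumption for illustration: the `Span` class, the `render_timeline` helper, and the service names are hypothetical, and the fields are only loosely modeled on the trace, span, and parent identifiers that tracers commonly record. It is not Zipkin's API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """One timed operation in one component (hypothetical model)."""
    trace_id: str             # shared by every span of the same request
    span_id: str              # unique within the trace
    parent_id: Optional[str]  # None for the root span
    service: str
    name: str
    start_ms: int             # offset from the start of the request
    duration_ms: int


def render_timeline(spans: List[Span], trace_id: str) -> List[str]:
    """Return one indented line per span of the trace, parents before children."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        if span.trace_id == trace_id:
            children.setdefault(span.parent_id, []).append(span)

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        # Siblings are ordered by start time so the output reads as a timeline.
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            lines.append(
                f"{'  ' * depth}{span.service}: {span.name} "
                f"[{span.start_ms}ms +{span.duration_ms}ms]"
            )
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# Spans arrive out of order and mixed with other traces, as they would
# when each component reports independently (illustrative data only).
SPANS = [
    Span("t1", "a", None, "frontend", "GET /checkout", 0, 120),
    Span("t1", "c", "b", "database", "SELECT cart", 15, 30),
    Span("t1", "b", "a", "cart-service", "load cart", 10, 50),
    Span("t1", "d", "a", "payment-service", "charge", 65, 40),
    Span("t2", "x", None, "frontend", "GET /health", 5, 1),
]

if __name__ == "__main__":
    for line in render_timeline(SPANS, "t1"):
        print(line)
```

The shared trace identifier is what scopes the timeline to one request, and the parent identifier is what exposes the relationship between components from the top of the stack down.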