
From Metrics to Root Cause

How to move from metrics and alerts to the actual root cause of production incidents using structured observability: metrics, logs, and traces.

WDI | Poland | 45 minutes
Audience: Backend engineers, SREs, Platform engineers, DevOps engineers
Tags: observability, debugging, production, metrics

Observability is not a dashboard. It's a diagnostic process.

This talk explores how to move from "something is wrong" to "here's the fix" using a systematic approach to debugging production systems.

Abstract

Every engineer has been there: alerts fire and dashboards show anomalies, but finding the actual root cause feels like searching for a needle in a haystack. We collect terabytes of metrics, logs, and traces, yet debugging still comes down to guesswork.

This talk presents a structured approach to production debugging that turns observability data into actionable insights. We'll explore:

  • Why dashboards alone aren't enough
  • The three questions every debugging session should answer
  • How to correlate signals across metrics, logs, and traces
  • Real-world examples of debugging complex distributed systems
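
As a rough illustration of what correlating signals can look like in practice, the sketch below joins error logs to slow trace spans through a shared trace ID. It is not taken from the talk: the `Span` and `LogLine` types, their field names, and the `correlate` helper are hypothetical, and it assumes your logs already carry the trace ID of the request that produced them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical trace span: one unit of work inside a request."""
    trace_id: str
    service: str
    duration_ms: float


@dataclass
class LogLine:
    """Hypothetical structured log record that carries a trace ID."""
    trace_id: str
    level: str
    message: str


def correlate(slow_threshold_ms, spans, logs):
    """Attach ERROR log messages to the slow spans sharing their trace ID."""
    # Index error messages by trace ID so each lookup is O(1).
    errors_by_trace = defaultdict(list)
    for line in logs:
        if line.level == "ERROR":
            errors_by_trace[line.trace_id].append(line.message)

    # Keep only traces that contain at least one slow span.
    suspects = {}
    for span in spans:
        if span.duration_ms >= slow_threshold_ms:
            entry = suspects.setdefault(
                span.trace_id,
                {"services": [], "errors": errors_by_trace.get(span.trace_id, [])},
            )
            entry["services"].append(span.service)
    return suspects


spans = [
    Span("t1", "checkout", 1200.0),
    Span("t2", "checkout", 40.0),
    Span("t1", "payments", 1100.0),
]
logs = [
    LogLine("t1", "ERROR", "db timeout"),
    LogLine("t2", "INFO", "ok"),
]
suspects = correlate(500, spans, logs)
```

A latency alert (the metric) says something is slow; the slow spans say where; the error logs on the same trace suggest why. Real observability backends perform this join for you, but the underlying idea is the same shared identifier.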

What You'll Learn

By the end of this talk, you'll have a mental framework for approaching any production incident, along with practical techniques for using your observability stack more effectively.

Outline

  1. The Problem with Dashboards - Why visualization isn't investigation
  2. The Diagnostic Mindset - Thinking like a detective
  3. Signal Correlation - Connecting metrics, logs, and traces
  4. Case Studies - Real debugging sessions from production systems
  5. Building Better Alerts - From symptoms to causes

Target Audience

This talk is for anyone who has ever stared at a dashboard wondering "why is this happening?" Whether you're debugging your first production incident or your hundredth, you'll find practical techniques to add to your toolkit.
