Road to four nines

A walkthrough of what it takes to reach 99.99% uptime (four nines). This talk covers reliability engineering practices, monitoring strategies, incident response, and the architectural decisions that let a system achieve and maintain that level of availability.
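
To make the 99.99% target concrete, the arithmetic below converts an availability figure into an allowed-downtime budget. This is a minimal sketch and not part of the talk: the function name `downtime_budget_minutes` and the 365-day year are assumptions chosen for illustration.

```python
def downtime_budget_minutes(availability: float, period_days: float = 365.0) -> float:
    """Return the downtime (in minutes) allowed over `period_days`.

    `availability` is a fraction, e.g. 0.9999 for four nines.
    Assumes a 365-day year by default (an illustrative choice).
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    minutes_in_period = period_days * 24 * 60
    return (1.0 - availability) * minutes_in_period


if __name__ == "__main__":
    # Compare the budget for two, three, and four nines.
    for label, target in [("99%", 0.99), ("99.9%", 0.999), ("99.99%", 0.9999)]:
        per_year = downtime_budget_minutes(target)
        per_month = downtime_budget_minutes(target, period_days=30)
        print(f"{label}: {per_year:.2f} min/year, {per_month:.2f} min per 30 days")
```

At four nines the budget works out to roughly 52.56 minutes per year, or about 4.32 minutes per 30 days, which leaves very little room for manual intervention during an incident.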

Ilya Kaznacheev

November 27, 2025

Transcript

  1. • repeatable builds • automated deployment • autonomous rollbacks • dependency vendoring (local storage) • security and vulnerability monitoring • infrastructure automation (IaC)
  2. • teams and responsibilities • ITSM support system • escalation workflow • scripts / knowledge base • alerts, monitoring, root-cause analysis • on-call rotation
  3. • redundancy • load management • data replication • fault tolerance • self-healing • disaster tolerance • failover & recovery • CI/CD & supply chain
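
The redundancy item on the last slide can be illustrated with the standard availability formulas for components in parallel and in series. This is a hedged sketch, not taken from the slides: it assumes component failures are statistically independent, which real systems often violate (shared power, network, or deployment pipeline), and the function names are hypothetical.

```python
from math import prod
from typing import Iterable


def parallel_availability(component: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies where any one suffices.

    Assumes independent failures (an idealisation): 1 - (1 - a)^n.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - component) ** replicas


def serial_availability(components: Iterable[float]) -> float:
    """Availability of a chain where every component must be up.

    Assumes independent failures: the product of the availabilities.
    """
    return prod(components)


if __name__ == "__main__":
    # Two independent replicas at 99% each reach roughly four nines together.
    print(f"{parallel_availability(0.99, 2):.6f}")
    # Chaining two four-nines services drops the total below four nines.
    print(f"{serial_availability([0.9999, 0.9999]):.8f}")
```

Under these assumptions, redundancy raises availability quickly, while every additional hard dependency in the request path lowers it, which is one reason the slide pairs redundancy with fault tolerance and failover.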