As presented at re-deploy 2019.
https://re-deploy.io/2019/speakers/#adrian-hornsby
Abstract:
Mistakes. Bad judgment. Errors. Failures. They are all part of our engineering lives. While many think of them as undesirable aspects of engineering, failures are very important, and even beneficial. What is certain is that failures will happen and will come in many forms, some expected and some unexpected. It’s therefore important to embrace failure. The question is how to limit its blast radius. In this talk, I will discuss a range of blast radius reduction design techniques used at AWS and by our customers, including isolation, bulkheads, cells, and sharding. I will also discuss how embracing failure impacts our operational practices.