It's 3:37am. Your phone starts buzzing. It doesn't stop. Thousands of alerts. All the things are broken. Where do you even begin?
You freeze.
The infrastructures we operate are increasingly complex and nuanced. Events at one edge can have unintended and unpredictable effects at the other, often with no obvious causal relationship. This makes debugging failures hard.
Good alert design is important for lowering MTTR when our complex infrastructures fail, but what constitutes a "good alert"? Our brains work in unexpected ways: cognitive biases and priming skew our perception of reality. It's vitally important to understand how we think and react under pressure when designing alerts and communicating failure.
In this talk, Lindsay will showcase some of the psychological underpinnings you should take into account when designing your alerts, how other industries handle alert design, and what tools are available today to increase your operational effectiveness in the face of massive failures.
Sources used to create this talk:
- http://www.columbiadisaster.info/images/foam_debris_548x627.jpg
- http://upload.wikimedia.org/wikipedia/commons/9/95/Impact-test.jpg
- http://www.youtube.com/watch?v=94J9oVeST0k
- http://www.youtube.com/watch?v=1oBTzbKx0jo
- http://www.flickr.com/photos/frostnova/2268471558
- http://www.flickr.com/photos/buttim/1297081125
- http://www.flickr.com/photos/gsairpics/8318261080
- http://upload.wikimedia.org/wikipedia/commons/thumb/0/0d/Map_Tenerife_Disaster_EN.svg/2000px-Map_Tenerife_Disaster_EN.svg.png
- http://i1.ytimg.com/vi/LSPkRMbyrGc/maxresdefault.jpg
- http://awesomestories.com/images/user/9add18ae4d.jpg
- http://library.mpib-berlin.mpg.de/ft/rh/RH_Fluency_2008.pdf
- http://www.theatlanticwire.com/global/2012/07/final-air-france-447-report-pilots-misunderstood-their-situation/54209/
- http://www.dailymail.co.uk/news/article-2020136/Pierre-Cedric-Bonin-David-Robert-blamed-Atlantic-Ocean-Air-France-crash-killed-228.html
- http://edition.cnn.com/2012/07/05/world/europe/france-air-crash-report/index.html
- http://www.newscientist.com/blogs/onepercent/2012/07/af447-final-report.html
- http://gizmodo.com/5923866/air-france-447-crash-a-result-of-crew-ignoring-alarms
- http://www.flightglobal.com/news/articles/af447-inquiry-grapples-with-stall-warning-enigma-373857/
- http://www.anesthesia-analgesia.org/content/112/1/78.long
- http://www.used-equipment-medical.com/th_sogemed/medias/big/moniteur-drager-kappa-xlt-infinity.jpg
- http://img.medicalexpo.com/pdf/repository_me/68268/zeus-infinity-empowered-83059_5b.jpg
- http://www.flickr.com/photos/quinnanya/5646121120
- http://www.flickr.com/photos/digital-noise/3650559857
- http://en.wikipedia.org/wiki/File:Arterial_kateter.jpg
- http://drugline.org/img/term/venous-catheter-central-15887_1.jpg
- http://riemann.io/howto.html#group-events-in-time