are practitioners of Chaos Engineering. We break things on purpose!
• We build software that helps engineers build more reliable systems through failure injection.
gremlin.com @tammybutow #GoogleNext18
FULL-STACK CHAOS ENGINEERING
of your stack to increase system resilience.
• Injecting failure will also train your engineering teams for on-call.
• Include engineers, engineering managers, designers, PMs, TPMs, VPs and more!
API APPLICATION CACHING DATABASE OPERATING SYSTEM HARDWARE RACK NETWORK / POWER
CONTROLLED CHAOS ENGINEERING
ways.
• Be proactive and break them first, on purpose, with controlled chaos.
• Advanced Chaos Engineering involves running chaos experiments as part of CI/CD.
HOW TO RUN A CHAOS EXPERIMENT
3. Consider the blast radius
4. Run your Chaos Engineering experiment
5. Measure the results of your experiment
6. Find & fix issues or scale the experiment
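The visible steps (limit the blast radius, run, measure, then fix or scale) can be sketched as a small loop. This is only an illustration, not the speaker's tooling: `run_experiment`, `ExperimentResult`, and the `inject`, `halt`, and `measure` callbacks are hypothetical names, and the abort threshold is an assumed safety check.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ExperimentResult:
    hosts_attacked: List[str]
    baseline: float
    observed: float
    aborted: bool


def run_experiment(
    hosts: List[str],
    blast_radius: float,              # fraction of hosts to attack, 0 < r <= 1
    inject: Callable[[str], None],    # hypothetical: start the failure on a host
    halt: Callable[[str], None],      # hypothetical: stop the failure on a host
    measure: Callable[[], float],     # hypothetical: read one health metric
    abort_threshold: float,           # assumed: stop once the metric exceeds this
) -> ExperimentResult:
    if not 0 < blast_radius <= 1:
        raise ValueError("blast_radius must be in (0, 1]")

    # Step 3: limit the blast radius (rounded down, but at least one host).
    targets = hosts[: max(1, int(len(hosts) * blast_radius))]

    baseline = measure()
    observed = baseline
    attacked: List[str] = []
    aborted = False
    try:
        # Step 4: run the experiment one host at a time.
        for host in targets:
            inject(host)
            attacked.append(host)
            # Step 5: measure after every injection.
            observed = measure()
            if observed > abort_threshold:
                aborted = True
                break
    finally:
        # Always roll back, even when an injection raises.
        for host in attacked:
            halt(host)

    # Step 6 is left to the caller: fix what was found, or widen blast_radius.
    return ExperimentResult(attacked, baseline, observed, aborted)
```

A caller would start with a small `blast_radius` and raise it only after a run finishes with `aborted` set to `False`.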
HOW TO CHOOSE A CHAOS EXPERIMENT
Choose one of these services (e.g. Kafka)
3. Whiteboard the service with your team
4. Select the experiment: resource/state/network
5. Determine the scope: number of machines/impact
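One way to make these choices explicit is to write them down as a small plan object before running anything. The sketch below is an assumption about how that could look; `ExperimentKind`, `ExperimentPlan`, and its fields are hypothetical names, with the three kinds taken from the slide.

```python
from dataclasses import dataclass
from enum import Enum


class ExperimentKind(Enum):
    RESOURCE = "resource"
    STATE = "state"
    NETWORK = "network"


@dataclass(frozen=True)
class ExperimentPlan:
    service: str          # the service chosen, e.g. "kafka"
    kind: ExperimentKind  # resource, state, or network
    machines: int         # scope: how many machines to attack
    fleet_size: int       # total machines running the service

    def __post_init__(self) -> None:
        if self.fleet_size < 1:
            raise ValueError("fleet_size must be at least 1")
        if not 1 <= self.machines <= self.fleet_size:
            raise ValueError("machines must be between 1 and fleet_size")

    @property
    def blast_radius(self) -> float:
        """Fraction of the fleet that the experiment touches."""
        return self.machines / self.fleet_size
```

For example, `ExperimentPlan("kafka", ExperimentKind.NETWORK, machines=1, fleet_size=4)` describes a network experiment against a quarter of a four-node cluster.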
RESOURCE CHAOS ENGINEERING
IO consumption
• Useful for catching problems before they turn into high-severity incidents and downtime for customers.
• Chaos Engineering lets you proactively check that your monitoring itself detects issues.
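As a rough illustration of IO consumption, the sketch below writes and syncs data to a temporary file for a fixed time. It is not Gremlin's implementation; `consume_io` and its parameters are hypothetical, and the file size cap is an assumed safeguard so the experiment does not fill the disk.

```python
import os
import tempfile
import time


def consume_io(
    seconds: float,
    chunk_bytes: int = 1 << 20,        # size of each write (1 MiB)
    max_file_bytes: int = 64 << 20,    # assumed cap on scratch file size (64 MiB)
) -> int:
    """Generate disk write load for `seconds`; return total bytes written."""
    chunk = os.urandom(chunk_bytes)
    written = 0
    deadline = time.monotonic() + seconds
    # The scratch file is deleted automatically when the block exits.
    with tempfile.NamedTemporaryFile() as scratch:
        while time.monotonic() < deadline:
            scratch.write(chunk)
            scratch.flush()
            os.fsync(scratch.fileno())  # push the data to the device
            written += chunk_bytes
            if scratch.tell() >= max_file_bytes:
                # Reuse the file instead of growing it without bound.
                scratch.seek(0)
                scratch.truncate()
    return written
```

Running this on one host while watching dashboards and alerts is a simple way to see whether the IO load is actually noticed.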
STATE CHAOS — PROCESS & TIME
engineering experiments:
• Kill one process
• Kill a process repeatedly (loop kill)
• Spawn a new process
• Fork bomb
You can also do Time Travel Chaos Engineering by shifting the system clock!
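The first two process experiments can be tried safely against a throwaway child process. In this sketch, `spawn_victim`, `kill_once`, and `loop_kill` are hypothetical helpers; the victim is a sleeping Python process started by the script itself, standing in for a real service that a supervisor would restart.

```python
import subprocess
import sys
import time
from typing import List


def spawn_victim() -> subprocess.Popen:
    """Start a harmless child process that just sleeps."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


def kill_once(proc: subprocess.Popen) -> int:
    """Kill one process and return its exit code."""
    proc.kill()          # SIGKILL on POSIX, TerminateProcess on Windows
    return proc.wait()   # reap the child so it does not linger


def loop_kill(rounds: int, interval: float) -> List[int]:
    """Kill the process repeatedly, respawning it between rounds."""
    exit_codes = []
    for _ in range(rounds):
        victim = spawn_victim()   # stands in for a supervisor restart
        time.sleep(interval)
        exit_codes.append(kill_once(victim))
    return exit_codes
```

Against a real service, the interesting measurement is how quickly it comes back after each kill and whether clients notice.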
STATE CHAOS — PODS/CONTAINERS
the host
• Use one container to kill another container
• Use one container to kill several containers
• Use several containers to kill several containers
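A minimal sketch of the one-to-several case, assuming the attacking container has the Docker CLI installed and access to the Docker daemon (for example through a mounted socket). `docker_kill_command` and `kill_containers` are hypothetical helpers and the container names are placeholders; the only real interface used is `docker kill --signal`.

```python
import subprocess
from typing import List, Sequence


def docker_kill_command(targets: Sequence[str], signal_name: str = "KILL") -> List[str]:
    """Build the `docker kill` command line for one or more containers."""
    if not targets:
        raise ValueError("at least one target container is required")
    return ["docker", "kill", "--signal", signal_name, *targets]


def kill_containers(targets: Sequence[str], dry_run: bool = True) -> List[str]:
    """Return the command; execute it only when dry_run is False."""
    command = docker_kill_command(targets)
    if not dry_run:
        # Assumes the docker CLI is on PATH and the daemon is reachable.
        subprocess.run(command, check=True)
    return command
```

The dry-run default prints nothing and kills nothing, so the target list can be reviewed first; the several-to-several case would run the same call from more than one attacking container.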