process that relies on the repetition of a very short development cycle: first the developer writes an (initially failing) automated test case that defines a desired improvement or new function, then produces the minimum amount of code to pass that test, and finally refactors the new code to acceptable standards.”
resources by creating and shutting down servers as required by network load. • Since you “pay for what you use”, there is no need to buy expensive hardware up-front.
X, which broke feature Y?” • Hard to create dev/QA/staging servers that are identical to production ones. This leads to “Works in my environment” • Doesn't scale: if our application gets popular, how do we handle 1M users? What about 10M? One sysadmin can't configure 1k servers over the weekend.
is defined by custom-made software and stored in a repository • Run this software any number of times and you'll get a clone of your infrastructure, all in an automated way • Building and maintaining a modern server begins to look a lot like managing a software project
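A minimal sketch of what "run it any number of times and get the same result" can look like in code. The helper `ensure_line` and the demo file are hypothetical illustrations, not part of the original deployment scripts; the point is only that the step checks the current state before changing anything.

```python
import os
import tempfile


def ensure_line(path, line):
    """Append `line` to the file at `path` unless it is already present.

    Returns True if the file was changed, False if it was already correct.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in content.splitlines():
        return False  # desired state already reached, nothing to do
    # Make sure the new line starts on its own line.
    separator = "" if content == "" or content.endswith("\n") else "\n"
    with open(path, "a") as handle:
        handle.write(separator + line + "\n")
    return True


# Demonstration against a throwaway file rather than a real server.
demo_path = os.path.join(tempfile.mkdtemp(), "sshd_config")
first_run = ensure_line(demo_path, "PermitRootLogin no")
second_run = ensure_line(demo_path, "PermitRootLogin no")
with open(demo_path) as handle:
    demo_contents = handle.read()
```

The first call changes the file and the second is a no-op, so re-running the whole script against an already configured server leaves it untouched.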
Change control: “git log” to view the latest changes to a server • No regressions: Apply TDD to your infrastructure development process and you'll know when a new change adds a regression • Easily move to a previous version: “git checkout <revision>; fab deploy”
such as: – SDLC applied to the infrastructure – Concepts like classes, refactoring, coding standards, etc. – Test Driven Development (optional but highly recommended)
finds a bug in the SSH configuration, they can share the new configuration with other teams. • It's code: reuse it. All teams can contribute to the basic OS configuration, and specific teams to DB, Web, etc.
Fast load speeds • Easy to add new pages and blog posts • Well documented • Easy to develop new features and test them locally • Fail gracefully (if hacked, DoS'ed, etc.) • Learn something in the process
etc. Since Fabric is Python, I decided to use that for my deployments. • “import unittest” for writing the tests • Boto for interacting with the EC2 API
code • 15 configuration files for Apache, Varnish, etc. The result: • A new EC2 instance with an Elastic IP address • Fully configured (secure) Varnish, Apache, PHP, MySQL with all data loaded in the database • Fully configured CDN • Performance tuning
requirements and make sure the code they write covers them • With infrastructure as code we can create security requirements to make sure the OS and application are secure
yet available) deploy a development server in a VM and manually reproduce the bug • Write unittest to reproduce it • Change configuration / application code to fix it • Run test to verify fix • Run all tests to verify there are no regressions • Commit/Push changes to Git • Apply changes to production environment using Fabric
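The "write a unittest to reproduce it" step might look roughly like the sketch below. The bug (Apache advertising its full version) and the function `render_security_conf` are hypothetical stand-ins chosen for illustration; only the `ServerTokens` and `ServerSignature` directives are real Apache settings.

```python
import unittest


def render_security_conf(hide_version=True):
    """Hypothetical stand-in for the code that generates an Apache config snippet."""
    tokens = "Prod" if hide_version else "Full"
    return "ServerTokens %s\nServerSignature Off\n" % tokens


class VersionDisclosureRegressionTest(unittest.TestCase):
    """Reproduces the (hypothetical) bug: Apache advertised its full version."""

    def test_server_tokens_is_prod(self):
        conf = render_security_conf()
        # Fails while the bug is present, passes once the configuration is fixed.
        self.assertIn("ServerTokens Prod", conf.splitlines())
```

Once the test is committed alongside the fix, it stays in the suite and guards against the same bug coming back in a later change.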
the way we want, and is secure according to our tests, it's a good idea to run them periodically (once every X hours). • ProTip #1: unittests need to be idempotent. • ProTip #2: Jenkins for web CI
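One way to make the suite safe to run every few hours is to keep the checks read-only and wrap them in a runner that returns an exit status a scheduler can act on. This is a sketch under assumptions: the `observed` settings are a hard-coded, hypothetical snapshot, whereas a real check would read them from the server.

```python
import io
import unittest


class ReadOnlyChecks(unittest.TestCase):
    """Checks only observe state; they never create, modify, or delete anything."""

    # Hypothetical snapshot of settings; a real check would read them from the server.
    observed = {"ssh_port": 22, "root_login": "no"}

    def test_root_login_disabled(self):
        self.assertEqual(self.observed["root_login"], "no")


def run_checks():
    """Run the suite once and return 0 on success, 1 on failure."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReadOnlyChecks)
    # Capture the runner's report in memory so the caller decides what to log.
    result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
    return 0 if result.wasSuccessful() else 1
```

Because the checks change nothing, running them twice in a row gives the same answer, and the return value can be passed to `sys.exit()` from a cron job or a CI build step.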
Word document (that nobody reads) and states: “All passwords need to be X chars long”. With infrastructure as code we can make sure this is actually enforced: • Write a unittest that verifies the configured password length • Make it mandatory to run on all servers • Our unittest could also run John the Ripper to verify that passwords are strong enough
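A sketch of the password-length check, assuming the minimum length is set through the `PASS_MIN_LEN` directive in a `login.defs`-style file. On PAM-based systems the effective setting usually lives elsewhere, so treat the parsing as illustrative. The policy value of 12 and the sample text are assumptions, since the policy above only says "X chars".

```python
import unittest

REQUIRED_MIN_LENGTH = 12  # assumed policy value; the policy only says "X chars"


def configured_min_length(login_defs_text):
    """Return the PASS_MIN_LEN value from login.defs-style text, or None if unset."""
    for raw_line in login_defs_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) == 2 and fields[0] == "PASS_MIN_LEN" and fields[1].isdigit():
            return int(fields[1])
    return None


# Sample input so the sketch runs anywhere; not taken from a real server.
SAMPLE_LOGIN_DEFS = """\
# Password aging controls
PASS_MAX_DAYS   90
PASS_MIN_LEN    12
"""


class PasswordPolicyTest(unittest.TestCase):
    def test_min_length_enforced(self):
        # On a real server, read the actual file instead of the sample text.
        length = configured_min_length(SAMPLE_LOGIN_DEFS)
        self.assertIsNotNone(length)
        self.assertGreaterEqual(length, REQUIRED_MIN_LENGTH)
```

A commented-out or missing directive yields `None`, so the test fails loudly instead of silently passing on a server where the policy was never applied.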
the number of infrastructure vulnerabilities and bugs, and increase uptime. • Using TDD in your infrastructure code reduces regressions • Requires skilled sysadmins • Migration to infrastructure as code is time-consuming