There are a lot of hard problems in software, but how many of them are created by the decisions we make day to day? Let's talk about pragmatism and a number of difficult but common situations where we often over-engineer solutions.
of software engineering
12 years in Python
• Scaled Disqus (RIP) to 1 billion page views
  One page view is roughly one rendered embed
• Over-engineered a multi-thousand node continuous integration platform at Dropbox
  It’s what you do when your test suite is Too Damn Slow™
• Scraped together Sentry on a budget
  sentry.io receives 2 billion exceptions/day as of Oct 2018
• I mash keys at 160 wpm
  An important strength of being able to duct tape, quickly
reduce disk IO, which is a common bottleneck
2. How large is the table / relation?
   • Row count (500 million) and size on disk (10 TB)
3. Physical resources (CPU, memory)?
   • Memory is the usual concern
comments.post_id = ? ORDER BY comments.date_posted DESC

CREATE INDEX my_index_name ON comments (post_id, date_posted)

Note: date_posted has high cardinality, which means this index is more expensive to maintain
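A minimal sketch of checking that a composite index like this one serves both the filter and the sort. It uses SQLite from the Python standard library as a stand-in for whatever database the talk had in mind, and the SELECT list and column types are assumptions, since the slide only shows the tail of the query.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the table layout is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, post_id INTEGER, date_posted TEXT)"
)
conn.execute("CREATE INDEX my_index_name ON comments (post_id, date_posted)")

# Assumed query shape: filter on post_id, newest comments first.
query = "SELECT id FROM comments WHERE post_id = ? ORDER BY date_posted DESC"

# EXPLAIN QUERY PLAN reports which index is used and whether a separate
# sort step (a temporary B-tree) is needed for the ORDER BY.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query, (1,)).fetchall()
plan = " | ".join(row[-1] for row in plan_rows)
print(plan)
```

If the plan names my_index_name and shows no temporary B-tree for the ORDER BY, the index is doing both jobs. Other databases expose the same information through their own EXPLAIN output.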
Schema

Column               Type
id                   INT
author_id            INT (FOREIGN KEY on users)
post_id              INT (FOREIGN KEY on posts)
latest_revision_id   INT (FOREIGN KEY on revisions)
date_posted          DATETIME

Revisions
referencing the relation incorrectly
• Write code to handle comments with missing posts
• Write code to delete comments when posts go away

1. Remove newly invalid foreign key constraints
2. Replicate tables to new database server
3. Update application code to remove relations (common in frameworks like Django)
4. [some magic or downtime to cut over databases]

General process to split off relations

Some other things we almost certainly didn’t think about
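To make the extra work concrete, here is a rough sketch of what "handle comments with missing posts" and "delete comments when posts go away" can look like once the foreign key is gone. Two in-memory SQLite connections stand in for the two database servers; every table, function, and value here is hypothetical.

```python
import sqlite3

# Two separate connections stand in for two database servers.
posts_db = sqlite3.connect(":memory:")
comments_db = sqlite3.connect(":memory:")
posts_db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
# No FOREIGN KEY here: the posts table lives in another database now.
comments_db.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT)"
)

posts_db.executemany(
    "INSERT INTO posts VALUES (?, ?)", [(1, "first post"), (2, "second post")]
)
comments_db.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 2, "b"), (3, 99, "orphan")],  # post 99 does not exist
)


def get_post_title(post_id):
    row = posts_db.execute(
        "SELECT title FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return row[0] if row else None


def comments_with_titles():
    """Join in application code, skipping comments whose post is missing."""
    result = []
    rows = comments_db.execute(
        "SELECT id, post_id, body FROM comments ORDER BY id"
    ).fetchall()
    for comment_id, post_id, body in rows:
        title = get_post_title(post_id)
        if title is None:
            continue  # orphaned comment: nothing enforces the relation anymore
        result.append((comment_id, title, body))
    return result


def delete_post(post_id):
    """Delete a post, then its comments: the cascade the database used to do."""
    posts_db.execute("DELETE FROM posts WHERE id = ?", (post_id,))
    posts_db.commit()
    comments_db.execute("DELETE FROM comments WHERE post_id = ?", (post_id,))
    comments_db.commit()


visible_before = comments_with_titles()
delete_post(1)
visible_after = comments_with_titles()
remaining = comments_db.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
```

Note that the two deletes are no longer atomic. If the second one fails, orphaned comments are left behind, which is exactly why the read path has to tolerate missing posts.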
Set up an event stream (Kafka)
3. Break apart your monolithic MySQL database
4. Write a service which owns one set of your problems
5. Attempt [and fail] to set up automated testing
6. Create a new way to deploy code
7. …
8. Profit!
ownership and autonomy
• Improve reliability (through stronger API contracts, reduced complexity)
• Transition away from legacy hard-to-support systems
performance
People often overlook how slow database access is, and it’s easy to fix!
• Profile your test suite!
• You don’t have to run every test, every time
• Use transactions to create quick database tests
  https://github.com/getsentry/zeus/blob/5004a6b7c538fada3e98c8943ea5385234a8220b/zeus/testutils/pytest.py#L89
• Replace production services with no-ops where possible
  https://docs.djangoproject.com/en/2.1/topics/cache/#dummy-caching-for-development
• Mock third-party network calls
  https://github.com/getsentry/responses
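Two of the points above, sketched with only the Python standard library. This is not the code behind the linked fixture or the responses library; it is a stand-in showing the same ideas, and rolled_back, count_comments, and fetch_status are hypothetical names.

```python
import sqlite3
import urllib.request
from contextlib import contextmanager
from unittest import mock

# Create the schema once for the whole test run. isolation_level=None puts
# the connection in autocommit mode so transactions are managed explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")


@contextmanager
def rolled_back(connection):
    """Run a test body inside a transaction and always roll it back."""
    connection.execute("BEGIN")
    try:
        yield connection
    finally:
        connection.execute("ROLLBACK")


def count_comments(connection):
    return connection.execute("SELECT COUNT(*) FROM comments").fetchone()[0]


def fetch_status(url):
    """Hypothetical code under test that makes a third-party network call."""
    with urllib.request.urlopen(url) as response:
        return response.status


# Each test writes freely; the rollback is far cheaper than rebuilding tables.
with rolled_back(conn) as db:
    db.execute("INSERT INTO comments (body) VALUES ('hello')")
    inside = count_comments(db)
after = count_comments(conn)

# Replace the network call with a mock so the test never leaves the process.
with mock.patch("urllib.request.urlopen") as fake_urlopen:
    fake_urlopen.return_value.__enter__.return_value.status = 200
    status = fetch_status("https://example.invalid/health")
```

In a pytest suite the rolled_back helper would typically become a fixture, so every database test gets a clean slate without paying for schema setup again.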
You can deploy service-isolated copies of your monolith

REPO 1 sentry.io
REPO 1 api.sentry.io
REPO 2 docs.sentry.io
REPO 1 ingest.sentry.io

https://help.github.com/articles/about-codeowners/

lead Docs Team
(and put teams on-call for their services)
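One way to picture "service-isolated copies" of a single codebase: every deployment ships the same code, and a role setting decides which routes it serves. The slide does not say how this is done in practice, so the role names, paths, and functions below are assumptions for illustration.

```python
# Hypothetical route groups that all live in the same repository.
ROUTES = {
    "web": {"/": "dashboard"},
    "api": {"/api/projects/": "project list"},
    "ingest": {"/ingest/": "event intake"},
}


def build_routes(role):
    """Return the routes one deployment serves; 'all' is the full monolith."""
    if role == "all":
        merged = {}
        for group in ROUTES.values():
            merged.update(group)
        return merged
    if role not in ROUTES:
        raise ValueError(f"unknown role: {role!r}")
    return dict(ROUTES[role])


def handle(role, path):
    """Resolve a request path for a deployment running under the given role."""
    routes = build_routes(role)
    if path not in routes:
        return 404, "not found"
    return 200, routes[path]


# The ingest deployment only answers ingest traffic; a local 'all' deployment
# still behaves like the whole monolith.
ingest_hit = handle("ingest", "/ingest/")
ingest_miss = handle("ingest", "/")
monolith_hit = handle("all", "/")
```

In a real deployment the role would more likely come from configuration or an environment variable, and each role can then be scaled, deployed, and put on-call separately without splitting the repository.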
you, it's probably wasting time (and money)
• Enable people to do their best work
  Treat people as adults and give them the tools they need to succeed
• Take off your engineering hat
  Focus on the business goals - less on academics
• “Time to Ship” is your metric
  The faster you can change and react to your customers, the more fun and success you're going to enjoy