[SCaLE16x] Silo-Based Architectures for High Availability Applications

High availability is becoming a de facto requirement of today's applications. Customer-facing IT failures directly cost revenue and trust, as users have grown accustomed to easily switching to more reliable service providers. Lack of availability in internal systems blocks employee productivity and adds to the financial burden. Thus, it is critical to have a healthy, performant, resilient IT structure serving as the backbone of your business. But there are no textbook solutions for achieving five 9s availability. Data redundancy, computing clusters, load balancing and fail-over mechanisms each address one potential issue individually, but none treats the systems in your organisation holistically to maximise business revenue.

Not everyone has the financial and technical ability to use the latest and greatest CDN and offload their high-availability requirements to such third parties. This is where ingenuity comes into play: my goal is to show you a different way of architecting an application, one centered on solving your own business needs without a huge additional cost. We devised this solution while working with a very large US airline, using open-source technologies, to meet the Black Friday & Cyber Monday traffic requirements.

Silos are a clever method of grouping servers so that they can be scaled both horizontally and vertically, depending on actual application needs. Most importantly, the approach frees you from over-optimizing the architecture upfront by allowing fine adjustments that are easy to integrate into your Agile workflow.
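
To make the grouping idea concrete, here is a minimal Python sketch of how a silo might be modelled. The deck defines a silo as a full setup of frontend, backend and data servers; everything else here is an assumption for illustration only: the `Silo` class, the host names, the `healthy` flag and the hash-based `pick_silo` routing are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass
class Silo:
    """One self-contained stack: frontend, backend and data servers."""
    name: str
    frontend: list[str] = field(default_factory=list)
    backend: list[str] = field(default_factory=list)
    data: list[str] = field(default_factory=list)
    healthy: bool = True  # hypothetical flag, e.g. set by monitoring


def pick_silo(user_id: str, silos: list[Silo]) -> Silo:
    """Map a user to one healthy silo (assumed hash-based routing)."""
    candidates = [s for s in silos if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy silo available")
    # A stable hash keeps the same user on the same silo while the
    # set of healthy silos does not change.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]


# Hypothetical inventory: two silos, the second with an extra frontend.
silos = [
    Silo("silo-a", frontend=["fe-a1"], backend=["be-a1"], data=["db-a1"]),
    Silo("silo-b", frontend=["fe-b1", "fe-b2"], backend=["be-b1"], data=["db-b1"]),
]

print(pick_silo("alice", silos).name)
```

In this sketch, growing capacity means either appending another `Silo` to the list or adding hosts to one tier of an existing silo. Marking a silo as unhealthy takes it out of rotation, at the cost of remapping some users to a different silo.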

Georgiana Gligor

March 10, 2018

Transcript

  1. @gbtekkie SCaLE 16X GEORGIANA GLIGOR
    ✤ Geek. Mother. Do-er.
    ✤ on LAMP/LEMP stack since 2003
    ✤ Architecture / DevOps consultant
    ✤ RomaniaPHP Organizer
    ✤ PhD Student
  2. AGENDA
    what is high availability (HA)?
    the need for high availability
    silos: a possible approach
    advantages and disadvantages
  3. ADJUSTING [architecture diagram; labels: Browser, internet, Load balancer, Frontend (x3), Business Logic, master, slave, reads, writes]
  4. ADJUSTING redundancy [same architecture diagram]
  5. ADJUSTING resilience [same architecture diagram]
  6. AVAILABILITY
    Ability to access the system:
    ✤ retrieve information
    ✤ alter information
    ✤ send new data
  7. THE 9s DANCE
    Uptime      Downtime (per year)
    90.000 %    36.50 days             one nine
    99.000 %    3.65 days              two nines
    99.900 %    8.76 hrs               three nines
    99.950 %    4 hrs 23 mins
    99.990 %    52.56 mins             four nines
    99.999 %    5.26 mins              five nines
  8. THE 9s DANCE
    Uptime      Downtime (per year)
    90.000 %    36.50 days
    99.000 %    3.65 days
    99.900 %    8.76 hrs
    99.950 %    4 hrs 23 mins          Amazon SLA
    99.990 %    52.56 mins             four nines
    99.999 %    5.26 mins              five nines
  9. USER BEHAVIOUR
                                 amazon        facebook      youtube
    Alexa Rank                   6             3             2
    daily time on site           12:07 mins    19:27 mins    23:44 mins
    daily pageviews / visitor    11.83         9.38          12.84
    bounce rate                  21 %          29 %          33 %
  10. HA BEST PRACTICES
    1. no single points of failure
    2. stateless application design
    3. automate infrastructure for consistency & reliability
    4. clever monitoring and alerting
    5. geographically distribute your machines
    6. keep spare capacity to meet increasing demand
  11. WHAT IS A SILO?
    ✤ frontend (SPAs, PWAs, etc)
    ✤ backend (e.g. PHP services)
    ✤ data (including cache)
    1 silo = full setup of servers that deliver the end-to-end functionality
  12. ADVANTAGES
    ✤ reuse familiar technology
    ✤ real A/B testing
    ✤ no BHUF requirements
    ✤ no disruption => brand loyalty
    ✤ lower Total Cost of Ownership
    ✤ simplify scalability
  13. DISADVANTAGES
    ✤ needs razor-sharp DevOps team
    ✤ small increase in hardware costs on kick-off
    ✤ adds complexity to the monitoring layer
    ✤ reconsider traceability
    ✤ different bug reproducing and hunting
  14. TAKEAWAYS
    ✤ build situational awareness with clever monitoring
    ✤ automate outage detection
    ✤ powerful A/B testing
  15. FURTHER READING
    ✤ Wikipedia HA page
    ✤ OpenStack’s HA concepts
    ✤ Merge Hemo report from FDA
    ✤ USA Presidential Policy Directive 21
    ✤ “Beyond Legacy Code” book
    ✤ TechCrunch’s summary of sites affected by Michael Jackson’s death
    ✤ Netflix lessons learned after AWS outage
    ✤ Netflix Chaos Monkey source code
    ✤ Brian Adler’s talk on “Architecting for High Availability and Multi-Cloud”
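
The downtime figures on the "THE 9s DANCE" slides follow directly from the uptime percentage. As a rough check, the short Python sketch below recomputes them, assuming a 365-day year (8760 hours); the function name `downtime_hours_per_year` is made up for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Return the yearly downtime, in hours, allowed by an uptime target."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


# Recompute the rows of the slide for each uptime level.
for uptime in (90.0, 99.0, 99.9, 99.95, 99.99, 99.999):
    hours = downtime_hours_per_year(uptime)
    print(f"{uptime:7.3f} %  {hours / 24:7.2f} days  {hours:8.2f} hrs  {hours * 60:10.2f} mins")
```

Each extra nine divides the allowed downtime by ten, which is why moving from three nines (8.76 hours) to five nines (about 5.26 minutes) leaves little room for manual recovery.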