
Oops I deployed too hard

Matteo Bianchi
April 25, 2024

Sometimes you delete production, you drop tables, you change a conf and everything breaks. But what if you upgraded the wrong environment, to the wrong version, of the wrong customer (who also happens to be the biggest your company has)?

And what if you did it right before going to a long lunch?

This is a tale of compliance, regulations, unauthorized deployments and ethical questions!
Solved more through diplomacy than with coding.
Memes included*


Transcript

  1. Welcome! I'm Matteo, you can find me as @mbianchidev on social media. I used to be a DevOps like you but then I took an arrow in the knee… and became a DevRel.
  2. The BIG RED BUTTON - A serious recap - The big OOPS
  3. Deploy Workflow
    Commit: We used gitflow branching, on release/x.y.z
    Build: Jenkins tagging the artifact with x.y.z
    IaC*: Puppet conf where we manually insert the new artifact x.y.z
    Deploy: Apply and restart the JBoss process by logging in to the VM
    Communicate: Declare the deployment on Slack
    *note: VMs were provisioned in the old-fashioned manual way 🤠 (yee haw) but configured with IaC
  4. https://just-another-day-on-the-job.it/page/1
    11:10 AM - Lead: Hey, can you deploy v4.20 on Acme Inc. test before lunch?
    11:15 AM - Me: Sure thing, let me just fix something real quick.
  5. https://how-did-it-went-1.sh
    mbianchi@CompanyPC:~$ # here I hit reverse history search
    (reverse-i-search)`':
    (reverse-i-search)`dep': deploy.sh prod acme
    mbianchi@CompanyPC:~$ # here I hit send and walk away
    mbianchi@CompanyPC:~$ clear
  6. https://just-another-day-on-the-job.it/page/2
    11:10 AM - Lead: Hey, can you deploy v4.20 on Acme Inc. test before lunch?
    11:15 AM - Me: Sure thing, let me just fix something real quick.
    12:40 PM - Me: Pipeline succeeded, artifact is available, I started the script, grabbing lunch now, brb.
  7. https://how-did-it-went-2.sh
    mbianchi@CompanyPC:~$ ...
    mbianchi@CompanyPC:~$ # artifact gets downloaded
    mbianchi@CompanyPC:~$ # checksum gets checked
    mbianchi@CompanyPC:~$ # IaC file gets AWKed, old version is overwritten with the new shiny v4.20
    mbianchi@CompanyPC:~$ # puppet_apply.sh runs
    mbianchi@CompanyPC:~$ # Deployment successful!
  8. https://just-another-day-on-the-job.it/page/3
    11:10 AM - Lead: Hey, can you deploy v4.20 on Acme Inc. test before lunch?
    11:15 AM - Me: Sure thing, let me just fix something real quick.
    12:40 PM - Me: Pipeline succeeded, artifact is available, I started the script, grabbing lunch now, brb.
    1:15 PM - Lead: Hey why is the client saying nothing works anymore?
    1:17 PM - Lead: ???
    1:22 PM - Lead: HELLOOO???? We have a problem!!
    1:31 PM - Project Manager: Everyone into the war room… NOW!
    Ok but this war room… is it a physical war room?
  9. https://oh-no.law
    I’ll automate my manual job with a script having 0 guardrails
    So I can save time for my precious Friday pizza lunch
    I’ll get too confident and make silly mistakes
    I’ll get too confident and make silly mistakes, potentially causing a €€€ corp lawsuit
  10. https://how-did-it-went-1.sh
    mbianchi@CompanyPC:~$ # here I hit reverse history search
    (reverse-i-search)`':
    (reverse-i-search)`dep': deploy.sh prod acme
    mbianchi@CompanyPC:~$ Are you really sure you want to deploy on PRODUCTION of ACME? Confirm by writing “yes”.
    mbianchi@CompanyPC:~$ No.
    mbianchi@CompanyPC:~$ No, ’cause we actually migrated EVERYTHING to a shiny GitOps - Kubernetes setup 5 months after this happened.
  11. Key TAKEAWAYS
    01 Automation should be trusted for prod as well (with guardrails)
    02 Mistakes will happen, risk management is not optional
    03 Rules should be challenged
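
The first takeaway and the confirmation prompt imagined on slide 10 suggest what a guardrail for `deploy.sh` might look like. The following is only a minimal sketch, not the script from the talk: the function names `is_valid_target` and `confirm_deploy`, the messages, and the list of known environments are all assumptions. The argument order (environment, then customer) is taken from the `deploy.sh prod acme` call shown on the slides. Instead of a generic "yes", it asks the operator to type the exact target, so a command recalled from history is harder to confirm on autopilot.

```shell
#!/bin/sh
# Hypothetical guardrail helpers for a deploy script. Function names,
# messages and the list of environments are assumptions for illustration.

# is_valid_target ENV: succeed only for environments the script knows about.
is_valid_target() {
  case "$1" in
    test|prod) return 0 ;;
    *) return 1 ;;
  esac
}

# confirm_deploy ENV CUSTOMER: succeed if the deployment may proceed.
# Non-production targets pass straight through. For production the operator
# must type the exact "ENV/CUSTOMER" target, which is read from stdin.
confirm_deploy() {
  env_name="$1"
  customer="$2"

  is_valid_target "$env_name" || {
    echo "Unknown environment: $env_name" >&2
    return 1
  }

  # Only production needs the extra confirmation step.
  [ "$env_name" = "prod" ] || return 0

  expected="$env_name/$customer"
  printf 'About to deploy to PRODUCTION of %s.\n' "$customer" >&2
  printf 'Type "%s" to continue: ' "$expected" >&2

  # An empty answer, end of input, or anything other than the exact
  # target counts as a refusal.
  answer=""
  IFS= read -r answer || true
  [ "$answer" = "$expected" ]
}
```

A deploy script could call `confirm_deploy "$1" "$2" || exit 1` before touching any artifact or IaC file. Because the answer is read from standard input, a run with no terminal attached (for example, started and left unattended with input closed) is refused for production rather than silently proceeding.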