
Case Studies: Modern Development Practices In Highly Regulated Environments

Charity Majors
September 28, 2023


I gave a talk at Fintech Devcon in August 2023 on why "Modern Development Best Practices Are Not Incompatible With Highly Regulated Environments." Since then I have been compiling a list of case studies of companies in regulated domains (health care, security, etc) who are practicing Continuous Deployment, observability-driven development, and other modern practices that combine to radically accelerate delivery via fast feedback loops. If you would like to add your company to the list, please get in touch!! Contact info on the last slide.



Transcript

  1. @mipsytipsy
    Modern Development Best
    Practices in Highly Regulated
    Environments: Case Studies
    “How We Did It”


  2. Modern software development practices
    1. Engineers owning their own code in production
    2. Practicing observability-driven development
    3. Testing in production
    4. Separating deploys from releases using feature flags
    5. Continuous deployment (or at least delivery)
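
To make item 4 concrete, here is a minimal sketch of separating a deploy from a release with a feature flag. Everything in it is an assumption made for illustration: the `FeatureFlags` class, the `new_pricing` flag name, and the discount logic are hypothetical, and a real system would typically use a flag service rather than an in-memory dict.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FeatureFlags:
    """Toy in-memory flag store (hypothetical; real systems use a flag service)."""
    enabled: Dict[str, Set[str]] = field(default_factory=dict)

    def enable_for(self, flag: str, user_id: str) -> None:
        self.enabled.setdefault(flag, set()).add(user_id)

    def is_enabled(self, flag: str, user_id: str) -> bool:
        return user_id in self.enabled.get(flag, set())


def checkout_total_cents(flags: FeatureFlags, user_id: str, subtotal_cents: int) -> int:
    # The new code path ships with the deploy but stays dark until the flag is on.
    if flags.is_enabled("new_pricing", user_id):
        return subtotal_cents * 90 // 100  # hypothetical new behaviour
    return subtotal_cents  # existing behaviour


flags = FeatureFlags()
before = checkout_total_cents(flags, "user-1", 10_000)  # deployed, not released
flags.enable_for("new_pricing", "user-1")               # release without a deploy
after = checkout_total_cents(flags, "user-1", 10_000)
```

The code is in production from the first call onward; only the flag flip changes what the user sees, which is what lets the deploy and the release be reviewed and controlled separately.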


  3. Modern software development practices
    are ✨ALL✨ about
    FAST FEEDBACK LOOPS
    Getting your code into production as fast
    as possible after writing it.


  4. “Explain it to me like I’m five”:
    Regulations: you are subject to these if you operate under
    their domain, e.g. GDPR, CCPA, HIPAA, PCI/DSS, etc
    ✨Security✨
    Frameworks: you may be audited to ensure you conform to
    these, e.g. SOC2, ISO 27001, NIST, FedRAMP etc
    Your security team has written policies for compliance with
    these, and your legal team signs contracts with customers.


  5. Frameworks & regulations are not prescriptive.


    None of them forbid any modern development practices.
    However, these practices MAY conflict with
    your own written policies.
    They might also conflict with terms in
    your own customer contracts.


  6. Policies are living documents.


    They should be subject to regular review and reconsideration.
    But!
    Do your security and legal teams know
    when to push back or loop you in?
    Contracts should be negotiated, not just signed.
    Engineering should have a say.


  7. How many times have you heard:
    “We’re a regulated industry. Therefore…”
    ❌ We can’t let developers deploy their own code due to segregation of duties
    ❌ All changes must be approved by a Change Advisory Board
    ❌ Trunk based development is not allowed
    ❌ No testing in production, or developer access to production
    ❌ You cannot log anything
    ❌ You must log everything, and cannot delete anything
    ❌ You are not allowed to use any SaaS, or multi tenant databases or compute
    ❌ You are not permitted to refactor your code
    ❌ Manual testing must occur before each deploy
    ❌ Auto-deploying your code is not permissible
    ❌ Auto-deploying is mandatory
    For more, see this thread ➡ ➡ ➡ https://twitter.com/mipsytipsy/status/1694163770753601887


  8. All of that is
    ✨Bullshit.✨
    Stand by for proof: a long list of case studies of companies who are auto-deploying,
    developing off trunk, getting code into production in a matter of minutes, etc.
    All of them are subject to the same regulations you are.


    Some of them may be your competitors.


  9. How Etsy did it (in 2013!!):
    • Decouple the cardholder data and PCI/DSS regulations from the
    rest of the system


    • The systems that form the cardholder data environment (CDE) are
    separated from the rest of Etsy’s environments at the physical,
    network, source code, and logical infra levels


    • The CDE is built and operated by a cross-functional (xfn) team that is solely
    responsible for the CDE. Again, this limits the scope of the PCI
    DSS regulations to just this team.
    https://queue.acm.org/detail.cfm?id=3190610


  10. How Honeycomb does it:
    • Subject to privacy laws such as GDPR, CCPA, HIPAA (BAA)


    • Security framework adapted to SOC2 trust services criteria
    (confidentiality and security)


    • Auto-deploys once an hour off trunk via a cron job. Extensive
    investment into tests. Takes about an hour for code to go live.


    • Practices trunk-based dev, short-lived branches, code reviews


    • Access Management policy based on least privilege model.
    Access to PII/prod data is limited to those with a business need.


  11. How Branch Insurance does it:
    • Regulated by 36 states and DC, annual SOC2s


    • Production data and envs mostly isolated from most engineers;
    only TLs can analyze production telemetry for PII purposes
    (despite masking and filtering and tokenizing)


    • Every developer has their own AWS account, massive investment
    in testing. Trunk-based development.


    • Uses serverless extensively; pushes to trunk many times/day,
    pushes to prod many times/week, in under an hour end to end.


  12. How Stytch does it:
    • Certified ISO27001 and SOC2 Type 2, subject to GDPR, CCPA


    • Auto-deploys on PR merge with an average of 13 min before
    code goes live, approximately 30 times/week


    • Trunk-based development with optional on-demand preview
    environments for PRs. Extensive integration testing before merge!


    • Data access granted to people who need it for their jobs, with
    data auditing and masking to further ensure user privacy


  13. How Entrata does it:
    • Subject to A LOT of compliance audits, including PCI-DSS


    • Keeps PCI environment isolated on a separate private network,
    AWS account, GitHub org, etc. PCI codebase has no external
    deps, can be tested in isolation. Owned by a single eng team.


    • Can deploy a line of PCI-compliant code to production in 15 min


    • Code review before merging to main, then test on staging, cut a
    release to production branch, deploy to prod. Access to db, app
    servers is extremely limited.


    • 20 year old company; code originally written w/o unit tests


  14. How Ocado Technology does it:
    • Certified SOC1, SOC2, PCI/DSS; also subject to GDPR


    • Hundreds of apps in production, owned by ~200 teams


    • On average, code gets deployed to production every 3 minutes


    • Takes ~1 hour for code to get to production after a merge.
    Practices canary + rolling deploys over the course of 4-5 days.


    • Data access granted to people who need it for their jobs, with
    data auditing and masking to further ensure user privacy
    https://handbook.ocado.tech/#/sw-development/technical-standards?id=encryption-of-personal-data
    https://handbook.ocado.tech/#/sw-development/hallmarks


    https://handbook.ocado.tech/#/sw-development/maturity-model


  15. How ClarityAI does it:
    • Certified ISO27001 and SOC2 Type 2, practiced a joint audit
    strategy to streamline time and resources


    • Some teams practice Continuous Deployment and deploy several
    times per day using trunk-based development, TDD, and pairing


    • Other teams deploy at least once per day using short-lived
    branches
    https://medium.com/clarityai-engineering/iso27001-and-soc2-type-ii-from-greenfield-to-success-24ca99decb26


  16. How Bankwest (Perth) does it:
    • Deploys to production within a few hours


    • Worked to get rid of the Change Advisory Board for most uses.
    First defined some types of changes as lower risk to avoid
    Change Approval processes, then worked hard to make almost
    every change fit those lower risk definitions.


    • Feature flags, separating deploys/releases, backwards compatible
    changes, API expand/rollout/contract, small releases deployed
    often, observability in production
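
The "define some changes as lower risk" idea from this slide can be sketched as a small policy check. This is an illustrative assumption, not Bankwest's actual criteria: the `Change` fields, the line-count threshold, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    behind_feature_flag: bool
    backwards_compatible: bool
    lines_changed: int
    touches_schema: bool


# Hypothetical threshold; a real policy would define its own criteria.
MAX_LINES_FOR_LOW_RISK = 200


def needs_change_approval(change: Change) -> bool:
    """Return True if the change must go through the full approval process."""
    low_risk = (
        change.behind_feature_flag
        and change.backwards_compatible
        and not change.touches_schema
        and change.lines_changed <= MAX_LINES_FOR_LOW_RISK
    )
    return not low_risk
```

Once a definition like this is written down and agreed with auditors, the engineering work shifts to making almost every change satisfy it: small, flagged, and backwards compatible.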


  17. How Cabify does it:
    • Practices Continuous Delivery, deploys 1-6 times a day, lead time
    for changes is 35 min


    • Certified PCI/DSS on the payments side, financial audit for the
    entire company


    • Feature flags, separating deploys/releases, backwards compatible
    changes, API expand/rollout/contract, small releases deployed
    often, observability in production
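
As a rough sketch of the expand/rollout/contract pattern named in the last bullet, consider renaming a field from `name` to `full_name`. The field names and functions are hypothetical and are not taken from Cabify's systems.

```python
from typing import Dict


def write_user_expand(full_name: str) -> Dict[str, str]:
    # Expand: emit both the old and the new field so every reader keeps working.
    return {"name": full_name, "full_name": full_name}


def read_user(record: Dict[str, str]) -> str:
    # Rollout: readers prefer the new field but still accept old records.
    if "full_name" in record:
        return record["full_name"]
    return record["name"]


def write_user_contract(full_name: str) -> Dict[str, str]:
    # Contract: once nothing depends on "name", stop writing it.
    return {"full_name": full_name}
```

Each phase is a small, backwards-compatible deploy on its own, so no single release has to change the writer and every reader at the same time.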


  18. How Ping Identity does it:
    • Certified ISO27001, SOC2 Type 2


    • Took about an hour to deploy


    • Auditors cared about what pipeline did, what gates there were,
    what controls we had.


    • Merge requests required approval from someone not the author,
    tests needed to run and pass, someone needed to approve
    before deployment
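
The gates described on this slide can be expressed as explicit, checkable rules, which is roughly what an auditor wants to see evidence of. The sketch below is an assumption for illustration only: the `MergeRequest` type and function names are hypothetical and do not describe Ping Identity's pipeline.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class MergeRequest:
    author: str
    approvers: Set[str]
    tests_passed: bool
    deploy_approved_by: Optional[str] = None


def can_merge(mr: MergeRequest) -> bool:
    # Needs at least one approver who is not the author, plus a passing test run.
    independent_approvers = mr.approvers - {mr.author}
    return bool(independent_approvers) and mr.tests_passed


def can_deploy(mr: MergeRequest) -> bool:
    # Deployment additionally requires an explicit sign-off.
    return can_merge(mr) and mr.deploy_approved_by is not None
```

Because the rules live in the pipeline rather than in a meeting, every merge and deploy leaves a record of which control was applied and who approved it.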


  19. How SALTO does it:
    • Certified ISO27001, working on SOC2 Type 2


    • Deploys several times a day


    • No one has access to raw data. If something must be checked
    against databases, it must be 1) requested, 2) approved by a
    manager, 3) run through a system that anonymizes data


    • Practices GitOps (TF, Flux2, k8s) to avoid manually writing to prod


    • Oncall and a few other people have read access to prod
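
The three-step data access flow above (requested, approved by a manager, run through an anonymizing system) might look something like this in miniature. The field list, the redaction placeholder, and all names are hypothetical assumptions, not SALTO's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical set of columns treated as personal data.
PII_FIELDS = {"email", "name"}


@dataclass
class DataRequest:
    requester: str
    reason: str
    approved_by: Optional[str] = None


def anonymize(row: Dict[str, str]) -> Dict[str, str]:
    # Replace personal fields with a placeholder; pass other fields through.
    return {k: ("<redacted>" if k in PII_FIELDS else v) for k, v in row.items()}


def run_check(request: DataRequest, rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    # Step 2: the request must carry a manager approval from someone else.
    if request.approved_by is None:
        raise PermissionError("request has not been approved by a manager")
    if request.approved_by == request.requester:
        raise PermissionError("requester cannot approve their own request")
    # Step 3: only anonymized rows ever leave this function.
    return [anonymize(row) for row in rows]
```

For example, `run_check(DataRequest("dev", "debug ticket", approved_by="manager"), rows)` returns the rows with `email` and `name` replaced, while an unapproved request raises `PermissionError`.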


  20. How Duffel does it:
    • PCI L1 compliant


    • Can get a line of code into prod in 30 min (!!!)


    • Deploys from trunk, runs static analysis as part of CI/CD


    • Mandatory PR review approvals from an accepted PCI group,
    which turns into a merge commit after approval.


    • Merge commit SHA is the source of a container image


    • Uses a lot of Security Command Center premium features for
    threat detection, vulnerabilities, time to resolution.


  21. How toplyne.io does it:
    • Certified SOC2 Type 2, subject to GDPR, HIPAA, and CCPA,
    working on ISO27001:2022


    • Trunk-based deployment, manual PR reviews via GitStream


    • All teams deploy multiple times a day, and can deploy one line of
    code in <15 min


    • Platform engineering owns Security and Compliance


    • Multiple tests run for SAST and DAST in CI and during deploys


  22. How AudioStack does it:
    • Certified SOC2 Type 1, subject to GDPR; working on ISO27001
    and SOC2 Type 2


    • All security checks run automatically with GitHub and other tools
    as part of CI/CD architecture


    • Deploy takes about 30 minutes


    • Deploys to prod at least daily, after tests pass and a merge
    request has been reviewed and approved


    • Restricts access to data, least privilege access


  23. How Jack Henry does it:
    • Certified SOC1, SOC2, PCI/DSS; subject to FBA, state banking regs


    • 300+ different applications in K8s. 100+ deploys per day.


    • Column-level encryption on DBs allows devs to have read access
    to prod DBs (✨cool!!✨)


    • Code review before merging to main. Release gets cut and runs
    through user acceptance testing; approvals sent to stakeholders,
    deploy to production kicks off once approved


    • Takes about 30 min for code to get to production after a merge. All
    changes are canaried.


  24. How up.com.au does it:
    • Certified SOC1, SOC2, PCI/DSS; subject to Aussie banking regs


    • Deploys to production around hourly


    • Massively parallel, fully automated test suite spins up a replica of
    production in seconds, uses Rspec and Appium to run thousands
    of tests on every change


    • Takes around 20 min to run the full test suite, then decommissions
    replica. Can turn around changes to prod in minutes


    • Two-speed architecture lets us deploy changes constantly on the
    customer-facing side, and deliberately on the banking side.


  25. Stop blaming regulations and frameworks.
    This is all about how we decide
    to interpret the standards.
    ✨This is not their fault.✨


  26. We are all on the same side.





    This is about better security, too.


  27. We need engineers & leaders who
    understand the existential urgency of a
    short cycle time, and will fight for it.
    Not just once or twice.
    Every day.



  29. Hey, you! ✨Hi!✨
    Do YOU work at a company that is subject to regulations and standards, but uses
    modern development best practices (continuous deployment, observability-
    driven development, fast feedback loops, auto-deploys, etc)?
    Would you like to be on a slide? ☺
    DM me on twitter @mipsytipsy or email me at [email protected] and let’s do
    this! 🥰 You don’t have to be “perfect” (no one is). Let’s show the world just how
    doable this is!! ❤🔥 P.S. This is also GREAT for recruiting…just sayin’.


  30. For more, see my slides on
    “Why Compliance And Regulatory Standards Are Not
    Incompatible With Modern Development Best Practices”
    https://speakerdeck.com/charity/compliance-and-regulatory-standards-are-not-incompatible-with-modern-development-best-practices
    or just go to:
    https://speakerdeck.com/charity
