Build your Cloud Operating Model on Azure from zero to hero

In this session we will explore how organizations can establish a working cloud operating model in Azure that helps them keep control while enabling agility for their teams, so that together they can deliver value to the business. The session targets DevOps and PlatformOps teams. A certain level of Azure knowledge is expected (for example Resource Manager, RBAC, Policies, and Azure Monitor).
We will explore some newer capabilities such as Azure Blueprints and Resource Graph, and how you can leverage them together with other essential services such as Security Center, Service Health, and Log Analytics to build the model, gain insight into your day-to-day operations, collect the telemetry you need, automate some key processes using serverless components, and integrate your favorite tools (such as Slack and GitHub). By the end of this demo-packed session we should have a working model that participants can fork from GitHub, customize to fit their needs, and apply in their environment.

David Pazdera

April 05, 2021

Transcript

  1. What is a Cloud Operating Model?
    Is it just about governance? Is it only for large enterprises with a heavy on-prem footprint? Is it just for migration scenarios (datacenter transformation)?
  2. Traditional approach
    Block Dev/Ops from directly accessing the cloud (portal/API/CLI) to attain control
    [Diagram labels: Developers; Operations; Cloud Custodian / Engineers responsible for Cloud environment]
  3. Cloud Custodian Team: SPEED + CONTROL
    Cloud-native governance -> removing barriers to compliance and enabling velocity
    [Diagram labels: Developers; Operations; Management Groups; Templates; RBAC; Blueprints; Policies; Policy]
  4. Key design areas
    > Azure Active Directory: EA enrollment account, PIM, Service Principals, Security groups
    > Management Groups: Hierarchy, Naming conventions, Tagging, Policy definitions
    > Azure DevOps: Subscription provisioning, Deployment pipeline, Role & Policy definitions and assignments, Source control & branching
    > Management subscription: Automation accounts, Workspaces (Azure Monitor), Archetype templates, Role definitions
    > Shared Services: Hub VNets (UDRs, NSGs), Azure Firewall, ExpressRoute, Security Center, Key Vault, AD Domain Services (ext.)
    > Application subscriptions: Spoke VNets (UDR, NSG, peering to Hub), Security Center
  5. What worked | key learnings
    > Define requirements for compliance
    > Design Hierarchy & Subscription Modeling
    > Apply top-level controls: policies + access control/RBAC
    > Stamp out standardized cloud environment with Blueprints
    > Use subscription as a unit of scale for App teams (but also offer a “smaller” unit of delivery)
  6. What worked | key learnings
    > Define guardrail requirements for compliance: leverage built-ins like the ISO 27001 blueprint & policy initiative
    > Design Hierarchy & Subscription Modeling
    > Apply top-level controls: policies + access control/RBAC
    > Stamp out standardized cloud environment with Blueprints
    > Use subscription as a unit of scale for App teams (but also offer a “smaller” unit of delivery)
  7. Subscription modeling strategy
    [Diagram labels: Org Management Group; Prod (RBAC + Policy): App A Prod, App B Prod, App C Prod, Shared services (Prod); Pre-Prod (RBAC + Policy): App A Pre-Prod, App B Pre-Prod, App C Pre-Prod, Shared services (Pre-Prod); legend: Subscription, Management Group]
  8. Management Groups principles
    > We defined the hierarchy based on organization and environment type (Corporate, Standalone, etc.)
    > The root MG is reserved for global configuration
    > Assigned common policies and RBAC higher up in the hierarchy
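The inheritance idea behind these principles can be illustrated with a small in-memory model. This is only a sketch: the `Scope` class, the scope names, and the policy and role names are hypothetical and not taken from the talk, and in Azure itself the platform resolves inheritance for you.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Scope:
    """A management group or subscription in an illustrative hierarchy."""
    name: str
    parent: Optional["Scope"] = None
    policies: List[str] = field(default_factory=list)           # assigned directly at this scope
    roles: List[Tuple[str, str]] = field(default_factory=list)  # (principal, role) pairs

    def effective_policies(self) -> List[str]:
        """Policies inherited from all ancestors, followed by local ones."""
        inherited = self.parent.effective_policies() if self.parent else []
        return inherited + self.policies

    def effective_roles(self) -> List[Tuple[str, str]]:
        """Role assignments inherited from all ancestors, followed by local ones."""
        inherited = self.parent.effective_roles() if self.parent else []
        return inherited + self.roles


# Hypothetical hierarchy: root MG -> organization MG -> environment MG -> subscription
root = Scope("root", policies=["allowed-locations"])
corporate = Scope("corporate", parent=root,
                  policies=["iso-27001-initiative"],
                  roles=[("platform-team", "Reader")])
prod = Scope("corporate-prod", parent=corporate, policies=["deny-public-ip"])
app_a_prod = Scope("app-a-prod", parent=prod,
                   roles=[("app-a-team", "Contributor")])

print(app_a_prod.effective_policies())
print(app_a_prod.effective_roles())
```

Assigning common controls near the top means a new subscription placed under `corporate-prod` picks them up without any extra assignment.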
  10. Policy key info
    > Real-time policy enforcement
    > At-scale compliance assessment
    > Policy evaluates all Azure resources & in-guest VM
    > Policy generates compliance events that can be used for alerting (Activity Log)
    > Aggregated and raw compliance data are available through API, PowerShell & CLI
    > Can be used to remediate problems in your environment
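Once raw compliance data has been exported, aggregating it is straightforward. The sketch below assumes each record is a dictionary carrying `policyAssignmentName` and `complianceState` keys (modeled on the shape of policy state records, but check the field names against the API you actually call). The sample records and assignment names are made up for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Mapping


def summarize_compliance(records: Iterable[Mapping[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count compliance states per policy assignment."""
    summary = defaultdict(Counter)
    for record in records:
        summary[record["policyAssignmentName"]][record["complianceState"]] += 1
    return {name: dict(counts) for name, counts in summary.items()}


# Hypothetical records, e.g. loaded from an exported JSON file
sample_records = [
    {"resourceId": "vm-1", "policyAssignmentName": "allowed-locations", "complianceState": "Compliant"},
    {"resourceId": "vm-2", "policyAssignmentName": "allowed-locations", "complianceState": "NonCompliant"},
    {"resourceId": "sa-1", "policyAssignmentName": "allowed-locations", "complianceState": "Compliant"},
    {"resourceId": "sa-1", "policyAssignmentName": "require-tags", "complianceState": "NonCompliant"},
]

summary = summarize_compliance(sample_records)
for assignment, counts in summary.items():
    print(assignment, counts)
```

A summary like this can feed a dashboard or a periodic report, alongside the alerting on compliance events mentioned above.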
  11. What worked for us
    > Started with Audit policies, a safe way of understanding what a policy will do
    > Resource type whitelisting
    > Tested Deny policies in non-prod to understand impact
    > DeployIfNotExists policy vs. fix the source code dilemma
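The audit-first approach to resource type whitelisting can be sketched as follows. The rule dictionary follows the general `if`/`then` shape of an Azure Policy rule, but the helper names, the inline allow-list, and the local `evaluate` function are assumptions for illustration only. Azure evaluates real policies itself and supports far more conditions than this toy evaluator.

```python
from typing import Dict, List, Optional


def allowed_resource_types_rule(allowed_types: List[str], effect: str = "audit") -> Dict:
    """Build a rule that flags any resource whose type is not in the allow-list."""
    return {
        "if": {"not": {"field": "type", "in": allowed_types}},
        "then": {"effect": effect},
    }


def evaluate(rule: Dict, resource: Dict[str, str]) -> Optional[str]:
    """Toy evaluator for the single rule shape above; returns the effect or None."""
    condition = rule["if"]["not"]
    allowed = {t.lower() for t in condition["in"]}
    value = resource[condition["field"]].lower()
    return None if value in allowed else rule["then"]["effect"]


ALLOWED = ["Microsoft.Storage/storageAccounts", "Microsoft.Web/sites"]
vm = {"name": "vm-1", "type": "Microsoft.Compute/virtualMachines"}
site = {"name": "web-1", "type": "Microsoft.Web/sites"}

audit_rule = allowed_resource_types_rule(ALLOWED)                # start here: report only
deny_rule = allowed_resource_types_rule(ALLOWED, effect="deny")  # switch after testing in non-prod

print(evaluate(audit_rule, vm))
print(evaluate(deny_rule, vm))
print(evaluate(deny_rule, site))
```

Keeping the condition identical and changing only the effect is what makes the audit phase a faithful preview of what the deny policy would block.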
  12. RBAC principles
    > Grant the least privilege required to do the expected work
    > Just-in-time access using Privileged Identity Management
    > Role assignments are inherited by all children of the assigned scope
    > Utilize Managed Identities where you can
  15. What worked | key learnings
    > Periodically audit the overall compliance of your environment in Azure Policy Compliance view and Azure Security Center
    > Monitor the platform health using Service Health
    > Resource Graph is much faster to query across the entire environment
  18. Resource Graph in action
    > Explorer - build rich dashboards (pin queries and graphs)
    > Query all resources across RGs, MGs, subscriptions
    > Write queries using Azure Resource Graph Query Language (based on KQL)
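As a rough illustration of such a query, the sketch below holds a KQL-style Resource Graph query as a string and wraps it in a request payload scoped to subscriptions or management groups. The helper name and the placeholder subscription ID are hypothetical, and the payload field names (`query`, `subscriptions`, `managementGroups`) are an assumption that should be checked against the Resource Graph API reference. No call to Azure is made here.

```python
import json
from typing import Dict, Iterable, Optional

# Count virtual machines per region across everything in scope
VM_COUNT_BY_LOCATION = """
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| summarize vm_count = count() by location
| order by vm_count desc
"""


def build_resource_graph_request(
    query: str,
    subscriptions: Optional[Iterable[str]] = None,
    management_groups: Optional[Iterable[str]] = None,
) -> Dict:
    """Assemble a request payload; only non-empty scopes are included."""
    body: Dict = {"query": query.strip()}
    if subscriptions:
        body["subscriptions"] = list(subscriptions)
    if management_groups:
        body["managementGroups"] = list(management_groups)
    return body


request_body = build_resource_graph_request(
    VM_COUNT_BY_LOCATION,
    subscriptions=["00000000-0000-0000-0000-000000000000"],  # placeholder ID
)
print(json.dumps(request_body, indent=2))
```

The same query text can be pasted into Resource Graph Explorer and pinned to a dashboard, as the slide describes.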
  19. What worked | roadmap
    > Provide your customers with a feedback channel for reporting bugs and requesting new features or changes
    > Introduce more testing in the pipeline (Pester)
    > End-user documentation (wiki) & support
  20. AD Domain Join Service (V2)
    [Diagram labels: Components in every (sandbox) subscription; Components in central management subscription; Blob Storage; Template file; Parameter file; Key Vault; Automation Runbook (Add-ADDomainMembership); Activity Log Alert; Azure Activity Log; Action Group; Logic App (workflow); AD domain controllers; Hybrid Runbook Worker]
  21. "Friends don't let friends right-click publish." (Donovan Brown)
    "Friends don't let friends provision production cloud resources via GUI." (David Pazdera)
    Everything as Code deployed through a pipeline
  22. People and culture are the key
    > transformation of (IT) organization → cloud mindset
    > common set of KPIs and incentives
    > (non-)technical skills and learning plan
    > centralized vs. decentralized model
    > goal: "Cloud-ready IT organization that is able to build and run (operate) production-ready Azure environment."
  23. What is a Cloud Operating Model?
    Is it just about governance? Is it only for large enterprises with a heavy on-prem footprint? Is it just for migration scenarios (datacenter transformation)?