AzureBootcamp 2024: From Zero to Infrastructure as Code and AKS at Swiss Life by Andrea Oltean and Jedrzej Lisowski

⭐️ From Zero to Infrastructure as Code and AKS at Swiss Life
A compelling outlook on Swiss Life’s transition in cloud strategy, shifting from a decentralised to a centralised approach and spotlighting efficient provisioning of Azure resources with Infrastructure as Code.
🙂 ANDREA OLTEAN ⚡️ DevOps Engineer @ Swiss Life
🙂 JEDRZEJ LISOWSKI ⚡️ DevOps Engineer @ Swiss Life

Transcript

  1. A journey to Infrastructure as Code and AKS

    Chapter 1: From Decentralized to Centralized (a story about a team that grew up overnight)
    Chapter 2: How it works (Tech Stack, The New Way of Working, Processes & Standards)
    Chapter 3: Stats for geeks (Numbers, Numbers, Numbers)
    Chapter 4: Let’s get serious (AKS Deep Dive)
    Chapter 5: Conclusions (Lessons Learnt and Future Plans)
  2. Chapter 1: From Decentralized to Centralized

    A story about a team that grew up overnight
  3. From Decentralized to Centralized

    Cloud Center of Excellence: 3 members (Cloud Security, Governance and Compliance)
    • Limited reusability: Solutions designed for each stream, not meant to be re-used
    • Multiple technologies: Different languages and technologies were used for the same objectives
    • Different implementations: Every AKS cluster was deployed in different ways
    • Own IT Teams: Each stream had their own IT teams and standards
    • Limited Knowledge Sharing: Collaboration was challenging among streams
  4. From Decentralized to Centralized

    • Cloud Center of Excellence has 10 members (DevOps & DBOps)
    • Compliance, governance, standards, operations and engineering … for the entire Swiss Life public cloud landscape
    • Build a self-service approach for provisioning cloud infrastructure
    • Everything is open, contribution is welcome
  5. Chapter 2: How it works

    Tech Stack, The New Way of Working, Processes & Standards
  6. How it works: Tech Stack

    • Microsoft Azure: Primary cloud platform that hosts workloads across compute, database and supporting services
    • Terraform: The main Infrastructure as Code language used to manage Azure resources
    • Azure DevOps: Code is stored in Azure Repos; deployments are handled with Azure Pipelines
    • GitHub Copilot: A friend in need
    • VS Code Pets Extension
  7. How it works: The New Way of Working

    • Live Infrastructure: Terraform code mapping of real Azure resources using modules
    • Live Infrastructure Template: Base repository with examples, templates, standards and guidelines to understand and create live infrastructure
    • Automatic bootstrapping: The process that relies on the Live Infrastructure Template to automatically create the skeleton of a live infrastructure repository
    • Terraform Modules: Abstract packages for single-scoped deployments (Azure Storage Account, Landing Zone Networking etc.)
    • Terraform Module Template: Base repository with examples, templates, standards and guidelines to understand and create modules
    • Multi-repo approach: Each Terraform Module lies in its own repository
  8. How it works: Development Process

    Trunking & Versioning
    • Trunk-based development with short-lived branches
    • Squash commits only to master
    • Tags according to the semantic versioning system
    Conventions & Standards
    • Feature branches and repositories naming convention
    • Specific names and structure for files and local Terraform resources
    CHANGELOG & README
    • CHANGELOG.md: The summary of changes, work item link, date and tag version
    • README.md: Technical details and deviation from standards
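The tagging convention on this slide can be illustrated with a minimal Python sketch. It assumes tags of the form `MAJOR.MINOR.PATCH` with an optional `v` prefix; the function name `bump` and the tag format are illustrative assumptions, not something the slides specify.

```python
import re

# Assumed tag format: "1.4.2" or "v1.4.2" (the slides only say "semantic versioning").
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def bump(tag: str, part: str) -> str:
    """Return the next tag after bumping the given part (hypothetical helper)."""
    match = SEMVER.match(tag)
    if match is None:
        raise ValueError(f"not a semantic version tag: {tag!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":      # breaking change: reset minor and patch
        major, minor, patch = major + 1, 0, 0
    elif part == "minor":    # backwards-compatible feature: reset patch
        minor, patch = minor + 1, 0
    elif part == "patch":    # backwards-compatible fix
        patch += 1
    else:
        raise ValueError(f"unknown version part: {part!r}")
    prefix = "v" if tag.startswith("v") else ""
    return f"{prefix}{major}.{minor}.{patch}"
```

A module consumer pinned to a given major version could then pick up `minor` and `patch` tags without expecting breaking changes, which is the usual reason for tagging modules this way.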
  9. How it works: Development Process

    Trunking & Versioning
    • Trunk-based development with short-lived branches
    • Squash commits only to master
    • Tags according to the semantic versioning system
    Conventions & Standards
    • Feature branches and repositories naming convention
    • Specific names and structure for files and local Terraform resources
    Granular Permissions
    • Each department can contribute to Terraform Modules and their Live Infrastructure repos & pipelines
    • Reader SPN for terraform plan
    • Contributor SPN for terraform apply
    Remote State Files
    • The state of each workload is stored in a blob container deployed in the target Azure Subscription
    • Backend SPN for access to state files
    Pull Requests
    • PR templates
    • Change version tags
    • Build validations for end-to-end testing
    • Security involved
    CHANGELOG & README
    • CHANGELOG.md: The summary of changes, work item link, date and tag version
    • README.md: Technical details and deviation from standards
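The split of service principals (SPNs) described on this slide can be sketched as a small lookup in Python. The SPN names are placeholders, and the pairing of the Backend SPN with both commands is an assumption (both `terraform plan` and `terraform apply` need to read the state); the slides do not show how the identities are actually wired into the pipelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServicePrincipal:
    name: str  # hypothetical display name, not taken from the slides
    role: str  # what the identity is used for


BACKEND = ServicePrincipal("backend-spn", "access to remote state files")
READER = ServicePrincipal("reader-spn", "read-only access for terraform plan")
CONTRIBUTOR = ServicePrincipal("contributor-spn", "write access for terraform apply")


def identities_for(command: str) -> list[ServicePrincipal]:
    """Return the identities a pipeline step would need for a Terraform command."""
    if command == "plan":
        # Assumption: plan reads state (Backend SPN) and reads resources (Reader SPN).
        return [BACKEND, READER]
    if command == "apply":
        # Assumption: apply reads/writes state and changes resources.
        return [BACKEND, CONTRIBUTOR]
    raise ValueError(f"unsupported Terraform command: {command!r}")
```

The point of the design is least privilege: a pull request build that only runs a plan never holds credentials that could change resources.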
  10. How it works Deployment Process Terraform Plan & Apply Terraform

    Plan Terraform Plan Push Terraform Remotely D Q Pull Request Merge to Master Tag the new version Terraform Apply Terraform Apply Terraform Apply Q P D PR Pipeline Deployment Pipeline P auto
  11. Chapter 3: Stats for geeks

    Numbers, Numbers, Numbers
  12. Stats for geeks

    • 4500 Pipeline runs
    • 500 Pull Requests
    • 70 Subscriptions
    • > 100 Repositories
    • 60000 Lines of code
    • > 200 Pipelines
    • 6000 Azure Services
  13. Chapter 4: Let’s get serious

    AKS Deep Dive
  14. AKS Environment: Setup

    • Simplified deployments and operations: One reusable end-to-end solution contained within the AKS Terraform Module
    • Stateless workloads: Stateless workloads can easily survive cluster delete and recreation operations
    • Separate AKS clusters for non-standards: 3rd-party software or deployments which require further isolation are deployed on separate AKS clusters
    • Linux Containers only: All system and user node pools are built on Linux
    • One internal shared AKS Cluster: Our offer includes a single AKS cluster for all applications across the organization, with isolation achieved with namespaces and network policies. Further isolation is achieved with separate node pools
    • GitOps Deployment Model: A standard way to deploy infrastructure and applications using ArgoCD
  15. AKS Environment: Terraform AKS Module

    -------------------------------------------------------------------------------
    Language                     files          blank        comment           code
    -------------------------------------------------------------------------------
    HCL                             47             76             27           2120
    YAML                            15             10              2            737
    Markdown                         6             77              6            726
    -------------------------------------------------------------------------------
    SUM:                            68            163             35           3583
    -------------------------------------------------------------------------------
  16. AKS Environment: Terraform AKS Live Infrastructure

    -------------------------------------------------------------------------------
    Language                     files          blank        comment           code
    -------------------------------------------------------------------------------
    HCL                             29             17             27            655
    YAML                             2             16              9            210
    Markdown                         1              1              0              5
    -------------------------------------------------------------------------------
    SUM:                            32             34             36            870
    -------------------------------------------------------------------------------
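The two line-count summaries above can be cross-checked with a few lines of Python. The figures are copied from slides 15 and 16; the dictionary layout and the helper names `column_totals` and `hcl_share` are illustrative assumptions.

```python
# Per-language rows as (files, blank, comment, code), copied from slides 15 and 16.
CLOC = {
    "Terraform AKS Module": {
        "rows": {
            "HCL": (47, 76, 27, 2120),
            "YAML": (15, 10, 2, 737),
            "Markdown": (6, 77, 6, 726),
        },
        "sum": (68, 163, 35, 3583),
    },
    "Terraform AKS Live Infrastructure": {
        "rows": {
            "HCL": (29, 17, 27, 655),
            "YAML": (2, 16, 9, 210),
            "Markdown": (1, 1, 0, 5),
        },
        "sum": (32, 34, 36, 870),
    },
}


def column_totals(rows: dict[str, tuple[int, int, int, int]]) -> tuple[int, ...]:
    """Add up each column (files, blank, comment, code) across all languages."""
    return tuple(sum(column) for column in zip(*rows.values()))


def hcl_share(rows: dict[str, tuple[int, int, int, int]]) -> float:
    """Fraction of code lines that are HCL (index 3 is the code column)."""
    return rows["HCL"][3] / sum(row[3] for row in rows.values())


for name, report in CLOC.items():
    assert column_totals(report["rows"]) == report["sum"], name
    print(f"{name}: {hcl_share(report['rows']):.0%} of code lines are HCL")
```

Both SUM rows agree with their per-language rows, and the comparison shows that the live infrastructure repository is roughly a quarter the size of the module it consumes (870 versus 3583 lines of code).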
  17. AKS Deep Dive: Cluster Setup

    • Entra ID: Authentication for cluster RBAC, but also Workload identities to authorize pods with external services
    • Fully Private Cluster
    • Saving costs with Spot instances: For non-production clusters, Spot instances are being used
    • The cluster is always ZRS; minimum 3 nodes in a pool
    • “Stateless” Nodes: Nodes with Ephemeral drives
    • Monitoring with Container Insights: Monitoring the cluster with built-in Container Insights
    • Calico Network Policy Plugin
    • Azure CNI Overlay
  18. AKS Deep Dive: Let’s go deeper

    • ArgoCD: Optional, enabled by default
    • kured: Required
    • cert-manager: Optional, disabled by default
    • Certificates: Required
    • kyverno: Optional, enabled by default
    • Default Network Policy: Optional, enabled by default
    • sealed-secrets: Optional, enabled by default
    • ingress-nginx: Required
    • storage-class: Required
    • keda: Optional, enabled by default
    • velero: Required
  19. AKS Deep Dive: Application deployment process

    Terraform Workflow
    • Deployment of Namespaces, Managed Identity and Default Network Policy
    • Additional Resources
    • Review and Approval of Infrastructure changes by CCoE
    • Publish values-infra.yaml into ArgoCD state repo
    Developer Workflow
    • Application Release trigger with tag
    • Build Server builds the code and publishes the image
    • Encrypt Configuration
    • Publish values-app.yaml and application-definition.yaml into ArgoCD state repository
    ArgoCD Workflow
    • ArgoCD monitors state repository
    • ArgoCD initiates deployment of current state
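The idea behind the ArgoCD workflow, comparing the state repository with what is running and acting on the difference, can be sketched as a toy reconciliation function in Python. This is not ArgoCD's implementation; the application names, the version strings and the `reconcile` helper are all hypothetical.

```python
def reconcile(desired: dict[str, str], live: dict[str, str]) -> list[tuple[str, str]]:
    """Return the actions needed to make the live state match the desired state.

    Both arguments map an application name to a version (toy model only).
    """
    actions: list[tuple[str, str]] = []
    for name, version in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))   # declared in the repo, not deployed yet
        elif live[name] != version:
            actions.append(("update", name))   # deployed, but at a different version
    for name in sorted(live.keys() - desired.keys()):
        actions.append(("delete", name))       # deployed, but no longer declared
    return actions


# Hypothetical example: the state repository declares two apps, the cluster runs two.
state_repo = {"app-a": "1.1.0", "app-b": "2.0.0"}
cluster = {"app-a": "1.0.0", "app-c": "0.3.0"}
print(reconcile(state_repo, cluster))
```

In this model the Terraform and developer workflows only ever write to the state repository; the cluster changes as a consequence of reconciliation rather than through direct deployments.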
  20. Chapter 5: Conclusions

    Lessons Learnt and Future Plans
  21. Conclusions: Lessons Learnt

    • No solution fits them all
    • Standards are efficient but hard to maintain
    • Vicious cycle – Ops vs Engineering
    • Do not underestimate the amount of time you will spend on Development
    • GitHub Copilot is very helpful, but don’t trust it
    • Do not blindly follow the recommendations
  22. Conclusions: Future Plans

    • Automatic Terraform Modules updates
    • Automatic testing with terraform test
    • Designing solutions
    • AKS Enhancements
    • Have whole Infrastructure landscape coded