Meetup Camptocamp: Exoscale SKS

Scalable Kubernetes Service 03-2021

Outline
• Intro
• Software at Exoscale
• Kubernetes at Exoscale
• Challenges met
• SKS: Scalable Kubernetes Service

Intro and context

Exoscale in a nutshell
• Infrastructure as a Service, 6 zones throughout Europe
• Now part of A1 group
• Public cloud in Geneva since 2013

The product

Software at Exoscale

What’s in a cloud provider?
• Datacenter & Network Operations
• Security
• Automation
• DBA
• Software development

The software we write
• Object Storage controller
• Internal SDN
• Compute orchestrator
• Load-balancer orchestrator
• Kubernetes orchestrator
• Web portal
• Customer Management
• Usage Metering
• Billing
• Integration tooling (CLI, Terraform provider, …)
• Command and control, automation support

Things that didn’t exist in 2012
• Ansible
• Terraform
• Docker
• Kubernetes
• Wifi
• Television

Initial stack
• Puppet for configuration management, in-house command and control
• 5 large external facing services, databases, a number of batch processing tools
• VM profiles per role, horizontal scaling where possible

Why container orchestration then?
• Puppet becomes a hot spot of activity
  ◦ Hard to convey the entire infrastructure needs of an application in one place
  ◦ Configuration scattered across different places (load-balancing, firewalling, software, monitoring)
• Always making allocation decisions: “on what class of machines should this run?”
• Overall low utilization, but contention during peaks!
• Long MTTR for failed nodes

Kubernetes at Exoscale

Initial exploration
• Strong interest in Apache Mesos (not tied to Docker, distributed systems building toolbox)
• Witnessed the fast adoption of Kubernetes
• Swarm and Nomad didn’t fit the bill for a number of reasons

Going for Kubernetes
• Traction
• The kicker was the open-ended abstractions: Service, Ingress, CRI, CNI, CSI
  ◦ These allow providers to step in and provide a best-in-class implementation of the abstraction
  ◦ The abstraction allows for a much better shot at expressing infrastructure independently of the location
• We decided to start with our API gateway
  ◦ One of the most active projects at the time
  ◦ Extremely sensitive to disruption

Challenges met

Keeping our promises in a containerized world
• Config management
  ◦ Now next to the application: huge progress
  ◦ Added internal tooling to generate manifests
• Deployments
  ◦ Registries vs. Debian repositories
  ◦ ArgoCD for managing deployments
• Security
  ◦ Network and security policies
  ◦ OPA (WIP)

Container networking
• Network used to be boring
  ◦ A public IP per VM
  ◦ Security groups to provide isolation
• Exoscale private networks not ready for CNI
• Performance analysis led to the use of Calico

SKS: Scalable Kubernetes Service

Redux of what we learnt
• Network
  ◦ Calico
• Security
  ◦ Several certificate authorities per cluster
  ◦ Encryption key for secrets, per cluster
  ◦ WireGuard available on the template
  ◦ Cluster access using certificates (support for users, groups, TTL)
• Exoscale Cloud Controller Manager
  ◦ To validate worker nodes
  ◦ Network Load Balancer integration
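
As an illustration of certificate-based access with users, groups and a TTL: Kubernetes X.509 client authentication takes the user name from the certificate's Common Name (CN) and the groups from its Organization (O) fields. The sketch below only mimics that mapping on a plain subject string. How SKS actually issues and validates certificates is not described in the slides, so the function names and the simplified parsing (no escaped commas) are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def identity_from_subject(subject):
    """Map a certificate subject string to (user, groups).

    Follows the Kubernetes convention: CN is the user, each O is a group.
    Simplified parsing: assumes no escaped commas in the values.
    """
    user = None
    groups = []
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key == "CN":
            user = value
        elif key == "O":
            groups.append(value)
    return user, groups


def is_within_ttl(issued_at, ttl, now):
    """Return True while the certificate's validity window is open."""
    return issued_at <= now < issued_at + ttl


# Hypothetical example values.
issued = datetime(2021, 3, 1, tzinfo=timezone.utc)
user, groups = identity_from_subject("CN=alice,O=dev,O=ops")
print(user, groups)
print(is_within_ttl(issued, timedelta(hours=12), issued + timedelta(hours=1)))
```

A short TTL limits how long a leaked client certificate stays usable, since Kubernetes has no built-in revocation for client certificates.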

Full integration in the Exoscale stack
• Network Load Balancer
  ◦ “LoadBalancer” Kubernetes services
  ◦ Configuration using annotations
• Instance Pools
  ◦ We rely on instance pools for the nodepools
  ◦ Same properties (nodes cycling…)
• Security groups (per nodepool)
• Anti-affinity groups (per nodepool)
• API and tooling
  ◦ CLI, Terraform
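
To make the “LoadBalancer” service and annotation points concrete, here is a minimal sketch that builds a Kubernetes Service manifest of type LoadBalancer as a plain dictionary and prints it as JSON (which kubectl accepts alongside YAML). The annotation key `example.com/load-balancer-name` is a hypothetical placeholder, not a real Exoscale annotation; the supported keys should be taken from the Exoscale Cloud Controller Manager documentation. The service name, selector and ports are also made up.

```python
import json


def load_balancer_service(name, selector, port, target_port, annotations=None):
    """Build a Service manifest of type LoadBalancer as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # Provider-specific settings are passed as annotations.
            "annotations": dict(annotations or {}),
        },
        "spec": {
            "type": "LoadBalancer",
            "selector": dict(selector),
            "ports": [
                {"port": port, "targetPort": target_port, "protocol": "TCP"}
            ],
        },
    }


manifest = load_balancer_service(
    name="web",
    selector={"app": "web"},
    port=80,
    target_port=8080,
    # Hypothetical placeholder key, used for illustration only.
    annotations={"example.com/load-balancer-name": "web-nlb"},
)
print(json.dumps(manifest, indent=2))
```

Saved to a file, the output could be applied with `kubectl apply -f`; the cloud controller manager is then the component expected to react to the service and provision the external load balancer.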

Product objectives
• Speed
  ◦ Create clusters in ~100 seconds
  ◦ New nodes in the cluster in ~120 seconds (available in “kubectl get nodes”)
  ◦ Should be faster in the future
• Seamless start
• CNCF compliance
• Reliability: two offerings
  ◦ starter: no SLA, non-HA control plane, free
  ◦ pro: SLA, HA control plane

[Diagram 1, labels: Outside world; Exoscale Load Balancer; Kubernetes “LoadBalancer” Service; Kubernetes dashboard; Kubernetes Cluster]

[Diagram 2, labels: those of diagram 1, plus a second Exoscale Load Balancer, a second Kubernetes “LoadBalancer” Service, Nginx ingress controller, App, App]

Additional notes and future work

Advanced use cases
• Cluster lifecycle management
  ◦ Cluster upgrades (next patches, next minor)
• Certificate management
  ◦ You can retrieve various CA certificates in order to configure some components
• Multiple nodepools
  ◦ Each nodepool is independent
  ◦ Can have different disk sizes, offerings, anti-affinity groups, networking rules…
  ◦ Can be scaled independently

Ongoing work
• Cluster autoscaler (short-term)
  ◦ Automatically scale nodepools based on Kubernetes metrics
• Web portal (short-term)
• Blueprints (short-term)
  ◦ Example manifests for common things
• GPU nodepools
• More add-ons: dashboard, ingress, metrics-server
  ◦ metrics-server should arrive soon
• Persistent volumes: specific add-on
• Automatic security group management
• Managed container registry
• Advanced IAM integration
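
The idea of scaling a nodepool from cluster metrics can be sketched as a sizing rule: given the total CPU requested by workloads and the CPU of one node, pick the smallest node count that keeps utilization under a target, clamped to the nodepool's bounds. This is a deliberately simplified illustration, not the algorithm used by the Kubernetes cluster autoscaler or by SKS; the function name, parameters and the 80% default are assumptions.

```python
import math


def desired_nodepool_size(requested_cpu_millis, node_cpu_millis,
                          min_size, max_size, target_utilization=0.8):
    """Return the node count needed to serve the requested CPU.

    All CPU values are in millicores. The result is clamped to
    [min_size, max_size].
    """
    if node_cpu_millis <= 0:
        raise ValueError("node_cpu_millis must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    usable_per_node = node_cpu_millis * target_utilization
    wanted = math.ceil(requested_cpu_millis / usable_per_node)
    return max(min_size, min(max_size, wanted))


# Hypothetical example: 7 CPUs requested, 2-CPU nodes, pool of 1 to 10 nodes.
print(desired_nodepool_size(7000, 2000, min_size=1, max_size=10))
```

Because each nodepool is scaled independently, a rule like this would be evaluated per nodepool with that pool's own node size and bounds.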

Pierre-Yves Ritschard

Demo!
