Kubernetes Cluster State Management

Michael Hausenblas

September 08, 2017

Transcript

  1. $ whoami • Working on distributed systems in the past 20 years, containers some 4+ years • Web data & big data (research, MapR) • Containers and container orchestrators (Mesosphere, Red Hat) • Developer turned ops: C++, Java, Python, Node.js and since around 2014 a Gopher • @mhausenblas
  2. Kubernetes ops/dev use cases • Saving money • Troubleshooting • Auditing • Billing • Capacity planning • Upgrading • Restore • Disaster Recovery • Read more about use cases here.
  3. Some terms we’ll be using … • State (static vs. dynamic) • Artefacts (files, records, etc.) • Levels (system vs. app)
  4. The community view • kubernetes/kubernetes#24229: Backup/migrate cluster? • kubernetes/kubernetes#21582: Kubectl needs export and import commands • Two schools of thought: ◦ ‘Replay all from repo’ ◦ ‘Backups are necessary/useful’
  5. Available solutions • Initially, the only prod-ready solution was backup & restore with etcdctl • kubernetes-incubator/bootkube (control plane) • pieterlange/kube-backup (resource state sync to Git, inspired by RANCID) • heptio/ark: disaster recovery utility (cluster resources & persistent volumes) • kaptaind/kaptaind: intra-cluster sync for specific resources • ReShifter (more in a moment)
  6. Cluster state snapshots: levels of abstraction • raw etcd data (WAL, log snapshots) → etcdctl backup • etcd API → ReShifter • Kubernetes API server → Heptio Ark, kaptaind
  7. Challenges • Need to take care of app-level backups/restores separately • Which system-level cluster state should be recovered? • Multitenancy (for example: OpenShift Online) • Disaster Recovery: RTO/RPO • Low-level: encryption, access rights, etc.
  8. What is ReShifter? • A library: github.com/mhausenblas/reshifter/pkg • A CLI tool (rcli) • A Web app (K8S deployment + svc + UI)
  9. Where do we go from here? • Review use cases, evaluate solutions, provide feedback • Let me know if you’re interested in contributing • Maybe form a Kubernetes Incubator?
  10. Resources • Kubernetes Deep Dive: API Server – Part 2 https://blog.openshift.com/kubernetes-deep-dive-api-server-part-2/ • ReShifter: Architecture, design considerations and prior art https://github.com/mhausenblas/reshifter/blob/master/docs/architecture.md
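
Slide 6 places ReShifter at the etcd API level, between raw etcd data and the Kubernetes API server. A tool at that level sees plain etcd keys, so it has to map keys back to Kubernetes resources itself. The following is a minimal Python sketch of that mapping, not ReShifter's actual code. It assumes the API server's default /registry key prefix and a simplified layout of /registry/<resource>/<namespace>/<name> for namespaced objects and /registry/<resource>/<name> for cluster-scoped ones. The names ResourceKey, parse_registry_key and group_by_resource are hypothetical, and keys that do not fit the simplified layout (for example those under services/specs) are skipped rather than guessed at.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional

# Assumption: the API server uses its default etcd key prefix.
REGISTRY_PREFIX = "/registry"


class ResourceKey(NamedTuple):
    """Hypothetical record describing one Kubernetes object stored in etcd."""
    resource: str             # e.g. "pods", "namespaces"
    namespace: Optional[str]  # None for cluster-scoped resources
    name: str


def parse_registry_key(key: str, prefix: str = REGISTRY_PREFIX) -> Optional[ResourceKey]:
    """Map an etcd key to a ResourceKey, or None if it does not fit the
    simplified two- or three-segment layout assumed in this sketch."""
    if not key.startswith(prefix + "/"):
        return None
    parts = key[len(prefix) + 1:].split("/")
    if not all(parts):
        return None  # empty segment, e.g. a trailing slash
    if len(parts) == 2:
        return ResourceKey(parts[0], None, parts[1])
    if len(parts) == 3:
        return ResourceKey(parts[0], parts[1], parts[2])
    return None  # deeper layouts are out of scope here


def group_by_resource(keys: Iterable[str]) -> Dict[str, List[ResourceKey]]:
    """Group all recognised keys by resource type, ignoring the rest."""
    grouped: Dict[str, List[ResourceKey]] = {}
    for key in keys:
        parsed = parse_registry_key(key)
        if parsed is not None:
            grouped.setdefault(parsed.resource, []).append(parsed)
    return grouped


if __name__ == "__main__":
    sample = [
        "/registry/pods/default/nginx",
        "/registry/namespaces/kube-system",
        "/registry/services/specs/default/kubernetes",  # skipped: four segments
    ]
    for resource, entries in group_by_resource(sample).items():
        print(resource, [entry.name for entry in entries])
```

Working at this level keeps a backup independent of the API server's availability, at the cost of depending on the storage layout, which is one reason the deck lists tools at the API server level (Heptio Ark, kaptaind) as a separate option.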