high performance • easy to customize and extend • “Watch” object change • Decide next step based on state change • not edge driven (event), level driven (state)
LogCollector: read and redirect logs to storage • Request MEM: • App: 1G • LogCollector: 0.5G • Available MEM: • Node_A: 1.25G • Node_B: 2G • What happens if App is scheduled to Node_A first?
Can be taken away very quickly • “Merely” cause slowness when revoked • e.g. CPU • Non-compressible resources • Hold state • Are slower to be taken away • Can fail to be revoked • e.g. Memory, disk space Kubernetes (and Docker) can only handle CPU & Memory Don’t handle things like memory bandwidth, disk time, cache, network bandwidth, ... (yet)
be used, with a strong guarantee of availability • CPU (seconds/second), RAM (bytes) • Scheduler will not over-commit requests • Limit: max amount of a resource that can be used, regardless of guarantees • scheduler ignores limits • Mapping to Docker • —cpu-shares=requests.cpu • —cpu-quota=limits.cpu • —cpu-period=100ms • —memory=limits.memory
for all resources, all containers • limits == requests (if set) • Be killed until they exceed their limits • or if the system is under memory pressure and there are no lower priority containers that can be killed. • Burstable • requests is set for one or more resources, one or more containers • limits (if set) != requests • killed once they exceed their requests and no Best-Effort pods exist when system under memory pressure • Best-Effort • requests and limits are not set for all of the resources, all containers • First to get killed if the system runs out of memory
Set and Pods. • Check the status of a Deployment. • Update that Deployment (e.g. new image, labels). • Rollback to an earlier Deployment revision. • Pause and resume a Deployment.
• kubectl edit • open an editor and modify your deployment yaml • RollingUpdateStrategy • 1 max unavailable • 1 max surge • can also be percentage • Does not kill old Pods until a sufficient number of new Pods have come up • Does not create new Pods until a sufficient number of old Pods have been killed. trigger
Controller • Create: Replica Set (nginx-deployment-2035384211) and scaled it up to 3 replicas directly. • Update: • created a new Replica Set (nginx-deployment-1564180365) and scaled it up to 1 • scaled down the old Replica Set to 2 • continued scaling up and down the new and the old Replica Set, with the same rolling update strategy. • Finally, 3 available replicas in the new Replica Set, and the old Replica Set is scaled down to 0.
Name of metric • Type (Counter, Gauge, ...) • Data Type (int, float) • Units (kbps, seconds, count) • Polling Frequency • Regexps (Regular expressions to specify which metrics to collect and how to parse them) • The metric will be added to pod as ConfigMap volume Prometheus Nginx
or volume • The pod’s name • The pod’s namespace • The pod’s IP • A container’s cpu limit • A container’s cpu request • A container’s memory limit • A container’s memory request
• IPs route to one or more cluster nodes (e.g. floating IP) • Use external LoadBalancer • Require support from IaaS (GCE, AWS, OpenStack) • Deploy a service-loadbalancer (e.g. HAproxy) • Official guide: https://github.com/kubernetes/contrib/tree/master/service-loadbalancer
Deployed as a Pod on dedicated Node (with external network) • Implementation • Nginx, HAproxy, GCE L7 • External access for service • SSL support for service • … s1 http://foo.bar.com <IP_of_Ingress_node> http://foo.bar.com/foo
Stable hostname • Stable storage • linked to the ordinal & hostname • Databases like MySQL or PostgreSQL • single instance attached to a persistent volume at any time • Clustered software like Zookeeper, Etcd, or Elasticsearch, Cassandra • stable membership. Update StatefulSet: Scale: create/delete one by one Scale in: will not delete old persistent volume
affiliate containers • Not all containers need independent network • Network implementation for pod is totally the same as for single container Pod Infra container Container A Container B --net=container:pause /proc/{pid}/ns/net -> net:[4026532483]
etc • The kubelet cni flags: • --network-plugin=cni • --network-plugin-dir=/etc/cni/net.d • CNI is very simple 1.Kubelet creates a network namespace for Pod 2.Kubelet invokes CNI plugin to configure the NS (interface name, IP, MAC, gateway, bridge name …) 3.Infra container in Pod join this network namespace
come from? • A: Kubernetes = “Borg” + “Container” • Kubernetes is a set of methodology for using containers based on past 10+ yr’s exp in Google Inc. • “不不要摸着⽯石头过河” • Kubernetes is a container centric DevOps/Workload orchestration system • Not a “CI/CD”, “Micro-service” focused container cloud