Liz Rice
April 03, 2024

Simplifying multi-cloud and multi-cluster Kubernetes deployments with Cilium

Multi-cloud, multi-cluster Kubernetes deployments are used for high availability and global distribution, to take advantage of different cloud vendor features, or to span both on-prem and public clouds. But sharing workloads in these distributed environments doesn’t have to be complicated! This talk uses live demos to introduce Cilium’s ClusterMesh capabilities, which make it easy to connect and secure workloads distributed across clouds and clusters.

* Securely connecting multiple Kubernetes clusters
* Distributing services across them
* Load balancing and service affinity
* Applying network policies across multiple clusters
* Exposing distributed services to external traffic

You’ll also learn about the requirements for the underlying internet connectivity between clusters, with an overview of IP address management considerations. You’ll need a basic familiarity with Kubernetes concepts like pods, services, nodes and clusters to get the most out of attending this talk.

Transcript

  1. Simplifying Multi-Cloud and Multi-Cluster Deployments with Cilium
    Liz Rice | @lizrice
    Chief Open Source Officer, Isovalent
    Emeritus Chair, CNCF Technical Oversight Committee | CNCF & OpenUK boards
  2. Connectivity between clusters
    Encrypted VPN tunnel(s)
    Cilium only cares whether addresses are reachable
    Unique IP addresses for nodes and pods (non-overlapping pod CIDRs)
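
The non-overlapping requirement on slide 2 can be checked before connecting clusters. The following is a minimal sketch using Python's standard `ipaddress` module. The two pod CIDRs are hypothetical values, chosen only to be consistent with the pod IPs shown later in the deck (10.6.x.x on EKS, 10.172.0.x on GKE); the deck does not state the actual ranges.

```python
import ipaddress
from itertools import combinations

# Hypothetical pod CIDRs (assumption): picked to match the pod IPs that
# appear later in the deck, not taken from the clusters themselves.
POD_CIDRS = {
    "liz-cm-eks-6": "10.6.0.0/16",
    "liz-cm-gke-5": "10.172.0.0/16",
}


def overlapping_pairs(cidrs):
    """Return the pairs of cluster names whose CIDRs overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


# An empty list means every cluster has its own, distinct pod range.
print(overlapping_pairs(POD_CIDRS))
```

Any pair reported by `overlapping_pairs` would need re-addressing before the clusters could be meshed.
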
  3. Route tables in AWS
    Addresses in GCP VPC
    Local subnet addresses in AWS VPC
    Internet
  4. Route tables in GCP
    Local subnet addresses in GCP VPC - pod & service CIDRs
    Addresses in AWS VPC
    Internet
  5. Traffic has to be permitted
    Firewall rules / security groups / network ACLs have to allow traffic from remote cluster
  6. Unique cluster names and IDs
    eks ❯ cilium config view | grep cluster-
    cluster-id                  6
    cluster-name                liz-cm-eks-6
    gke ❯ cilium config view | grep cluster-
    cluster-id                  5
    cluster-name                liz-cm-gke-5
  7. Same routing mode
    eks ❯ cilium config view | grep routing
    ipv4-native-routing-cidr    10.0.0.0/8
    routing-mode                native
    gke ❯ cilium config view | grep routing
    ipv4-native-routing-cidr    10.0.0.0/8
    routing-mode                native
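
The two prerequisites on slides 6 and 7 (unique cluster names and IDs, same routing mode) lend themselves to a small pre-flight check. The sketch below assumes the `key value` line layout shown on the slides; `parse_config` and `mesh_problems` are hypothetical helper names, not part of the Cilium CLI.

```python
def parse_config(text):
    """Parse 'key value' lines, in the layout shown on the slides, into a dict."""
    config = {}
    for line in text.strip().splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            config[parts[0]] = parts[1].strip()
    return config


def mesh_problems(a, b):
    """List the prerequisite violations between two cluster configs."""
    problems = []
    for key in ("cluster-id", "cluster-name"):
        if a.get(key) == b.get(key):
            problems.append(f"{key} must be unique, both clusters use {a.get(key)!r}")
    if a.get("routing-mode") != b.get("routing-mode"):
        problems.append("routing-mode differs between clusters")
    return problems


# Values copied from slides 6 and 7.
EKS = """
cluster-id 6
cluster-name liz-cm-eks-6
ipv4-native-routing-cidr 10.0.0.0/8
routing-mode native
"""
GKE = """
cluster-id 5
cluster-name liz-cm-gke-5
ipv4-native-routing-cidr 10.0.0.0/8
routing-mode native
"""

# An empty list means both prerequisites hold.
print(mesh_problems(parse_config(EKS), parse_config(GKE)))
```
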
  8. Enable Cluster Mesh
    ❯ cilium clustermesh enable
    ❯ cilium status
        /¯¯\
     /¯¯\__/¯¯\    Cilium:             OK
     \__/¯¯\__/    Operator:           OK
     /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
     \__/¯¯\__/    Hubble Relay:       OK
        \__/       ClusterMesh:        OK
    Deployment    cilium-operator          Desired: 1, Ready: 1/1, Available: 1/1
    Deployment    hubble-relay             Desired: 1, Ready: 1/1, Available: 1/1
    DaemonSet     cilium                   Desired: 1, Ready: 1/1, Available: 1/1
    Deployment    clustermesh-apiserver    Desired: 1, Ready: 1/1, Available: 1/1
    Containers:   cilium                   Running: 1
                  cilium-operator          Running: 1
                  hubble-relay             Running: 1
                  clustermesh-apiserver    Running: 1
    Cluster Pods: 12/12 managed by Cilium
    ...
  9. Enable Cluster Mesh
    ❯ cilium clustermesh status
    ✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
    ✅ Cluster access information is available:
      - 10.5.0.6:2379
    ✅ Deployment clustermesh-apiserver is ready
    🔌 No cluster connected
    🔀 Global services: [ min:0 / avg:0.0 / max:0 ]
    ❯ ks get svc
    NAME                            TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)
    clustermesh-apiserver           LoadBalancer   10.176.9.226   10.5.0.6      2379:31144/TCP
    clustermesh-apiserver-metrics   ClusterIP      None           <none>        9962/TCP,9964/TCP,9963/TCP
  10. Connect clusters
    eks ❯ cilium clustermesh connect --destination-context $CLUSTER5
    ✅ Detected Helm release with Cilium version 1.15.0
    ✨ Extracting access information of cluster liz-cm-gke-5...
    🔑 Extracting secrets from cluster liz-cm-gke-5...
    ℹ Found ClusterMesh service IPs: [10.5.0.6]
    ✨ Extracting access information of cluster liz-cm-eks-6...
    🔑 Extracting secrets from cluster liz-cm-eks-6...
    Hostname based ingress detected, trying to resolve it
    Hostname resolved, using the found ip(s)
    ℹ Found ClusterMesh service IPs: [10.6.13.164 10.6.142.91]
    ℹ Configuring Cilium in cluster 'liz-cm-eks-6.eu-west-1.eksctl.io' to connect to cluster 'gke_cilium-de
    ℹ Configuring Cilium in cluster 'gke_cilium-demo_europe-west1_liz-cm-gke-5' to connect to cluster 'liz-
    ✅ Connected cluster liz-cm-eks-6.eu-west-1.eksctl.io and gke_cilium-demo_europe-west1_liz-cm-gke-5!
  11. Connected cluster status
    eks ❯ cilium clustermesh status
    Hostname based ingress detected, trying to resolve it
    Hostname resolved, using the found ip(s)
    ✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
    ✅ Cluster access information is available:
      - 10.6.142.91:2379
      - 10.6.13.164:2379
    ✅ Deployment clustermesh-apiserver is ready
    ✅ All 1 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
    🔌 Cluster Connections:
      - liz-cm-gke-5: 1/1 configured, 1/1 connected
    🔀 Global services: [ min:1 / avg:1.0 / max:1 ]
    gke ❯ cilium clustermesh status
    ✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
    ✅ Cluster access information is available:
      - 10.5.0.6:2379
    ✅ Deployment clustermesh-apiserver is ready
    ✅ All 1 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
    🔌 Cluster Connections:
      - liz-cm-eks-6: 1/1 configured, 1/1 connected
    🔀 Global services: [ min:1 / avg:1.0 / max:1 ]
  12. Global service annotation
    eks ❯ k describe svc rebel-base
    Name:              rebel-base
    Namespace:         farfaraway
    Labels:            <none>
    Annotations:       service.cilium.io/global: true
    Selector:          name=rebel-base
    Type:              ClusterIP
    IP Family Policy:  SingleStack
    IP Families:       IPv4
    IP:                172.20.135.230
    IPs:               172.20.135.230
    Port:              <unset>  80/TCP
    TargetPort:        80/TCP
    Endpoints:         10.6.1.60:80,10.6.9.41:80
    Session Affinity:  None
    Events:            <none>
  13. Global service annotation
    gke ❯ for i in {1..10}
    do k exec -it xwing -- curl rebel-base
    {"Cluster": "EKS-6", "Location", "Dantooine"}
    {"Cluster": "EKS-6", "Location", "Dantooine"}
    {"Cluster": "EKS-6", "Location", "Dantooine"}
    {"Cluster", "GKE-5", "Location": "Alderaan"}
    {"Cluster": "EKS-6", "Location", "Dantooine"}
    {"Cluster", "GKE-5", "Location": "Alderaan"}
    {"Cluster": "EKS-6", "Location", "Dantooine"}
    {"Cluster", "GKE-5", "Location": "Alderaan"}
    {"Cluster": "EKS-6", "Location", "Dantooine"}
    {"Cluster": "EKS-6", "Location", "Dantooine"}
    {"Cluster": "EKS-6", "Location", "Dantooine"}
  14. Global service annotation
    eks ❯ k describe svc rebel-base
    Name:              rebel-base
    Namespace:         farfaraway
    Labels:            <none>
    Annotations:       service.cilium.io/global: true
    Selector:          name=rebel-base
    Type:              ClusterIP
    IP Family Policy:  SingleStack
    IP Families:       IPv4
    IP:                172.20.135.230
    IPs:               172.20.135.230
    Port:              <unset>  80/TCP
    TargetPort:        80/TCP
    Endpoints:         10.6.1.60:80,10.6.9.41:80
    Session Affinity:  None
    Events:            <none>
  15. Kubernetes view of service endpoints
    eks ❯ k get endpoints
    NAME         ENDPOINTS                   AGE
    rebel-base   10.6.1.60:80,10.6.9.41:80   5d18h
    eks ❯ k get pods -o wide
    NAME                         READY   STATUS    RESTARTS   AGE   IP          NODE
    rebel-base-67fbdbbcb-4fpbj   1/1     Running   0          26h   10.6.1.60   ip-10-6-6-51.eu-west-1.co
    rebel-base-67fbdbbcb-6bpt4   1/1     Running   0          26h   10.6.9.41   ip-10-6-6-51.eu-west-1.co
  16. Cilium view of service endpoints
    eks ❯ k get svc rebel-base -o wide
    NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE     SELECTOR
    rebel-base   ClusterIP   172.20.135.230   <none>        80/TCP    5d18h   name=rebel-base
    eks ❯ ks exec -it $CPOD -- cilium-dbg service list
    ID   Frontend            Service Type   Backend
    ...
    6    172.20.135.230:80   ClusterIP      1 => 10.6.9.41:80 (active)
                                            2 => 10.6.1.60:80 (active)
                                            3 => 10.172.0.99:80 (active)
                                            4 => 10.172.0.23:80 (active)
    gke ❯ k get pods -o wide
    NAME                         READY   STATUS    RESTARTS   AGE     IP            NODE
    rebel-base-6cf4b8c8b-5r2zq   1/1     Running   0          2d15h   10.172.0.99   gke-liz-cm-gke-5-liz-ng
    rebel-base-6cf4b8c8b-thh2g   1/1     Running   0          2d15h   10.172.0.23   gke-liz-cm-gke-5-liz-ng
    Non-overlapping pod CIDRs
  17. Global service annotation
    eks ❯ k describe svc rebel-base
    Name:              rebel-base
    Namespace:         farfaraway
    Labels:            <none>
    Annotations:       service.cilium.io/global: true
    Selector:          name=rebel-base
    Type:              ClusterIP
    IP Family Policy:  SingleStack
    IP Families:       IPv4
    IP:                172.20.135.230
    IPs:               172.20.135.230
    Port:              <unset>  80/TCP
    TargetPort:        80/TCP
    Endpoints:         10.6.1.60:80,10.6.9.41:80
    Session Affinity:  None
    Events:            <none>
  18. Policies can specify clusters
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "ingress-to-rebel-base"
      namespace: "farfaraway"
    spec:
      description: "Only allow local xwing to contact rebel-base"
      endpointSelector:
        matchLabels:
          name: rebel-base
      ingress:
      - fromEndpoints:
        - matchLabels:
            class: xwing
            io.cilium.k8s.policy.cluster: liz-cm-eks-6
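
To illustrate why the policy on slide 18 admits only the local xwing, here is a simplified model of `matchLabels` evaluation: a source endpoint matches when every key/value pair in the selector is present in its labels. This is a sketch of the idea, not Cilium's selector implementation. The labels of the GKE xwing are an assumption (the deck only shows the EKS pod's labels), and the `matches` helper is hypothetical.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# The fromEndpoints selector from the policy on slide 18.
FROM_ENDPOINTS = {
    "class": "xwing",
    "io.cilium.k8s.policy.cluster": "liz-cm-eks-6",
}

# Labels of the EKS xwing, as listed on slide 19 (subset).
xwing_eks = {
    "class": "xwing",
    "org": "rebel-alliance",
    "io.cilium.k8s.policy.cluster": "liz-cm-eks-6",
}

# Assumed labels of the GKE xwing: same pod labels, different cluster label.
xwing_gke = dict(xwing_eks, **{"io.cilium.k8s.policy.cluster": "liz-cm-gke-5"})

print("eks xwing allowed:", matches(FROM_ENDPOINTS, xwing_eks))
print("gke xwing allowed:", matches(FROM_ENDPOINTS, xwing_gke))
```

Dropping the `io.cilium.k8s.policy.cluster` entry from the selector would make both xwing pods match in this model, which is the difference the cluster label is there to express.
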
  19. Implicit label
    eks ❯ k get pods --show-labels
    NAME    READY   STATUS    RESTARTS   AGE     LABELS
    ...
    xwing   1/1     Running   0          5d22h   app.kubernetes.io/name=xwing,class=xwing,org=rebel-alliance
    eks ❯ ks exec -it $CPOD -- cilium endpoint list
    ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])
    ...
    934        Disabled           Disabled          448568     k8s:app.kubernetes.io/name=xwing
                                                               k8s:class=xwing
                                                               k8s:io.cilium.k8s.namespace.labels.kubernetes.io
                                                               k8s:io.cilium.k8s.policy.cluster=liz-cm-eks-6
                                                               k8s:io.cilium.k8s.policy.serviceaccount=default
                                                               k8s:io.kubernetes.pod.namespace=farfaraway
                                                               k8s:org=rebel-alliance
    3101       Enabled            Disabled          403136     k8s:io.cilium.k8s.namespace.labels.kubernetes.io
                                                               k8s:io.cilium.k8s.policy.cluster=liz-cm-eks-6
                                                               k8s:io.cilium.k8s.policy.serviceaccount=default
                                                               k8s:io.kubernetes.pod.namespace=farfaraway
                                                               k8s:name=rebel-base
  20. Policy enforced on cluster where it is applied
    eks ❯ hubble observe --from-namespace farfaraway --to-namespace farfaraway --type policy-verdict
    Mar 18 13:32:11.321: farfaraway/xwing:59200 (ID:448568) -> farfaraway/rebel-base-67fbdbbcb-6bpt4:80 (ID:403136) policy-verdict:L3-Only INGRESS ALLOWED (TCP Flags:
    Mar 18 13:32:18.796: farfaraway/xwing:50208 (ID:373826) <> farfaraway/rebel-base-67fbdbbcb-4fpbj:80 (ID:403136) policy-verdict:none INGRESS DENIED (TCP Flags: SYN)
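
The two xwing flows on slide 20 carry different identity numbers (448568 and 373826). Assuming the default layout, in which the low 16 bits of an identity hold the per-cluster identity and the bits above them hold the cluster ID (an inference consistent with the default row on slide 21), a small sketch can recover which cluster each flow came from. `cluster_id_of` is a hypothetical helper name.

```python
# Assumption: default layout with 16 bits for the per-cluster identity,
# matching the "255 (default) / 65535" row on slide 21.
IDENTITY_BITS = 16


def cluster_id_of(identity):
    """Extract the cluster ID from the bits above the per-cluster identity."""
    return identity >> IDENTITY_BITS


# Identity numbers copied from the Hubble output on slide 20.
for identity in (448568, 373826, 403136):
    print(identity, "-> cluster", cluster_id_of(identity))
```

Under this assumption, the allowed flow (448568) and rebel-base (403136) both decode to cluster 6, the EKS cluster, while the denied flow (373826) decodes to cluster 5, the GKE cluster, which agrees with the cluster IDs on slide 6.
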
  21. Tradeoff max clusters ↔ max identities
    eks ❯ cilium config view
    ...
    max-connected-clusters    255
    Max connected clusters    Max security identities per cluster
    255 (default)             65535
    511                       32767
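
Both rows of the tradeoff table on slide 21 are consistent with a fixed budget of 24 bits shared between the cluster ID and the per-cluster identity. That budget is an inference from the numbers on the slide, and the sketch below simply reproduces the arithmetic under that assumption; `max_identities` is a hypothetical helper name.

```python
# Assumption: cluster ID bits plus per-cluster identity bits add up to 24.
TOTAL_BITS = 24


def max_identities(max_connected_clusters):
    """Identities left per cluster once the cluster ID bits are reserved."""
    cluster_bits = max_connected_clusters.bit_length()
    return (1 << (TOTAL_BITS - cluster_bits)) - 1


# The two settings listed on slide 21.
for clusters in (255, 511):
    print(clusters, "clusters ->", max_identities(clusters), "identities per cluster")
```

Each extra bit spent on cluster IDs roughly doubles the number of connectable clusters and halves the identities available in each one.
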