20250429 - CNTUG Meetup #67 / DevOps Taiwan Meetup #69 - Deep Dive into Tetragon: Building Runtime Security and Observability with eBPF

ChengHao Yang

April 29, 2025

Transcript

  1. ChengHao Yang (tico88612) @ CNTUG Meetup #67 / DevOps Meetup #69

    Deep Dive into Tetragon: Building Runtime Security and Observability with eBPF
  2. $ whoami: ChengHao Yang (tico88612)

    • CNCF Ambassador
    • Kubestronaut
    • CNTUG meetup co-organizer
    • KCD Taipei 2025 lead organizer
    • Code Reviewer @ Kubespray
    • Release Signal Shadow @ Kubernetes v1.32 / v1.33
    • Kagamine Rin (鏡音リン) fan
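  3. Why this topic?

    1. We offered swag and promoted the call for speakers across communities, but still nobody submitted a talk.
    2. My company happened to want to research this topic, so I am sharing it with everyone along the way.
    3. We are swapping topics with DevOps Taiwan to promote both communities and their events!
    Submission link: https://sessionize.com/cntug-meetup/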
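  4. KCD Taipei 2025: Registration is open!

    • Date: 7/5 (Sat.)
    • Venue: International Building, National Taiwan University of Science and Technology (Taiwan Tech)
    • Besides the cloud native sessions, there will be an on-site fun competition brought back from overseas conferences: ClashLookBackOff!
    • Website: https://kcd.taipei/2025
    • Tickets: https://i.kcd.taipei/2025-ticket
    QR codes: call for papers link (Time out! QwQ), diversity scholarship application, official website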
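  5. CNTUG meetup: Monthly meetups + Call for Speakers

    • Cloud Native Taiwan User Group is a CNCF-recognized local community.
    • All events are free*, with limited seats by registration.
    • Submit a talk and deliver it, and we will give you a gift from the CNCF Store!
    • Official website: https://cloudnative.tw
    QR codes: call for speakers link, CNTUG meetup registration, official website
    * Only a venue fee is charged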
  6. Tetragon: eBPF-based Security Observability and Runtime Enforcement

    Tetragon detects and is able to react to security-significant events, such as:
    • Process execution events
    • System call activity
    • I/O activity including network & file access
    When used in a Kubernetes environment, Tetragon is Kubernetes-aware.
  7. 9

  8. eBPF: extended Berkeley Packet Filter

    • A sandboxing mechanism in the Linux kernel.
    • Runs custom code without kernel changes or reboots.
    • Low-overhead, high-performance, and non-intrusive visibility and control.
    • Used for observability, networking, tracing, and security.
  9. FAQ on Tetragon

    • Is Cilium required? Which Linux kernel version is required?
      • You don’t need Cilium. Kernel version 4.19 or greater (5.4, 5.10, 5.15 LTS)
    • Can Tetragon be used in standalone mode (outside of Kubernetes)?
      • Yes, both a container image and packages are available.
    • Does it run on macOS?
      • Yes…? But only inside a virtual machine (Lima, UTM, VirtualBox, Parallels Desktop)
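
The kernel requirement above can be checked before installing. The following is a minimal sketch, assuming the 4.19 minimum stated on the slide; the helper names `parse_kernel_release` and `kernel_is_supported` are hypothetical and not part of Tetragon. It only compares version numbers and does not verify any kernel configuration options.

```python
import platform
import re

MIN_KERNEL = (4, 19)  # minimum version stated on the slide


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-105-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_supported(release: str, minimum: tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True if the release is at least the minimum (major, minor)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # On non-Linux hosts (e.g. macOS) platform.release() is not a Linux
    # kernel version, so run this inside the VM or on the node itself.
    release = platform.release()
    print(release, "ok" if kernel_is_supported(release) else "too old")
```

Run it on the node (or inside the VM on macOS) rather than on the workstation, since the check reads the version of the kernel it executes on.
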
  10. 12

  11. Tetragon vs. Falco

    • Company: Tetragon: Isovalent / Falco: Sysdig
    • CNCF Project: Tetragon: Graduated (subproject of Cilium) / Falco: Graduated
    • Open Source: Tetragon: ✓ (Apache-2.0) / Falco: ✓ (Apache-2.0)
    • Stars: Tetragon: 3.9k / Falco: 7.8k
    • Enforcement: Tetragon: ✓ / Falco: ✗
    • Event Types: Tetragon: Process Exec, File Access, Network, etc. / Falco: Syscall-based, others extended by plugins
    • Plugins / Extensions: Tetragon: ✗ / Falco: ✓
    • Typical Use Cases: Tetragon: Fine-grained Observability, Zero Trust Enforcement, Runtime Security / Falco: Threat Detection, Alerting, Incident Response
    • CKS Exam: Tetragon: ✗ / Falco: ✓
  12. Process Lifecycle: ContainerCreating → Running → Terminating

    • Tetragon observes process creation and termination with the default config:
      • PROCESS_EXEC: binary, arguments, UID, parent process, etc.
      • PROCESS_EXIT: status code, signals on process exit, etc.
    • Gain full process lifecycle visibility to support incident investigations.
    • Track process lifecycle events to detect and investigate suspicious activity.
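
To make the two base events concrete, here is a small sketch that summarizes JSON-formatted events line by line. The sample events are hand-written for illustration, not captured output, and the field layout (`process_exec.process.binary`, `process_exit.signal`, and so on) is an assumption based on Tetragon's JSON export; check it against real events from your own cluster.

```python
import json

# Hand-written sample events (hypothetical values, assumed field layout).
SAMPLE_EVENTS = """\
{"process_exec": {"process": {"pid": 4242, "uid": 0, "binary": "/usr/bin/curl", "arguments": "https://example.com"}, "parent": {"pid": 4200, "binary": "/bin/bash"}}}
{"process_exit": {"process": {"pid": 4242, "binary": "/usr/bin/curl"}, "status": 0}}
{"process_exit": {"process": {"pid": 4300, "binary": "/usr/bin/sleep"}, "signal": "SIGKILL"}}
"""


def summarize(event: dict) -> str:
    """Render one event as a single human-readable line."""
    if "process_exec" in event:
        body = event["process_exec"]
        proc = body.get("process", {})
        parent = body.get("parent", {})
        return "EXEC pid={} uid={} binary={} args={!r} parent={}".format(
            proc.get("pid"), proc.get("uid"), proc.get("binary"),
            proc.get("arguments", ""), parent.get("binary"))
    if "process_exit" in event:
        body = event["process_exit"]
        proc = body.get("process", {})
        # Missing status is treated as 0 and missing signal as "-" (assumption).
        return "EXIT pid={} binary={} status={} signal={}".format(
            proc.get("pid"), proc.get("binary"),
            body.get("status", 0), body.get("signal") or "-")
    return "OTHER"


def summarize_lines(text: str) -> list[str]:
    """Summarize newline-delimited JSON events, skipping blank lines."""
    return [summarize(json.loads(line)) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for line in summarize_lines(SAMPLE_EVENTS):
        print(line)
```

Pairing an EXEC line with the EXIT line for the same pid gives the start, end, and exit reason of a process, which is the lifecycle view the slide describes.
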
  13. 16

  14. Tracing Policy: Tetragon's major feature

    • Hook points
    • Argument types
    • Selectors
    • Enforcement mode
      • monitoring: enforcement operations are elided
      • enforcement: enforcement operations are respected and performed
  15. Tracing Policy: Hook points

    • Kprobe → Kernel Probe
      • Monitors any kernel function activity
      • Tied to the kernel version; may not be portable across different kernels
    • Uprobe → User Probe
      • Use objdump, nm, or readelf to find the symbol of a function in a binary file
      • Tied to the binary version of the user-space program; may not be portable across different versions or architectures (e.g. amd64 or arm64)
  16. Tracing Policy: Hook points (cont.)

    • Tracepoint & Raw Tracepoint
      • Based on Linux kernel tracepoints (ftrace)
      • More stable than Kprobe (but more difficult to use)
    • LSM BPF (Linux Security Module)
      • Requires Linux kernel 5.7 or later
      • Requires adding bpf to the lsm= boot option (edit /etc/default/grub)
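
Whether the `lsm=bpf` boot option took effect can be verified after a reboot. This sketch assumes securityfs is mounted at its usual location, so the active LSMs are listed as a comma-separated string in `/sys/kernel/security/lsm`; the function name `bpf_lsm_enabled` is hypothetical.

```python
from pathlib import Path

# Standard securityfs location of the active LSM list (comma-separated).
LSM_LIST_PATH = Path("/sys/kernel/security/lsm")


def bpf_lsm_enabled(lsm_list: str) -> bool:
    """Return True if 'bpf' is one of the entries in a comma-separated LSM list."""
    return "bpf" in [name.strip() for name in lsm_list.split(",")]


if __name__ == "__main__":
    if LSM_LIST_PATH.exists():
        print("bpf LSM active:", bpf_lsm_enabled(LSM_LIST_PATH.read_text()))
    else:
        print(f"{LSM_LIST_PATH} not found (not Linux, or securityfs not mounted)")
```

The list is split into entries rather than searched as a substring, so a name that merely contains "bpf" does not produce a false positive.
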
  17. Tracing Policy: Arguments & argument types

    • The kernel's common data types are supported:
      • sint8, int8, uint8, sint16, int16, uint16, int, sint32, int32, uint32…
      • string, fd, filename…
      • cap_inheritable, cap_permitted, cap_effective…
    • Which types to use depends on the function you hook
  18. Tracing Policy: Selector - Filter

    • matchArgs: filter on the value of arguments.
    • matchReturnArgs: filter on the return value.
    • matchPIDs: filter on PID.
    • matchBinaries: filter on binary path.
    • matchNamespaces: filter on Linux namespaces.
    • matchCapabilities: filter on Linux capabilities.
    • matchNamespaceChanges: filter on Linux namespace changes.
    • matchCapabilityChanges: filter on Linux capability changes.
  19. Tracing Policy: Selector - Apply an action

    • matchActions: apply an action on selector matching.
    • matchReturnActions: apply an action on return selector matching.
  20. Tracing Policy: Selector - Apply an action (cont.)

    • Sigkill action
    • Signal action
    • Override action
    • FollowFD action (Deprecated 1.5)
    • UnfollowFD action (Deprecated 1.5)
    • CopyFD action (Deprecated 1.5)
    • GetUrl action
    • DnsLookup action
    • Post action
    • NoPost action
    • TrackSock action
    • UntrackSock action
    • Notify Enforcer action
  21. Slide annotations on an example TracingPolicy:

    • Required field
    • Hook point: Kernel Probes
    • fd_install is not a syscall, just a regular kernel function
    • If the value of the argument at index 1 equals /tmp/tetragon, SIGKILL will be sent.
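
The policy on the slide is shown only as an image, so the following is a reconstruction from the annotations above, not the author's exact manifest. It builds the resource as a Python dictionary and prints it as JSON, which `kubectl apply -f` also accepts. The function name, the default policy name `example`, and the argument types (`int` for index 0, `file` for index 1) are assumptions following the pattern in Tetragon's public examples.

```python
import json


def sigkill_on_fd_install(path: str, name: str = "example") -> dict:
    """Build a TracingPolicy that sends SIGKILL when fd_install sees `path`."""
    return {
        "apiVersion": "cilium.io/v1alpha1",
        "kind": "TracingPolicy",
        "metadata": {"name": name},
        "spec": {
            "kprobes": [
                {
                    "call": "fd_install",  # hook point: a regular kernel function
                    "syscall": False,      # fd_install is not a syscall
                    "args": [
                        {"index": 0, "type": "int"},   # file descriptor number
                        {"index": 1, "type": "file"},  # file being installed
                    ],
                    "selectors": [
                        {
                            # Match when argument 1 equals the given path...
                            "matchArgs": [
                                {"index": 1, "operator": "Equal", "values": [path]}
                            ],
                            # ...and kill the process that triggered the hook.
                            "matchActions": [{"action": "Sigkill"}],
                        }
                    ],
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(sigkill_on_fd_install("/tmp/tetragon"), indent=2))
```

Try a policy like this on a disposable cluster first: with the Sigkill action, any process that opens the matched path is terminated.
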
  22. File Access Monitoring: File access traces with Tetragon

    • SecurityContext can set readOnlyRootFilesystem, but…?
    • Restrict read access (don’t read sensitive files)
    • Know which files are read or written
    • Examples:
      • https://github.com/cilium/tetragon/blob/main/examples/quickstart/file_monitoring.yaml
      • https://github.com/cilium/tetragon/blob/main/examples/quickstart/file_monitoring_enforce.yaml
  23. Network Monitoring: Network access traces with Tetragon

    • Cilium NetworkPolicy may already cover this, but…
    • Know how much data curl or wget is sending out.
    • Works on Docker or bare metal.
    • Tricky bug: on a multi-node K8s cluster, the template currently provided has Pod CIDR issues.
  24. 28

  25. Kubernetes Identity Aware Policies: TracingPolicy is cluster-scoped

    • Namespace filters
      • TracingPolicyNamespaced
    • Pod label filters
    • Container field filters (currently only name)
  26. Runtime Hooks: Tetragon uses the K8s API server, but…?

    • Using the K8s API is usually slow and only “best-effort”. Is there a better way?
    • Yes, runtime hooks are more effective.
    • Required:
      • Containerd with NRI (supported since 1.7, enabled by default since 2.0) → nri-hook
      • Containerd without NRI → oci-hooks
      • CRI-O uses OCI hooks → oci-hooks
  27. 35

  28. Event throttling: Monitor and throttle the cgroup event rate

    • Disabled by default; at this moment, only for base sensor events:
      • PROCESS_EXEC
      • PROCESS_EXIT
    • The throttle action generates the following events:
      • A THROTTLE start event is sent when the cgroup rate limit is crossed
      • A THROTTLE stop event is sent when the cgroup rate has been stable below the limit again for 5 seconds
  29. Other settings: Privileged execution

    • Which Kubernetes pods are running with CAP_SYS_ADMIN in my cluster?
    • Which Kubernetes pods have host network or PID namespace access in my cluster?
    • Enable these features:
      • Process Credential: enable-process-cred: true
      • Namespace Monitoring: enable-process-ns: true
    • Restart the Tetragon DaemonSet
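
Once process credentials are enabled, the first question above can be answered by filtering exported events. This is a sketch under assumptions: the sample events are hand-written, and the field layout (`process.cap.effective` as a list of capability names, `process.pod.namespace` and `process.pod.name`) should be verified against real events from your Tetragon version. The function name is hypothetical.

```python
# Hand-written sample events (hypothetical values, assumed field layout).
SAMPLE_EVENTS = [
    {"process_exec": {"process": {
        "binary": "/usr/bin/nsenter",
        "cap": {"effective": ["CAP_CHOWN", "CAP_SYS_ADMIN"]},
        "pod": {"namespace": "default", "name": "privileged-pod"}}}},
    {"process_exec": {"process": {
        "binary": "/usr/bin/curl",
        "cap": {"effective": ["CAP_CHOWN"]},
        "pod": {"namespace": "default", "name": "web"}}}},
    {"process_exit": {"process": {"binary": "/usr/bin/curl"}}},
]


def pods_with_capability(events: list, capability: str = "CAP_SYS_ADMIN") -> list:
    """Return sorted 'namespace/name' of pods whose processes hold `capability`."""
    found = set()
    for event in events:
        proc = event.get("process_exec", {}).get("process", {})
        effective = proc.get("cap", {}).get("effective", [])
        pod = proc.get("pod", {})
        if capability in effective and pod:
            found.add(f"{pod.get('namespace')}/{pod.get('name')}")
    return sorted(found)


if __name__ == "__main__":
    print(pods_with_capability(SAMPLE_EVENTS))
```

Events from host processes carry no pod information, so the sketch skips them; drop the `and pod` condition to include them.
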
  30. Environment Setting: Ideal, but…

    • macOS Sequoia 15.4.1
    • OrbStack v1.10.3
    • Kubernetes v1.32.2 (Kind)
    • Containerd v2.0.2 (NRI enabled)
    • Tetragon v1.4.0
  31. Environment Setting: Reality

    • macOS Sequoia 15.4.1
    • Parallels Desktop 20.3.0
    • Kubernetes v1.32.4 (Kubespray - master branch)
    • Containerd v2.0.3 (NRI enabled)
    • Tetragon v1.4.0
  32. Configuration annotations:

    • Enable Process Credential and Namespace Monitoring
    • Enable Prometheus Service Monitor
    • Enable runtime hooks, use NRI
    • Allow failures in kube-system
  33. The next time you encounter CVEs… maybe you can do…?

    • If you can update immediately:
      • Upgrade to the latest version ASAP.
    • If you can't update immediately, or it is a zero-day vulnerability:
      • Follow best practices (e.g. drop capabilities, set securityContext) in your daily work.
      • Set up regular TracingPolicy resources in each of your clusters.
      • It's not a perfect solution, but at least it's a mitigation.
  34. Reference

    • Tetragon official documentation: https://tetragon.io/
    • Tetragon GitHub: https://github.com/cilium/tetragon/
    • Tracing Policy Example: https://github.com/cilium/tetragon/tree/main/examples
    • Live-patching security vulnerabilities inside the Linux kernel with eBPF Linux Security Module - Cloudflare: https://blog.cloudflare.com/live-patch-security-vulnerabilities-with-ebpf-lsm/
  35. Please feel free to contact me!

    • Website: https://tico.tw
    • Facebook: @tico88612
    • Instagram: @__tico88612__
    • GitHub: @tico88612
    • LinkedIn: in/tico88612
    • Telegram: @tico88612
    Thank You!