eBPF for the rest of us - Golab 2023

eBPF is getting more and more popular. In this talk I will reintroduce it, describe the state of the art, and show how we can leverage it to solve problems related to performance, security and networking.

Federico Paolinelli

November 21, 2023

Transcript

  1. eBPF
    for the rest of us
    Federico Paolinelli - Red Hat

  2. About me
    Telco Network Team @ Red Hat
    Contributed to:
    - Athens
    - KubeVirt
    - SR-IOV Network Operator
    - OPA Gatekeeper
    - OVN-Kubernetes
    - CNI Plugins
    - MetalLB
    Contacts: hachyderm.io/@fedepaol, @fedepaol, [email protected]

  3. eBPF is now mature

  4. What is eBPF?

  6. What is eBPF?
    - Extended Berkeley Packet Filter
    - The same thing JavaScript was to browsers, but for the Linux kernel
    - Allows us to change the behavior of the Linux kernel in a safe and controlled way

  7. eBPF programs can be loaded dynamically

  8. eBPF is event driven

  9. Why do we care?

  10. Regular Kernel Interaction
    [Diagram: App, syscall, Userspace, Kernel, HW]

  11. eBPF programs live in the kernel
    [Diagram: App, syscall, Userspace, Kernel, HW, eBPF Program]

  12. Observability
    [Diagram: several Apps issuing syscalls, Userspace, Kernel, HW, eBPF Program]

  15. Networking
    [Diagram: App, syscall, Userspace, Kernel, HW, eBPF Program]

  17. What are the alternatives?
    - Linux kernel modules
    - Contributing the feature to the kernel

  18. More use cases later
    https://flic.kr/p/2pf9kY

  19. Let’s see an example

  20. Disclaimer!
    There will be C code

  21. prog.bpf.c → clang → prog.bpf.o
    SEC("kprobe/sys_execve")
    int kprobe_execve() {
        bpf_printk("called!");
        return 0;
    }

  22. /sys/kernel/debug/tracing/trace_pipe
    git-9348 [000] ...21 2534.887840: bpf_trace_printk: called!
    git-9351 [000] ...21 2534.891143: bpf_trace_printk: called!
    tail-9354 [000] ...21 2534.894813: bpf_trace_printk: called!
    git-9355 [001] ...21 2534.894813: bpf_trace_printk: called!
    [Diagram: git, bash, execve, code, userspace, kernel]

  23. Do I really want something like this running in my kernel?

  24. In order to load an eBPF program, you need PERMISSIONS
    https://flic.kr/p/521mBV

  25. eBPF Verifier

  26. prog.bpf.c → clang → bytecode

  27. prog.bpf.c → clang → bytecode → syscall → Kernel (verifier → jit)

  29. The eBPF verifier checks for
    - infinite loops
    - uninitialized variables
    - memory access out of allowed bounds
    - program size (below 4096 instructions for unprivileged programs; newer kernels verify up to 1 million instructions for privileged ones)
    - program complexity
    - allowed calls

  30. eBPF program structure

  31. #include "vmlinux.h"
    #include "bpf/bpf_helpers.h"
    struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, MAX_MAP_ENTRIES);
        __type(key, __u32);
        __type(value, struct arguments);
    } xdp_params_array SEC(".maps");
    SEC("xdp")
    int xdp_prog_func(struct xdp_md *ctx)
    {
        bpf_printk("called");
        // access the map
        return XDP_TX;
    }

  32. The program
    SEC("xdp")
    int xdp_prog_func(struct xdp_md *ctx)
    {
        bpf_printk("called");
        return XDP_TX;
    }

  33. The program type: SEC("xdp")

  34. The context parameter: struct xdp_md *ctx

  35. The return value: XDP_TX

  36. eBPF Helpers
    https://flic.kr/p/H4D7j

  37. From man bpf-helpers:
    - This framework differs from the older, "classic" BPF (or "cBPF") in several aspects, one of them being the ability to call special functions (or "helpers") from within a program
    - These helpers are used by eBPF programs to interact with the system, or with the context in which they work
    - Each program type can only call a subset of those helpers

  38. BPF Helpers
    SEC("kprobe/sys_execve")
    int kprobe_execve() {
        char comm[20];
        bpf_get_current_comm(comm, sizeof(comm));
        bpf_printk("execve: %s\n", comm);
        return 0;
    }
    Description of bpf_get_current_comm:
    Copy the comm attribute of the current task into buf of size_of_buf. The comm attribute contains the name of the executable (excluding the path) for the current task. The size_of_buf must be strictly positive

  41. eBPF maps
    [Diagram: App, Userspace, Kernel, Map]

  42. Maps are for…
    - The only way to have userspace and eBPF programs communicate
    - Configuration
    - Saving state / sharing data between programs
    - Sending data to userspace

  43. Types of maps
    - Array
    - HashMap
    - LRU
    - Perf / Ring Buffer
    - SocketMap
    - …

  44. Maps - Hash Map
    struct command {
        u8 cmd[64];
    };
    struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __type(key, struct command);
        __type(value, struct action);
        __uint(max_entries, 1024);
    } action_cmd_map SEC(".maps");

  45. Maps - Hash Map
    SEC("kprobe/sys_execve")
    int kprobe_execve() {
        struct command key;
        // zero the whole key: hash lookups compare every byte of it
        memset(&key, 0, sizeof(key));
        // key is a struct, not a pointer, so its field is accessed with '.'
        bpf_get_current_comm(key.cmd, sizeof(key.cmd));
        struct action *args = bpf_map_lookup_elem(&action_cmd_map, &key);
        if (!args) {
            return 0; // no entry for this command
        }
        // do stuff
        return 0;
    }
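
The hash map example zeroes the key before filling it because the lookup matches on all 64 bytes of the key, not just on the characters of the command name. A minimal Go sketch of the same idea follows; `commandKey` and `newCommandKey` are hypothetical names, and a plain Go map stands in for the eBPF map, so this only illustrates the byte-exact matching, not the real map API.

```go
package main

import "fmt"

// commandKey mirrors the C `struct command { u8 cmd[64]; }` used as the map key.
type commandKey struct {
	Cmd [64]byte
}

// newCommandKey copies name into a zeroed key; longer names are truncated.
func newCommandKey(name string) commandKey {
	var k commandKey // all 64 bytes start as zero, like the memset in the C code
	copy(k.Cmd[:], name)
	return k
}

func main() {
	// A plain Go map stands in for the eBPF hash map in this sketch.
	actions := map[commandKey]string{newCommandKey("git"): "kill"}

	_, found := actions[newCommandKey("git")]
	fmt.Println("clean key found:", found)

	// One stray byte after the name makes it a different key.
	dirty := newCommandKey("git")
	dirty.Cmd[10] = 0xff
	_, found = actions[dirty]
	fmt.Println("dirty key found:", found)
}
```

The same reasoning applies on the userspace side: a key written from Go has to be padded with zeros to the full size of the C struct, otherwise the kernel side will never find it.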

  47. Maps - Ring buffer
    struct {
        __uint(type, BPF_MAP_TYPE_RINGBUF);
        __uint(max_entries, 1 << 24);
    } ring_buffer SEC(".maps");
    SEC("kprobe/sys_openat")
    int BPF_KPROBE(kprobe_openat, struct pt_regs *regs) {
        struct event *event = 0;
        event = bpf_ringbuf_reserve(&ring_buffer, sizeof(struct event), 0);
        if (!event) {
            return 0; // no space left in the ring buffer
        }
        // fill event
        bpf_ringbuf_submit(event, 0);
        return 0;
    }

  51. [Diagram: Kernel, Kernel, Kernel]

  52. Portability
    - Different kernels might have different data layouts
    - The program is not aware of the memory layout
    - How can we make the same artifact run on different kernels without recompiling?

  53. CO-RE
    BPF CO-RE (Compile Once – Run Everywhere) is a modern approach to writing portable BPF applications that can run on multiple kernel versions and configurations without modifications and runtime source code compilation on the target machine.
    from nakryiko.com/posts/bpf-core-reference-guide/

  54. BTF - BPF Type Format
    - Kind of metadata, describing the program
    - A program has BTF information associated with it (i.e. which fields it wants to read)
    - The kernel comes with BTF information (i.e. where each field is)

  55. BPF Loader
    - When an eBPF program is loaded, the loader matches the program’s BTF information against the kernel’s BTF information
    - It provides the resulting offsets to the program
    - The kernel doesn’t care
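
The matching step done by the loader can be pictured with a toy model. The sketch below is not how libbpf or cilium/ebpf implement CO-RE relocations; `fieldRef`, `kernelBTF`, `resolveOffsets` and the offsets are all invented for illustration. It only shows the idea that the program declares which fields it needs and the loader resolves them against the layout of the kernel it is running on.

```go
package main

import "fmt"

// fieldRef names a struct field the eBPF program wants to read.
type fieldRef struct {
	Struct string
	Field  string
}

// kernelBTF is a toy stand-in for a kernel's BTF: field -> byte offset.
type kernelBTF map[fieldRef]uint32

// resolveOffsets looks up every field the program needs in the target
// kernel's layout and fails if the kernel does not have one of them.
func resolveOffsets(wanted []fieldRef, kernel kernelBTF) (map[fieldRef]uint32, error) {
	offsets := make(map[fieldRef]uint32, len(wanted))
	for _, f := range wanted {
		off, ok := kernel[f]
		if !ok {
			return nil, fmt.Errorf("field %s.%s not found in kernel BTF", f.Struct, f.Field)
		}
		offsets[f] = off
	}
	return offsets, nil
}

func main() {
	pid := fieldRef{Struct: "task_struct", Field: "pid"}
	wanted := []fieldRef{pid}

	// Offsets below are invented for illustration only.
	kernels := map[string]kernelBTF{
		"kernel A": {pid: 1200},
		"kernel B": {pid: 1312},
	}
	for _, name := range []string{"kernel A", "kernel B"} {
		offsets, err := resolveOffsets(wanted, kernels[name])
		if err != nil {
			fmt.Println(name, "error:", err)
			continue
		}
		fmt.Printf("%s: task_struct.pid at offset %d\n", name, offsets[pid])
	}
}
```

The same compiled artifact ends up reading from a different offset on each kernel, which is the point of "compile once, run everywhere".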

  56. #include "vmlinux.h"
    #include "bpf/bpf_helpers.h"
    struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, MAX_MAP_ENTRIES);
        __type(key, __u32);
        __type(value, struct arguments);
    } xdp_params_array SEC(".maps");
    SEC("xdp")
    int xdp_prog_func(struct xdp_md *ctx)
    {
        bpf_printk("called");
        return XDP_TX;
    }

  59. vmlinux.h
    bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
    #include "bpf/bpf_helpers.h"

  60. What does it have to do with Go?

  61. With Go
    [Diagram: App, Userspace, Kernel, Map]
    Cilium eBPF: github.com/cilium/ebpf
    LibBPF Go: github.com/aquasecurity/libbpfgo

  64. cilium/ebpf go

  65. bpf2go tool
    - compiles C code and generates the eBPF ELF file
    - embeds the eBPF ELF file in the single Go binary
    - provides references to the eBPF maps / programs that are accessible from Go
    - generates Go equivalents of the C structs

  66. bpf2go tool
    bpf2go -type arguments lb ebpf/xdp_lb.c -- -I ./include

  70. bpf2go tool
    struct arguments
    {
        __u8 dst_mac[6];
        __u32 daddr;
        __u32 saddr;
        __u32 vip;
    };
    type lbArguments struct {
        DstMac [6]uint8
        _      [2]byte
        Daddr  uint32
        Saddr  uint32
        Vip    uint32
    }

  72. bpf2go tool
    objs := lbObjects{}
    if err := loadLbObjects(&objs, nil); err != nil {
        log.Fatalf("loading objects: %s", err)
    }
    defer objs.Close()

  73. bpf2go tool
    args := lbArguments{
        Daddr:  intDest,
        Saddr:  intSrc,
        DstMac: macArray,
        Vip:    intVip,
    }
    objs.XdpParamsArray.Put(uint32(0), args)
    objs.XdpParamsArray.Lookup(uint32(0), &args)
    // the C definition of the same map
    struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, MAX_MAP_ENTRIES);
        __type(key, __u32);
        __type(value, struct arguments);
    } xdp_params_array SEC(".maps");

  76. ebpf go
    - allows attaching an eBPF program to the corresponding hook
    - utilities for consuming perf / ring buffers
    - kernel feature discovery

  77. What can we do with eBPF?

  78. Kernel inspection
    https://flic.kr/p/2mRChLy

  79. Kernel inspection
    - Kprobe / Kretprobe
    - Tracepoints
    - Perf Events
    - …

  80. KProbe / KRetprobe
    - Entry / exit of any kernel function
    - API not guaranteed
    - Syscalls are reasonably stable

  81. Checking who is opening a given file
    SEC("kprobe/sys_openat")
    int BPF_KPROBE(kprobe_openat, struct pt_regs *regs) {
        struct event *event = 0;
        event = bpf_ringbuf_reserve(&ring_buffer, sizeof(struct event), 0);
        if (!event) {
            return 0;
        }
        char *pathname;
        // second argument of openat: the path being opened
        pathname = (char*) PT_REGS_PARM2_CORE(regs);
        bpf_probe_read_str(&event->path, sizeof(event->path), (void *) pathname);
        // the upper 32 bits hold the process ID
        event->pid = bpf_get_current_pid_tgid() >> 32;
        bpf_get_current_comm(&event->command, sizeof(event->command));
        bpf_ringbuf_submit(event, 0);
        return 0;
    }
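
The `>> 32` in the program above is there because `bpf_get_current_pid_tgid` packs two identifiers in one 64-bit value: the tgid (what userspace calls the process ID) in the upper half and the kernel pid (the thread ID) in the lower half. A short sketch of the same split, with `splitPidTgid` as a hypothetical helper name and made-up IDs:

```go
package main

import "fmt"

// splitPidTgid splits the 64-bit value returned by bpf_get_current_pid_tgid:
// the upper 32 bits are the tgid (the process ID userspace sees), the lower
// 32 bits are the kernel pid (the thread ID).
func splitPidTgid(v uint64) (tgid, tid uint32) {
	return uint32(v >> 32), uint32(v)
}

func main() {
	// Hypothetical value: process 28199, thread 28201.
	v := uint64(28199)<<32 | uint64(28201)
	tgid, tid := splitPidTgid(v)
	fmt.Printf("process=%d thread=%d\n", tgid, tid)
}
```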

  85. Kill whoever is opening a given file
    SEC("kprobe/sys_openat")
    int BPF_KPROBE(kprobe_openat, struct pt_regs *regs) {
        struct file_path p;
        memset(&p, 0, sizeof(p));
        char *pathname;
        pathname = (char *)PT_REGS_PARM2_CORE(regs);
        bpf_probe_read_str(p.pp, sizeof(p.pp), pathname);
        struct action *args = 0;
        args = (struct action *)bpf_map_lookup_elem(&action_file_map, &p);
        if (!args) {
            return 0;
        }
        if (args->kill) {
            bpf_send_signal(9); // 9 is SIGKILL
        }
        return 0;
    }

  87. Tracepoints
    - Specific hooks in the kernel
    - Guaranteed to be stable
    - BTF generated data structures for parameters

  88. Tracepoints
    more /sys/kernel/tracing/available_events | grep skb
    tcp:tcp_retransmit_skb
    udp:udp_fail_queue_rcv_skb
    net:netif_receive_skb_list_exit
    net:netif_receive_skb_exit
    net:netif_receive_skb_list_entry
    net:netif_receive_skb_entry
    net:netif_receive_skb
    skb:skb_copy_datagram_iovec
    skb:consume_skb
    skb:kfree_skb

  89. Tracepoints
    include/trace/events/skb.h (called as trace_kfree_skb)
    /*
     * Tracepoint for free an sk_buff:
     */
    TRACE_EVENT(kfree_skb,
        TP_PROTO(struct sk_buff *skb, void *location,
                 enum skb_drop_reason reason),
        TP_ARGS(skb, location, reason),
        TP_STRUCT__entry(
            __field(void *, skbaddr)
            __field(void *, location)
            __field(unsigned short, protocol)
            __field(enum skb_drop_reason, reason)
        ),
        TP_printk("skbaddr=%p protocol=%u location=%pS reason: %s",
                  __entry->skbaddr, __entry->protocol, __entry->location,
                  __print_symbolic(__entry->reason,
                                   DEFINE_DROP_REASON(FN, FNe)))
    );

  91. Checking what packets are being dropped
    // struct trace_event_raw_kfree_skb is the context of a regular
    // tracepoint program, hence the "tracepoint/" section
    SEC("tracepoint/skb/kfree_skb")
    int kfree_skb(struct trace_event_raw_kfree_skb *args) {
        struct sk_buff skb;
        __builtin_memset(&skb, 0, sizeof(skb));
        bpf_probe_read(&skb, sizeof(struct sk_buff), args->skbaddr);
        struct sock *sk = skb.sk;
        enum skb_drop_reason reason = args->reason;
        /* handle the event… */
        return 0;
    }

  93. Userspace inspection!

  94. UProbe / URetprobe
    - same as kprobe / kretprobe but for userspace
    - must be attached to the binary
    - even less stability guarantees
    - might be useful for well known, stable libraries

  95. Intercepting SSL_write calls
    SEC("uprobe/SSL_write")
    int probe_entry_SSL_write(struct pt_regs *ctx)
    {
        u64 current_pid_tgid = bpf_get_current_pid_tgid();
        u32 pid = current_pid_tgid >> 32;
        // second argument of SSL_write: the plaintext buffer
        const char *buf = (const char *)PT_REGS_PARM2(ctx);
        struct active_ssl_buf active_ssl_buf_t;
        active_ssl_buf_t.buf = (uintptr_t)buf;
        // remember the buffer until the matching uretprobe fires
        bpf_map_update_elem(&active_ssl_write_args_map, &current_pid_tgid,
                            &active_ssl_buf_t, BPF_ANY);
        return 0;
    }

  97. Intercepting SSL_write calls
    SEC("uretprobe/SSL_write")
    int probe_ret_SSL_write(struct pt_regs *ctx)
    {
        u64 current_pid_tgid = bpf_get_current_pid_tgid();
        struct active_ssl_buf *active_ssl_buf_t =
            bpf_map_lookup_elem(&active_ssl_write_args_map,
                                &current_pid_tgid);
        if (!active_ssl_buf_t) {
            return 0; // no entry recorded by the uprobe
        }
        const char *buf;
        bpf_probe_read(&buf, sizeof(const char *), &active_ssl_buf_t->buf);
        process_SSL_data(ctx, current_pid_tgid, buf);
        bpf_map_delete_elem(&active_ssl_write_args_map, &current_pid_tgid);
        return 0;
    }

  100. Networking

  101. Networking
    - packet filtering
    - packet manipulation
    - socket redirection

  102. XDP - Express Data Path
    - hook for ingress packets only
    - very early in the stack
    - can pass to the Linux kernel, drop, redirect
    - pointer-fu to navigate the frame manually
    [Frame layout: Eth Header | IP Header | Payload]

  103. XDP
    SEC("xdp")
    int xdp_only_tcp(struct xdp_md *ctx) {
        void *data = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;
        struct ethhdr *eth = data;
        // the verifier rejects any packet read that is not bounds checked first
        if ((void *)(eth + 1) > data_end) {
            return XDP_DROP;
        }
        __u32 eth_proto;
        eth_proto = eth->h_proto;
        if (eth_proto != bpf_htons(ETH_P_IP)) {
            return XDP_DROP;
        }
        struct iphdr *iph = data + sizeof(struct ethhdr);
        if ((void *)(iph + 1) > data_end) {
            return XDP_DROP;
        }
        if (iph->protocol != IPPROTO_TCP) {
            return XDP_DROP;
        }
        return XDP_PASS;
    }

  108. TC - Traffic Control Action
    - packet filtering
    - packet manipulation
    - works with ingress and egress
    - packets in the form of __sk_buff

  109. TC
    SEC("tc_redirect")
    int redirect(struct __sk_buff *skb)
    {
        void *data = (void *)(unsigned long long)skb->data;
        // the declarations of eth, iph, key, nextHop and neighInfo, and the
        // bounds checks on the packet, are elided on the slide
        if (bpf_ntohs(eth->h_proto) != ETH_P_IP)
            return TC_ACT_SHOT;
        key = bpf_ntohl(iph->saddr);
        nextHop = bpf_map_lookup_elem(&redirect_map_ipv4, &key);
        if (nextHop != NULL) {
            neighInfo.ipv4_nh = bpf_htonl(nextHop->nextHop);
            neighInfo.nh_family = AF_INET;
            long res = bpf_redirect_neigh(nextHop->interfaceID, &neighInfo,
                                          sizeof(neighInfo), 0);
            return res;
        }
        return TC_ACT_OK;
    }
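
The TC program looks up the map with `bpf_ntohl(iph->saddr)`, that is, the source address converted from network to host byte order. Whatever fills the map from userspace has to compute the key the same way. A minimal sketch of that conversion follows; `ipv4Key` is a hypothetical helper, and how the resulting `uint32` is then written into the map depends on the library in use.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// ipv4Key returns the IPv4 address as an integer in host order, the same
// numeric value that bpf_ntohl(iph->saddr) yields inside the eBPF program.
func ipv4Key(addr string) (uint32, error) {
	ip := net.ParseIP(addr).To4()
	if ip == nil {
		return 0, fmt.Errorf("%q is not an IPv4 address", addr)
	}
	// The four address bytes are in network (big endian) order.
	return binary.BigEndian.Uint32(ip), nil
}

func main() {
	key, err := ipv4Key("10.0.0.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("key for 10.0.0.1: 0x%08X\n", key)
}
```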

  112. Attaching the program
    // the variables must not be named "link", or they shadow the package
    kp, _ := link.Kprobe("sys_openat", objs.KprobeOpenat, nil)
    tp, _ := link.Tracepoint("syscalls", "sys_enter_openat", objs.HandleOpenat, nil)
    ex, _ := link.OpenExecutable(*openSSLPath)
    up, _ := ex.Uprobe("SSL_write", objs.ProbeEntrySSL_write, nil)

  113. Interacting with the Maps
    rd, _ := ringbuf.NewReader(objs.openMaps.RingBuffer)
    _ = objs.ActionFileMap.Put(key, action)
    iterator := objs.ActionFileMap.Iterate()

  114. Making the verifier happy is an art

  115. Missed check on XDP packet:
    invalid access to packet, off=12 size=2, R9(id=0,off=12,r=0): R9 offset is outside of the packet (9 line(s) omitted)
    Using memcpy instead of bpf_probe_read_str:
    load program: permission denied: 16: (71) r2 = *(u8 *)(r1 +63): R1 invalid mem access 'scalar' (24 line(s) omitted)

  116. Each program type is a micro-framework
    - different context argument
    - different meaning of return values
    - different eBPF helpers available
    - different ways to load the program
    - different lifecycles

  117. We need to be familiar with the kernel

  118. There is no debugger
    - Did we attach the right program?
    - Are we passing the parameters correctly?
    - Are we parsing the arguments correctly?
    - How about packet manipulation?

  119. Older kernels support!
    https://flic.kr/p/8RyQBM

  121. How can we debug?

  122. Log all the things!
    bpf_printk("openssl process_SSL_data len :%d buf %s\n", len, buf);
    sudo cat /sys/kernel/debug/tracing/trace_pipe | more
    sudo-28199 [009] ...21 5407.812081: bpf_trace_printk: got event /etc/passwd
    sudo-28199 [009] ...21 5407.812205: bpf_trace_printk: got event /etc/login.defs
    sudo-28199 [009] ...21 5407.812882: bpf_trace_printk: got event /usr/share/login.defs.d
    sudo-28199 [009] ...21 5407.812906: bpf_trace_printk: got event /etc/login.defs.d
    systemd-journal-940 [003] ...21 5407.813073: bpf_trace_printk: got event /proc/28199/comm
    sudo-28199 [009] ...21 5407.813139: bpf_trace_printk: got event /etc/security/pam_env.conf
    sudo-28199 [009] ...21 5407.813206: bpf_trace_printk: got event /etc/environment
    systemd-journal-940 [003] ...21 5407.813206: bpf_trace_printk: got event /proc/28199/cmdline
    sudo-28199 [009] ...21 5407.813250: bpf_trace_printk: got event /etc/login.defs
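
When the trace output grows, it can help to parse it instead of reading it by eye. The sketch below splits a `bpf_trace_printk` line into task, PID, CPU and message. The pattern is an assumption derived only from the sample lines above, so other kernels or trace options may need a different expression.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// traceLine holds the fields of one bpf_trace_printk line from trace_pipe.
type traceLine struct {
	Task    string
	PID     int
	CPU     int
	Message string
}

// linePattern follows the layout of the sample output:
//   task-pid [cpu] flags timestamp: bpf_trace_printk: message
// The task name may itself contain hyphens (e.g. systemd-journal).
var linePattern = regexp.MustCompile(
	`^\s*(.+)-(\d+)\s+\[(\d+)\]\s+\S+\s+[\d.]+:\s+bpf_trace_printk:\s+(.*)$`)

// parseTraceLine returns the parsed line and false if the line does not match.
func parseTraceLine(line string) (traceLine, bool) {
	m := linePattern.FindStringSubmatch(line)
	if m == nil {
		return traceLine{}, false
	}
	pid, _ := strconv.Atoi(m[2]) // digits only, per the pattern
	cpu, _ := strconv.Atoi(m[3])
	return traceLine{Task: m[1], PID: pid, CPU: cpu, Message: m[4]}, true
}

func main() {
	samples := []string{
		"sudo-28199 [009] ...21 5407.812081: bpf_trace_printk: got event /etc/passwd",
		"systemd-journal-940 [003] ...21 5407.813073: bpf_trace_printk: got event /proc/28199/comm",
	}
	for _, s := range samples {
		if tl, ok := parseTraceLine(s); ok {
			fmt.Printf("%s (pid %d, cpu %d): %s\n", tl.Task, tl.PID, tl.CPU, tl.Message)
		}
	}
}
```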

  123. BPFTool to the rescue
    bpftool prog show
    ...
    393: kprobe name probe_entry_SSL_write tag 758445bff28b440d gpl
    loaded_at 2023-11-06T23:11:03+0100 uid 0
    xlated 368B jited 220B memlock 4096B map_ids 150,151,152
    btf_id 291
    pids uprobessl(31240)

  124. BPFTool to the rescue
    bpftool prog dump xlated name probe_entry_SSL_write
    int probe_entry_SSL_write(struct pt_regs * ctx):
    ; int probe_entry_SSL_write(struct pt_regs *ctx)
    0: (bf) r6 = r1
    ; u64 current_pid_tgid = bpf_get_current_pid_tgid();
    1: (85) call bpf_get_current_pid_tgid#197680
    2: (bf) r8 = r0
    ; u64 current_pid_tgid = bpf_get_current_pid_tgid();
    3: (7b) *(u64 *)(r10 -8) = r8
    4: (b7) r7 = 0

  125. BPFTool to the rescue
    bpftool map dump name params_array
    [{
        "key": 0,
        "value": {
            "pid": 4504
        }
    }
    ]
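
Because the map dump is JSON, a test or a script can check it programmatically. The sketch below decodes output of the shape shown for `params_array`. The `mapEntry` type is an assumption that fits only this map; other maps will have different key and value shapes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mapEntry mirrors one element of the dump shown for params_array.
// It is a hypothetical type tailored to that output.
type mapEntry struct {
	Key   int `json:"key"`
	Value struct {
		PID int `json:"pid"`
	} `json:"value"`
}

// parseMapDump decodes the JSON array printed by the map dump.
func parseMapDump(data []byte) ([]mapEntry, error) {
	var entries []mapEntry
	if err := json.Unmarshal(data, &entries); err != nil {
		return nil, err
	}
	return entries, nil
}

func main() {
	// Same content as the dump on the slide, compacted onto one line.
	dump := []byte(`[{"key": 0, "value": {"pid": 4504}}]`)

	entries, err := parseMapDump(dump)
	if err != nil {
		fmt.Println("cannot parse dump:", err)
		return
	}
	for _, e := range entries {
		fmt.Printf("key=%d pid=%d\n", e.Key, e.Value.PID)
	}
}
```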

  126. Check for tracepoints
    ➜ ~ sudo more /sys/kernel/tracing/available_events | grep xdp
    xdp:mem_return_failed
    xdp:mem_connect
    xdp:mem_disconnect
    xdp:xdp_devmap_xmit
    xdp:xdp_cpumap_enqueue
    xdp:xdp_cpumap_kthread
    xdp:xdp_redirect_map_err
    xdp:xdp_redirect_map
    xdp:xdp_redirect_err
    xdp:xdp_redirect
    xdp:xdp_bulk_tx
    xdp:xdp_exception
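
The same lookup can be done from the Go side, for example to verify that a tracepoint exists before trying to attach to it. This is a small sketch that assumes the file lists one `subsystem:event` per line, as in the output above. Unlike `grep xdp`, it matches only on the subsystem prefix.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// filterEvents returns the events in content (one "subsystem:event" per line)
// that belong to the given subsystem.
func filterEvents(content, subsystem string) []string {
	var out []string
	prefix := subsystem + ":"
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, prefix) {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	// Same file the slide reads; reading it usually requires root.
	data, err := os.ReadFile("/sys/kernel/tracing/available_events")
	if err != nil {
		fmt.Println("cannot read available_events:", err)
		return
	}
	for _, ev := range filterEvents(string(data), "xdp") {
		fmt.Println(ev)
	}
}
```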

  127. Check for tracepoints
    sudo bpftrace -e 'tracepoint:xdp:* { @cnt[probe] = count(); }'
    Attaching 12 probes...
    ^C
    @cnt[tracepoint:xdp:xdp_bulk_tx]: 10
    bpftrace -e \
    'tracepoint:xdp:xdp_bulk_tx{@redir_errno[-args->err] = count();}'
    Attaching 1 probe...
    ^C
    @redir_errno[6]: 2
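
The key in `@redir_errno` is a bare number. Assuming it is a standard errno value and the code runs on a Unix-like system, Go's `syscall` package can turn it into something readable. This is only a convenience sketch; `describeErrno` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"syscall"
)

// describeErrno returns Go's textual description of an errno value and
// reports whether it equals ENXIO.
func describeErrno(n int) (string, bool) {
	e := syscall.Errno(n)
	return e.Error(), e == syscall.ENXIO
}

func main() {
	// 6 is the key printed in the @redir_errno map.
	desc, isENXIO := describeErrno(6)
	fmt.Printf("errno 6: %s (ENXIO: %v)\n", desc, isENXIO)
}
```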

  128. Test in isolation
    https://flic.kr/p/ZmCcxB

  129. Don’t be afraid to look into the kernel

  130. Take “inspiration”
    https://ebpf.io/applications/
    https://flic.kr/p/416Jh

  131. What real projects look like
    https://flic.kr/p/8Vzaim

  132. A small eBPF layer and a large userspace application

  133. A small eBPF layer and a large userspace application
    Map

  139. The userspace side is in charge of
    - where to attach the program
    - where to store the events
    - how to present the data to the user
    - what parameters to pass to the program

  141. Wrapping up
    - A kernel side in C, a userspace side in Go
    - You need to be familiar with the kernel!
    - Maps as an API
    - Every program type is different
    - It’s powerful
    - It’s difficult to tame

  142. Resources
    - Liz Rice’s “Learning eBPF” book
    - ebpf.io
    - docs.kernel.org/bpf/index.html
    - ebpf.io/applications
    - github.com/cilium/ebpf/tree/main/examples

  143. Thanks!
    Any questions?
    @fedepaol
    hachyderm.io/@fedepaol
    [email protected]
    Slides at: speakerdeck.com/fedepaol
