Kernel Exploitation

yuawn
December 19, 2020


Transcript

  1. Outline
    • Linux kernel Concepts & Debug
    • Kernel Protection
      • smep, smap, kaslr, kpti
    • Leak & useful structures
      • tty_struct, shm_file_data, msg_msg ...
    • Kernel Common Vulnerability
      • double fetch
      • race condition
    • Exploitation & Tricks
      • ret2usr - bypass smep, smap, kpti
      • signal handler
      • modprobe_path
      • userfaultfd, setxattr, msgsnd & msgrcv, ...
  2. Kernel
    • Software
      • Implements syscalls
      • Communicates with hardware
    • Intel CPU ring model:
      • ring 0, ring 1, ring 2, ring 3
      • ring 0: kernel space
      • ring 3: user space
    • (diagram: Applications / OS kernel / hardware: CPU, Memory, Disk, Devices)
  3. Kernel - LKM
    • Loadable kernel module
    • Programs running in kernel space
      • drivers
      • kernel extensions
    • (diagram: Applications / OS kernel with kernel module / hardware: CPU, Memory, Disk, Devices)
  4. Kernel - functions
    • copy_from_user(void *to, const void __user *from, unsigned long n)
    • copy_to_user(void __user *to, const void *from, unsigned long n)
    • printk()
    • kmalloc()
    • kfree()
  5. Kernel Pwn
    • kernel exploitation
    • Privilege escalation
      • linux root account
      • user space to kernel space
      • ring 3 -> ring 0
  6. Kernel Pwn
    • Modified kernel
      • Mobile: Android, ARM TrustZone
      • IoT
    • Kernel Module
    • Original Linux Kernel
      • CVE
    • (diagram: Hypervisor, ARM Trusted Firmware; APPs on OS (kernel); Trusted APPs on Trusted OS)
  7. CTF Prepare - files
    • bzImage - kernel
    • initramfs.cpio.gz - file system
    • run.sh - shell script for running qemu
  8. CTF Prepare - running script
    • run.sh
      • How to run with qemu
      • check kernel protections: smep, smap, kaslr
        #!/bin/bash
        qemu-system-x86_64 \
            -kernel ./bzImage \
            -initrd ./initramfs.cpio.gz \
            -nographic \
            -monitor none \
            -cpu qemu64,+smep,+smap \
            -append "console=ttyS0 kaslr panic=1" \
            -no-reboot \
            -m 256M
    • https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt
  9. CTF Prepare - kernel
    • bzImage
      • https://github.com/torvalds/linux/blob/master/scripts/extract-vmlinux
      • $ ./extract-vmlinux bzImage > vmlinux
      • vmlinux: ELF 64-bit LSB executable, x86-64
    • static analysis
    • Find kernel ROP gadgets:
      • $ ropper --nocolor --file ./vmlinux > rop
  10. CTF Prepare - file system
    • initramfs.cpio.gz
      • $ gunzip initramfs.cpio.gz && cpio -idv < initramfs.cpio
    • Extract the file system
      • /challenge.ko - kernel module
      • /init
      • /flag
        • -r-------- 1 root 0 /flag
  11. CTF Prepare - init file
    • /init
      • Inspect how the challenge initializes the environment
      • insmod challenge.ko
      • echo 1 > /proc/sys/kernel/kptr_restrict
        • 0 -> no restrictions
        • 1 -> the user needs the CAP_SYSLOG capability (root)
        • 2 -> %pK is always printed as 0, regardless of privileges
      • echo 1 > /proc/sys/kernel/dmesg_restrict
        • 0 -> no restrictions
        • 1 -> the user needs the CAP_SYSLOG capability (root)
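The kptr_restrict levels above can be checked from inside the VM before deciding how to leak addresses. The following is a minimal sketch, not part of the original deck; the helper names `kptr_restrict_meaning` and `read_int_sysctl` are made up for illustration, and the wording of each level simply restates the slide.

```c
#include <assert.h>
#include <stdio.h>

/* Map a kptr_restrict value to the behaviour listed on the slide.
 * (Hypothetical helper; the strings paraphrase the slide.) */
const char *kptr_restrict_meaning(int value)
{
    switch (value) {
    case 0:  return "no restrictions";
    case 1:  return "reading pointers requires CAP_SYSLOG";
    case 2:  return "%pK is always printed as 0";
    default: return "unknown value";
    }
}

/* Read an integer sysctl file such as /proc/sys/kernel/kptr_restrict.
 * Returns -1 if the file cannot be opened or does not contain a number. */
int read_int_sysctl(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int value = -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A caller would typically run `read_int_sysctl("/proc/sys/kernel/kptr_restrict")` and print `kptr_restrict_meaning()` of the result; the same reader works for `/proc/sys/kernel/dmesg_restrict`.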
  12. CTF Prepare - .ko
    • /challenge.ko
      • The main challenge binary
      • Usually implemented as a miscdevice or driver; open() it, then use ioctl or other file operations to communicate with the kernel module
        • /dev/challenge
        • /proc/challenge
  13. CTF Prepare - exploit
    • Write the exploit in C (or another language) and compile it into an exploit binary
    • Upload it into the qemu VM on the remote server
      • printf "" >> exp
      • echo "" | base64 -d >> exp
    • $ musl-gcc -static exp.c -o exp
      • A lightweight libc: shrinks the exp binary and reduces upload time
      • $ sudo apt install musl-tools
    • Run the exploit from the shell
  14. Debug
    • $ dmesg
      • printk()
    • $ cat /proc/kallsyms
      • function symbol
      • find kernel base:
        • $ cat /proc/kallsyms | grep _text | head -n 1
    • $ cat /proc/modules
      • find kernel module base:
        • $ cat /proc/modules | grep <module name>
    • $ cat /proc/slabinfo
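The `grep _text` lookup above can also be done from the exploit itself. This is a hedged sketch of a /proc/kallsyms reader; `kallsyms_match` and `kallsyms_lookup` are illustrative names, not kernel or libc APIs. It assumes the usual `<hex address> <type> <name>` line layout, and it returns 0 when kptr_restrict hides the addresses, because the file then reports every address as zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one /proc/kallsyms line of the form "<hex addr> <type> <name> [module]".
 * Returns 1 and stores the address when the symbol name matches exactly. */
int kallsyms_match(const char *line, const char *wanted, unsigned long long *addr)
{
    assert(line != NULL && wanted != NULL && addr != NULL);

    unsigned long long parsed;
    char type;
    char name[256];

    if (sscanf(line, "%llx %c %255s", &parsed, &type, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;

    *addr = parsed;
    return 1;
}

/* Scan a kallsyms-style file for one symbol.
 * Returns 0 if the file is unreadable or the symbol is absent. */
unsigned long long kallsyms_lookup(const char *path, const char *wanted)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;

    char line[512];
    unsigned long long addr = 0;
    while (fgets(line, sizeof line, f)) {
        if (kallsyms_match(line, wanted, &addr))
            break;
    }
    fclose(f);
    return addr;
}
```

For example, `kallsyms_lookup("/proc/kallsyms", "_text")` corresponds to the kernel-base lookup on the slide, with an exact name match instead of a substring grep.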
  15. Debug
    • Edit the init file to get a root shell, which makes debugging easier
      • before: setsid cttyhack setuidgid 1000 sh
      • after: setsid cttyhack setuidgid 0 sh
    • Repack the file system for qemu:
      • $ find . -print0 | cpio --null -ov --format=newc > rootfs.cpio 2>/dev/null
  16. Debug
    • run.sh: qemu-system-x86_64
      • -s: open a gdb debug port on the default port 1234
      • -S: halt the CPU at the very start of execution
      • -append "nokaslr": disable kaslr
    • Attach with gdb
      • target remote localhost:1234
      • add-symbol-file challenge.ko baseaddr
      • add-symbol-file vmlinux baseaddr
  17. smep
    • Supervisor Mode Execution Protection
    • In ring 0 (kernel mode), code in user space cannot be executed
    • Recorded in the cr4 register
  18. smap
    • Supervisor Mode Access Protection
    • Similar to smep: in ring 0 (kernel mode), user space memory cannot be accessed
    • Recorded in the cr4 register
  19. kpti
    • kernel page-table isolation
    • cr3 register
      • page-table entry
    • Effect is similar to smep + smap
    • Meltdown, Spectre
    • $ dmesg | grep "Kernel/User page tables isolation: enabled"
  20. Leak
    • CVE
      • 1 day
      • 0 day
    • Vulnerable implementation in kernel module
      • The concepts are the same as in the usual (user-space) case
      • uninitialized memory
      • oob or arbitrary read/write
      • race condition
      • ...
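  21. Leak
    • If memory is not cleared or initialized, copy_to_user() can write addresses back to user space, leaking the residual contents.
    • When the module does not clear or initialize the kernel heap memory it requests, the exploit binary in user space performs operations that make the kernel allocate heap memory, so that kernel text, kernel stack, and kernel heap addresses are left behind on the kernel heap and the chunk the kernel module later receives contains them.
    • The idea is similar to user-space heap exploitation: free a chunk into the unsorted bin so that a libc address appears on the heap.
    • The goal is to find kernel structures that serve this purpose well and get them allocated.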
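  22. useful structure for leak
    • Allocation and release are easy to control from user space
    • A suitable size, so the same chunk is likely to be handed out again later
      • heap spray
      • depending on the case, whether the kmalloc() size can be controlled
    • The structure is full of pointers
      • kernel text, heap, stack
      • vtable: function pointers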
  23. useful structure for leak
    • tty_struct (0x2e0, 0x2c0): base, heap
    • shm_file_data (0x20): base, heap
    • seq_operations (0x20): base
    • msg_msg (0x30 ~ 0x1000): heap
    • subprocess_info (0x60): base, heap
    • ...
    • Any other structure you find usable
    • Structure sizes vary with the kernel version
    • https://ptr-yudai.hatenablog.com/entry/2020/03/16/165628
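Once one of these structures leaks a kernel text pointer ("base" in the list above), the KASLR slide follows from simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and `STATIC_KERNEL_BASE` assumes the common x86-64 default text base with KASLR off, which should be confirmed against the extracted vmlinux.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default x86-64 kernel text base when booted with nokaslr. */
#define STATIC_KERNEL_BASE 0xffffffff81000000ULL

/* Slide = leaked runtime address - address of the same symbol in vmlinux. */
uint64_t kaslr_slide(uint64_t leaked_addr, uint64_t static_addr)
{
    assert(leaked_addr >= static_addr);
    return leaked_addr - static_addr;
}

/* Translate any static vmlinux address (symbol, ROP gadget) to its runtime address. */
uint64_t rebase(uint64_t static_addr, uint64_t slide)
{
    return static_addr + slide;
}
```

With the slide known, every gadget address found by ropper in vmlinux can be passed through `rebase()`; `rebase(STATIC_KERNEL_BASE, slide)` gives the runtime kernel base.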
  24. tty_struct
    • Leak: base, heap
    • Size: 0x2c0
      • This size is rarely used, so it is easy to get the same chunk back
    • Easy to trigger from user land
        int pfd = open("/dev/ptmx", O_RDWR|O_NOCTTY);
    • tty_operations
      • Similar to a vtable: stores the function pointers for tty operations such as open, write, close, ioctl ...
      • With a UAF or similar bug, overwrite the struct tty_operations pointer so that it points to a controlled region; for example, replace the write operation pointer, and a write(pfd, ,) on the tty then gives control of kernel rip
        struct tty_struct {
                int magic;
                struct kref kref;
                struct device *dev;
                struct tty_driver *driver;
                const struct tty_operations *ops;
                int index;
                /* Protects ldisc changes: Lock tty not pty */
                struct ld_semaphore ldisc_sem;
                struct tty_ldisc *ldisc;
                ...
  25. tty_operations
    • include/linux/tty_driver.h
        struct tty_operations {
                struct tty_struct * (*lookup)(struct tty_driver *driver,
                                struct file *filp, int idx);
                int (*install)(struct tty_driver *driver, struct tty_struct *tty);
                void (*remove)(struct tty_driver *driver, struct tty_struct *tty);
                int (*open)(struct tty_struct * tty, struct file * filp);
                void (*close)(struct tty_struct * tty, struct file * filp);
                void (*shutdown)(struct tty_struct *tty);
                void (*cleanup)(struct tty_struct *tty);
                int (*write)(struct tty_struct * tty,
                                const unsigned char *buf, int count);
                int (*put_char)(struct tty_struct *tty, unsigned char ch);
                void (*flush_chars)(struct tty_struct *tty);
                int (*write_room)(struct tty_struct *tty);
                int (*chars_in_buffer)(struct tty_struct *tty);
                int (*ioctl)(struct tty_struct *tty,
                                unsigned int cmd, unsigned long arg);
                ....
  26. heap spray tty_struct
        /* each open() of /dev/ptmx allocates a tty_struct */
        for( int i = 0 ; i < 0x100 ; ++i )
                pfd[i] = open( "/dev/ptmx" , O_RDWR | O_NOCTTY ); // tty_struct
        /* closing them frees the tty_structs again */
        for( int i = 0 ; i < 0x100 ; ++i )
                close(pfd[i]);
  27. double fetch
    • A race condition between kernel space and user space
    • The kernel reads the same user space data twice, which opens a window for a race condition
  28. double fetch (diagram: User Space program | Kernel Space kernel module)
    • The program issues a syscall/ioctl; user memory at 0x401000 holds 0x30
    • 1st fetch (check): the kernel module reads the value with copy_from_user()
  29. double fetch
    • 1st fetch (check): copy_from_user() reads 0x30 from 0x401000
    • check: 0x30 < 0x100 -> True
    • 2nd fetch (use) comes next
  30. double fetch
    • 1st fetch (check): copy_from_user() reads 0x30; check 0x30 < 0x100 -> True
    • 2nd fetch (use): copy_from_user() reads 0x401000 again and still gets 0x30
  31. double fetch (diagram: User Space program | Kernel Space kernel module)
    • The program issues a syscall/ioctl; user memory at 0x401000 holds 0x30
    • 1st fetch (check): the kernel module reads the value with copy_from_user()
  32. double fetch
    • check: 0x30 < 0x100 -> True
    • 2nd fetch (use) has not happened yet
  33. double fetch
    • After the check has passed, user space modifies the data: 0x401000 now holds 0x1000
  34. double fetch
    • 2nd fetch (use): copy_from_user() reads 0x401000 again and gets 0x1000
    • 0x1000 < 0x100 is the comparison that would have been needed
  35. double fetch
    • 0x1000 < 0x100 -> False, but the check was already passed with 0x30
  36. double fetch
    • The kernel module uses 0x1000, a value that never passed the check
    • Pwned ☠
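The sequence in the diagrams can be reproduced deterministically in plain user-space C. This is a simulation under stated assumptions, not kernel code: `fetch_user_len()` stands in for `copy_from_user()`, the `attacker_active` flag stands in for a racing user thread that rewrites the value right after the first fetch, and all names are invented for the example. It contrasts the vulnerable check-then-refetch pattern with the usual fix of fetching once into a local copy.

```c
#include <assert.h>
#include <stddef.h>

#define LIMIT 0x100

/* Stand-in for the user memory at 0x401000 in the diagrams. */
static volatile size_t user_len;
/* When nonzero, a simulated racing thread rewrites user_len after the 1st fetch. */
static int attacker_active;
static int fetches;

/* Stand-in for copy_from_user() of a single size_t. */
static size_t fetch_user_len(void)
{
    size_t value = user_len;
    fetches++;
    if (attacker_active && fetches == 1)
        user_len = 0x1000;          /* "modify user data" step */
    return value;
}

/* Prepare one scenario; the starting value must pass the check. */
void reset_user(size_t len, int attack)
{
    assert(len < LIMIT);
    user_len = len;
    attacker_active = attack;
    fetches = 0;
}

/* Vulnerable: validates one fetch, then uses a second fetch.
 * Returns the length that would be used, or 0 if the check rejects it. */
size_t handler_double_fetch(void)
{
    if (fetch_user_len() >= LIMIT)  /* 1st fetch (check) */
        return 0;
    return fetch_user_len();        /* 2nd fetch (use) */
}

/* Fixed: fetch once, then check and use the same local copy. */
size_t handler_single_fetch(void)
{
    size_t len = fetch_user_len();
    if (len >= LIMIT)
        return 0;
    return len;
}
```

With the attacker active, `handler_double_fetch()` ends up using 0x1000 even though only 0x30 was checked, while `handler_single_fetch()` uses the checked 0x30. In a real exploit the modification comes from a second thread and has to win the timing window.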
  37. ret2user
    • User mode cannot access kernel space, but kernel mode can access user space
    • While executing in kernel mode, return to user space and run user code with ring 0 privileges
      • control kernel rip
    • Status Switch
      • user space to kernel space
      • kernel space to user space
      • arch/x86/entry/entry_64.S
  38. ret2user
    • Status Switch
      • kernel space to user space
        • Restore GS value by swapgs instruction
        • iret instruction
          • iretq
          • Stored register value at stack
    • (diagram, stack starting at kernel rsp: swapgs ; ret | iretq | user space rip | user cs | user rflags | user sp | user ss)
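The five values that iretq pops can be written down as a C struct, which makes the required stack order explicit. This is only a layout sketch based on the x86-64 architecture definition; the type name `iretq_frame` and the helper `make_iretq_frame` are hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In 64-bit mode iretq pops, from the lowest stack address upward:
 * RIP, CS, RFLAGS, RSP, SS. */
struct iretq_frame {
    uint64_t rip;     /* user space rip */
    uint64_t cs;      /* user cs */
    uint64_t rflags;  /* user rflags */
    uint64_t rsp;     /* user sp */
    uint64_t ss;      /* user ss */
};

static_assert(sizeof(struct iretq_frame) == 5 * sizeof(uint64_t),
              "the frame is five 8-byte slots with no padding");
static_assert(offsetof(struct iretq_frame, rip) == 0,
              "rip is popped first");

/* Fill a frame from previously saved user-mode register values. */
struct iretq_frame make_iretq_frame(uint64_t rip, uint64_t cs, uint64_t rflags,
                                    uint64_t rsp, uint64_t ss)
{
    struct iretq_frame frame = { rip, cs, rflags, rsp, ss };
    return frame;
}
```

The cs, rflags, sp and ss values are the ones a user-space program records before entering the kernel; rip is the user-space address where execution should resume.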
  39. ret2user
    • Status Switch
      • kernel space to user space
        • Save status
        /* Intel-syntax inline asm: build with gcc -masm=intel */
        size_t user_cs, user_ss, user_rflags, user_sp;
        void save_status()
        {
                __asm__("mov user_cs, cs;"
                        "mov user_ss, ss;"
                        "mov user_sp, rsp;"
                        "pushf;"
                        "pop user_rflags;"
                        );
                puts("[*]status has been saved.");
        }
  40. Bypass smep
    • ROP
      • Same idea as bypassing NX
      • Pivot the kernel stack (rsp) to user space and ROP there
      • Or ROP directly in the kernel
  41. Bypass smap
    • ROP on a user-space stack is dead
    • Disallows explicit supervisor-mode data accesses to user-mode pages
    • how do copy_from_user() and copy_to_user() work?
  42. Bypass smap
    • arch/x86/lib/copy_user_64.S
    • instruction
      • stac
      • clac
        SYM_FUNC_START(copy_user_generic_unrolled)
                ASM_STAC
                cmpl $8,%edx
                jb 20f          /* less then 8 bytes, go to byte copy loop */
                ALIGN_DESTINATION
                movl %edx,%ecx
                andl $63,%edx
                shrl $6,%ecx
                jz .L_copy_short_string
                ...
                movl %edx,%ecx
        21:     movb (%rsi),%al
        22:     movb %al,(%rdi)
                incq %rsi
                incq %rdi
                decl %ecx
                jnz 21b
        23:     xor %eax,%eax
                ASM_CLAC
                ret
  44. Bypass smap
    • arch/x86/lib/copy_user_64.S
    • instruction
      • stac - allow
      • clac - disallow
    • copy_user_generic_unrolled runs ASM_STAC before the copy loop and ASM_CLAC before ret
  45. Bypass smap
    • instruction
      • stac - allow
      • clac - disallow
      • Only allowed in kernel mode, they fault in user-space.
  46. Bypass smap
    • Overwrite cr4 register
      • First do some work in the kernel (ROP, ...) to overwrite the cr4 register -> 0x6f0
      • With smep and smap turned off, return to user space to execute code or ROP
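The value 0x6f0 works because SMEP and SMAP are bits 20 and 21 of cr4, and both are clear in it. The small sketch below decodes those bits; the macro and function names are illustrative, and 0x3006f0 is just an example of a cr4 value with both protections on, not one taken from the deck.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_SMEP (1ULL << 20)   /* Supervisor Mode Execution Protection */
#define CR4_SMAP (1ULL << 21)   /* Supervisor Mode Access Protection */

/* The value from the slide leaves both protection bits clear. */
static_assert((0x6f0ULL & (CR4_SMEP | CR4_SMAP)) == 0,
              "0x6f0 has SMEP and SMAP disabled");

int cr4_smep_enabled(uint64_t cr4)
{
    return (cr4 & CR4_SMEP) != 0;
}

int cr4_smap_enabled(uint64_t cr4)
{
    return (cr4 & CR4_SMAP) != 0;
}

/* Compute the cr4 value with only the SMEP and SMAP bits cleared. */
uint64_t cr4_clear_smep_smap(uint64_t cr4)
{
    return cr4 & ~(CR4_SMEP | CR4_SMAP);
}
```

Clearing only those two bits in the example value 0x3006f0 gives exactly the 0x6f0 shown on the slide.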
  47. ret2user
    • Constraints
      • bypass smep
        • Pivot the kernel stack (rsp) to user space and ROP there, or ROP directly in the kernel
      • bypass smap
        • First do some work in the kernel (ROP, ...) to overwrite the cr4 register -> 0x6f0
        • With smep and smap turned off, return to user space to execute code or ROP
      • kpti
        • fix cr3 register - page table
        • swapgs_restore_regs_and_return_to_usermode()
  48. privilege escalation
    • ROP
      • commit_creds(prepare_kernel_cred(0))
    • ret2user -> system("/bin/sh")
      • spawn a root shell!
  49. privilege escalation
    • ROP
      • commit_creds(prepare_kernel_cred(0))
    • ret2user -> system("/bin/sh")
      • spawn a root shell!
    • ROOT ☠
  50. modprobe_path
    • kernel global variable
    • default path: /sbin/modprobe
    • $ cat /proc/sys/kernel/modprobe
    • When a file in an executable format the kernel does not recognize is executed, the kernel runs the file defined by this path with root privileges.
  51. modprobe_path
    • Call chain:
      • sys_execve
      • do_execve()
      • do_execveat_common()
      • bprm_execve()
      • exec_binprm()
      • search_binary_handler()
      • request_module()
      • call_modprobe()
      • call_usermodehelper_exec()
        static int search_binary_handler(struct linux_binprm *bprm)
        {
                ...
                if (need_retry) {
                        if (printable(bprm->buf[0]) && printable(bprm->buf[1]) &&
                            printable(bprm->buf[2]) && printable(bprm->buf[3]))
                                return retval;
                        if (request_module("binfmt-%04x", *(ushort *)(bprm->buf + 2)) < 0)
                                return retval;
                        need_retry = false;
                        goto retry;
                }
                ...
  52. modprobe_path
    • exploitation
      • Overwrite modprobe_path with the path of your own shell script, /tmp/x
        • $ echo -ne '#!/bin/sh\n/bin/chmod 777 /flag' > /tmp/x
        • $ chmod +x /tmp/x
      • Execute a file with a broken (unrecognized, non-printable) format
        • $ echo -ne '\xff\xff\xff\xff' > /tmp/fake
        • $ chmod +x /tmp/fake && /tmp/fake
      • The kernel runs /tmp/x with root privileges
      • cat /flag
  53. modprobe_path
    • exploitation
      • Overwrite modprobe_path with the path of your own shell script, /tmp/x
        • $ echo -ne '#!/bin/sh\n/bin/chmod 777 /flag' > /tmp/x
        • $ chmod +x /tmp/x
      • Execute a file with a broken (unrecognized, non-printable) format
        • $ echo -ne '\xff\xff\xff\xff' > /tmp/fake
        • $ chmod +x /tmp/fake && /tmp/fake
      • The kernel runs /tmp/x with root privileges
      • cat /flag
    • ROOT ☠
  54. userfaultfd
    • syscall - 323
    • Register a userfault memory region and implement your own page fault handler
      • You control how the page fault is handled
    • When the kernel runs copy_from_user() or copy_to_user(), accessing user memory triggers a page fault, which the page fault handler can then process as it sees fit.
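  55. userfaultfd
    • Effect when applied to exploitation
      • race condition friendly
      • For example, sleeping inside the page fault handler stalls the execution flow in the kernel
      • Perform other operations first and only then finish handling the fault, controlling the order of execution
      • The scenario that triggers the race condition then happens reliably, with no need to gamble on timing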
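  56. setxattr
    • syscall - 188
    • From user space, setxattr can be used directly to make kvmalloc() allocate any size in the range 1 ~ 65536 (0x10000) and copy user land data into it.
    • setxattr + userfaultfd combo
      • Often chained with userfaultfd to stall the copy_from_user() in the middle; setxattr only continues once you decide to finish handling the page fault, which indirectly controls the timing of kfree() and makes it convenient to set up a UAF.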
  57. setxattr
    • linux-5.9.12/fs/xattr.c#510
        static long
        setxattr(struct dentry *d, const char __user *name, const void __user *value,
                 size_t size, int flags)
        {
                int error;
                void *kvalue = NULL;
                char kname[XATTR_NAME_MAX + 1];

                if (flags & ~(XATTR_CREATE|XATTR_REPLACE))
                        return -EINVAL;

                error = strncpy_from_user(kname, name, sizeof(kname));
                if (error == 0 || error == sizeof(kname))
                        error = -ERANGE;
                if (error < 0)
                        return error;

                if (size) {
                        if (size > XATTR_SIZE_MAX)
                                return -E2BIG;
                        kvalue = kvmalloc(size, GFP_KERNEL);
                        if (!kvalue)
                                return -ENOMEM;
                        if (copy_from_user(kvalue, value, size)) {
                                error = -EFAULT;
                                goto out;
                        }
                        if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
                            (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
                                posix_acl_fix_xattr_from_user(kvalue, size);
                        else if (strcmp(kname, XATTR_NAME_CAPS) == 0) {
                                error = cap_convert_nscap(d, &kvalue, size);
                                if (error < 0)
                                        goto out;
                                size = error;
                        }
                }

                error = vfs_setxattr(d, kname, kvalue, size, flags);
        out:
                kvfree(kvalue);

                return error;
        }
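  59. setxattr + userfaultfd
    • Combine userfaultfd to stall the copy_from_user() inside setxattr, so that the chunk is not kfree()'d right away
    • When the chunk needs to be kfree()'d, the self-implemented userfaultfd page fault handler copies in a page prepared with mmap to finish handling the fault, which lets setxattr continue and complete the kfree()
    • This removes the drawback of setxattr: it gives full control over the timing, size, and content of kmalloc(), but not over the timing of kfree().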
  60. setxattr + userfaultfd
    • Once chained, user land can directly:
      • Call kmalloc() at any time
      • Use a size range that covers almost every need: 1 ~ 65536 (0x10000)
      • Fully control the chunk content from user land
      • With userfaultfd, choose when kfree() happens
  61. msg_msg
    • int qid = msgget(IPC_PRIVATE, 0644 | IPC_CREAT)
    • msgsnd(qid, &msgbuf, real_size - 0x30, 0)
      • kmalloc(size+0x30)
      • The msgbuf content is copied to chunk + 0x30; the first 0x30 bytes are the msg_msg header
    • msgrcv(qid, &msgbuf, real_size - 0x30, 1, 0)
      • kfree
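  62. msg_msg
    • Pros
      • kmalloc() and kfree() are easy to control from user land
      • The code is easy to write
      • The timing of kfree() is easier to control than with setxattr + userfaultfd: just call msgrcv()
    • Cons
      • The first 0x30 bytes of the kmalloc'd chunk are hard to control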
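The `real_size - 0x30` arithmetic from the msg_msg slides can be wrapped in small helpers. This is a sketch under the deck's own numbers (a 0x30-byte header and chunks up to 0x1000); the names `spray_msg`, `msg_text_len_for_chunk`, `msg_alloc` and `msg_free` are invented for the example, while `msgsnd()` and `msgrcv()` are the standard System V IPC calls shown on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MSG_HDR_SIZE  0x30     /* header size stated on the slide */
#define MSG_MAX_CHUNK 0x1000   /* largest chunk size listed for msg_msg */

/* User-space message buffer: type field followed by the payload. */
struct spray_msg {
    long mtype;
    char mtext[MSG_MAX_CHUNK - MSG_HDR_SIZE];
};

/* Payload length to pass to msgsnd()/msgrcv() for a target chunk size. */
size_t msg_text_len_for_chunk(size_t chunk_size)
{
    assert(chunk_size > MSG_HDR_SIZE && chunk_size <= MSG_MAX_CHUNK);
    return chunk_size - MSG_HDR_SIZE;
}

/* Send one message so that the kernel allocates a chunk of chunk_size bytes;
 * data fills the part of the chunk after the header. Returns msgsnd()'s result. */
int msg_alloc(int qid, size_t chunk_size, const void *data, size_t data_len)
{
    struct spray_msg msg;
    size_t text_len = msg_text_len_for_chunk(chunk_size);

    memset(&msg, 0, sizeof msg);
    msg.mtype = 1;
    if (data != NULL) {
        if (data_len > text_len)
            data_len = text_len;
        memcpy(msg.mtext, data, data_len);
    }
    return msgsnd(qid, &msg, text_len, 0);
}

/* Receive the message again, which frees the chunk. Returns msgrcv()'s result. */
ssize_t msg_free(int qid, size_t chunk_size)
{
    struct spray_msg msg;
    return msgrcv(qid, &msg, msg_text_len_for_chunk(chunk_size), 1, 0);
}
```

A typical sequence is `msgget(IPC_PRIVATE, 0644 | IPC_CREAT)` to obtain `qid`, `msg_alloc(qid, 0x400, buf, len)` to place controlled data in a 0x400-byte chunk, and `msg_free(qid, 0x400)` when the chunk should be released.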