Software components become more larger and complicated. → Many people try to automate analyzing process. In 2016, DARPA hosted Cyber Grand Challenge (CGC) → Almost all winners used Fuzzing technique in the vulnerability detection process.
program with it. Black box Fuzzing (Synopsys defensics, zzuf) “a” “kqeqert” “G\x13\x02” “iohbpofi9qnpiof” “3i129074g” They don’t observe program behavior, learn nothing. They continue dumb generation forever
(input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”)) 2536 124 6210 8147 297 4010 Instrument unique random numbers to every basic blocks by compiler.
(isinteger(input)) if (!strcmp(input, “HTTP”)) 2536 124 297 4010 Calculate hash keys based on random numbers on program path. Map hash keys to memory key = nextBB ^ prevBB >> 1
make git clone https://github.com/google/AFL wget https://ftp.gnu.org/gnu/binutils/binutils-2.26.tar.gz tar xf binutils-2.26.tar.gz cd binutils-2.26 CC=/path/to/AFL/afl-gcc ./configure --disable-werror make
(input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”)) 2536 124 6210 8147 297 4010 Feedback mechanism (edge coverage) If we only have a binary executable... Instrument unique random numbers to every basic blocks by emulator.
Branch Vunlerable Control Flow Graph(CFG) of target program “GET” “HTTP” “501” “HTTP1.?” if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
“HTTP1.?” if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”)) ・・・・・ “kq” “\xfeXp\x2a” Mutation
Scheduler (Selection) Mutation If generated input leads new program coverage, F(input) = 1. Which inputs should be mutated? Generate next inputs by mutatting parent inputs . Grey-box Fuzzer (AFL, libfuzzer) Coverage information + Genetic Algorithm (GA) Generation Mutation Mutation Mutation
-t <time out> -m <memory limit> -o <output dir> -- <command line> @@ cd AFL && make git clone https://github.com/google/AFL ls <output dir>/queue # Show all saved seeds ls <output dir>/crashes # Show crash inputs
Flow Graph(CFG) of target program “GET<\x00” “HTTP” “501” “HTTP1.?” if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
“HTTP1.?” if (input[0] == ‘G’) if (isinteger(input)) if (!strcmp(input, “HTTP”)) “a” (1) Execute program with initial input “a” (2) Build constraints from trace (input[0] != “G”) & (strcmp(input, HTTP)) & (!isinteger(input)) (3) Negate one of the constraints (input[0] == “G”) & (strcmp(input, HTTP)) & (!isinteger(input)) (4) Solve constraints by SMT solver → next input “G”
(input[2] == ‘T’) “GET” if (input[0] == ‘G’) Next seed “GET” Congratulation! We found 3 program path lead by “G”, “GE” and “GET” inputs. White box Fuzzer (SAGE) It’s also called “Dynamic Symbolic Execution” (DSE)
is hard to overcome the long magic number comparison. if (input == 0xdeadbeefcafebabe) { crash(); } Grey box way White box way Mutation “ 0x0” “0xdeadbeefcafebabe ” P(crash) = 1/(2^64) SMT solve “ 0x0” “0xdeadbeefcafebabe ”
if (input == 0xdeadbeefcafebabe) if (input[0] == ‘A’’) if (input[2] < 10) Nick Stephen Driller: Augmenting Fuzzing Through Selective Symbolic Execution [NDSS16] Grey-box fuzzing Dynamic Symbolic Execution (DSE)
== ‘G’) if (isinteger(input)) if (!strcmp(input, “HTTP”)) 2536 124 297 4010 Calculate hash keys based on random numbers on program path. Sergej Schumilo kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels[UESNIX17]
code block to instrument additional instructions, reveal basic block coverage. Antonio Chizpurfle: A Gray-Box Android Fuzzer for Vendor Service Customizations [ISSRE17] Sushant Dinesh RetroWrite: Statically Instrumenting COTS Binaries for Fuzzing and Sanitization[S&P20] They use frida, rewrite Android system services
favorite real world programs. Try to find bug or vulnerabilities from them. You can use any fuzzers. You should try many (program, fuzzer) combinations!