
Using LLVM for malware deobfuscation

Yuma Kurogome

January 09, 2014

Transcript

  1. Title
     WIP Presentation: Using LLVM for malware deobfuscation
     B1 Yuma Kurogome (@ntddk) a.k.a. gomachan
     Supervisor: none
  2. Background
     ▪ Malware analysis is becoming more difficult
       - APT
       - Botnet
       - Code obfuscation, etc.
     ▪ Many obfuscation tools and methods exist
     ▪ No good deobfuscation tool is available
  3. Purpose
     ▪ Build a useful deobfuscator
       - Use the LLVM code optimizer
       - Implement an x86 frontend
       ➔ It is difficult to build an AST from x86 native code
     (Diagram: x86 Frontend)
  4. Related work
     OptiCode: Machine Code Deobfuscation for Malware Analysis,
     Nguyen Anh Quynh, Presentation, SyScan SG, Apr 2013
     ▪ Supports many obfuscation techniques
       - Inserting dead instructions
       - Inserting NOP-semantic instructions
       - Inserting unreachable code
       - Inserting a branch instruction to the next instruction
     ▪ Own x86 frontend (details unknown) and the default LLVM optimizer
       - Generate a control flow graph (CFG) consisting of basic blocks (BB) from machine code
       - Constant folding
       - Eliminate dead store instructions
       - Combine instructions
       - Simplify the CFG
       - Merge BBs
     In this work, I wanted to reproduce OptiCode
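The two simplest optimizations listed on this slide, constant folding and removal of dead instructions, can be illustrated on a toy register IR. The following is only a minimal sketch under stated assumptions: the tuple-based instruction format, the `OPS` table, and the function names are hypothetical and are not taken from OptiCode or LLVM. It handles straight-line code without side effects only.

```python
# Toy straight-line IR: each instruction is (dest, op, arg1, arg2).
# Arguments are int constants or register names (str). Hypothetical format,
# used only to illustrate the idea; it is not OptiCode's or LLVM's IR.
OPS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "xor": lambda x, y: x ^ y,
}

def constant_fold(program):
    """Rewrite instructions whose operands are all known constants as 'mov'."""
    known = {}  # register -> constant value known at this point
    folded = []
    for dest, op, a, b in program:
        # Substitute registers that currently hold a known constant.
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if op == "mov" and isinstance(a, int):
            known[dest] = a
            folded.append((dest, "mov", a, None))
        elif op in OPS and isinstance(a, int) and isinstance(b, int):
            value = OPS[op](a, b) & 0xFFFFFFFF  # model 32-bit registers
            known[dest] = value
            folded.append((dest, "mov", value, None))
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
            folded.append((dest, op, a, b))
    return folded

def eliminate_dead(program, live_out):
    """Backward pass: drop instructions whose result is never read."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(program):
        if dest not in live:
            continue  # dead instruction
        live.discard(dest)
        live.update(x for x in (a, b) if isinstance(x, str))
        kept.append((dest, op, a, b))
    kept.reverse()
    return kept

def deobfuscate(program, live_out):
    return eliminate_dead(constant_fold(program), live_out)

obfuscated = [
    ("eax", "mov", 5, None),
    ("ebx", "mov", 7, None),        # dead: overwritten before any use
    ("ecx", "add", "eax", 3),
    ("ebx", "xor", "ecx", "ecx"),   # always yields 0
    ("eax", "add", "ecx", "ebx"),
]
print(deobfuscate(obfuscated, {"eax"}))
```

Only `eax` is treated as live at the end of the block, so everything that does not contribute to its final value is removed after folding.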
  5. Related work
     Dynamically Translating x86 to LLVM using QEMU,
     Vitaly Chipounov, George Candea, 2010
     ▪ QEMU has a dynamic translator (now the Tiny Code Generator)
       - Target code → IR → host code
       - Disassembler
       - Micro-operations
       - Mapping
     ▪ Uses an LLVM code dictionary instead of the host code dictionary
       - Referred to when mapping
  6. Implementation
     ▪ Modify the QEMU dynamic translator
       - Tiny Code Generator (TCG) ➔ BB
       - Easy to map registers to LLVM IR
       - Generate the CFG from the LLVMContext class
     ▪ Use the LLVM optimizer
       - Inserted dead code ➔ -dse, -simplifycfg
       - Substitution with equivalent instructions ➔ -constprop, -instcombine
       - Reordered instructions ➔ -instcombine
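The mapping from obfuscation technique to optimizer pass on this slide can be written down as a small table that assembles an `opt` command line. This is a sketch only: the technique labels, the function name, and the file names `in.ll` and `out.ll` are hypothetical, and the command is built but not executed. The pass flags are the legacy-style names shown on the slide; newer LLVM releases select passes through a different option syntax and may no longer ship `-constprop`.

```python
# Technique -> passes, as listed on the slide (legacy-style opt flags).
PASSES_BY_TECHNIQUE = {
    "dead code insertion": ["-dse", "-simplifycfg"],
    "equivalent instruction substitution": ["-constprop", "-instcombine"],
    "instruction reordering": ["-instcombine"],
}

def build_opt_command(input_ll, output_ll, techniques):
    """Return an argv list for opt that covers the given techniques.

    Passes are kept in first-seen order and duplicates are dropped.
    The command is only constructed here, not run.
    """
    passes = []
    for technique in techniques:
        for flag in PASSES_BY_TECHNIQUE[technique]:
            if flag not in passes:
                passes.append(flag)
    # -S asks opt to emit textual IR; -o names the output file.
    return ["opt", "-S", *passes, input_ll, "-o", output_ll]

command = build_opt_command("in.ll", "out.ll", list(PASSES_BY_TECHNIQUE))
print(" ".join(command))
```

With an LLVM version that still accepts these flags, the resulting list could be handed to `subprocess.run`.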
  7. Problem
     ▪ The methods described in OptiCode can be deobfuscated
       - Except for opaque predicates
     However,
     ▪ The QEMU dynamic translator has problems
       - Dependence on context
       - Cannot interpret the Win32 API
       - Overhead
     ▪ Optimice is more sophisticated than my work
       - Deobfuscation plugin for IDA
       - Uses the CFG and BBs generated by IDA
       - Overcomes the problems of my work
     ▪ The evaluation method is ambiguous...
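To make the opaque predicate limitation concrete, here is a small illustration. The predicate `x * (x + 1) % 2 == 0` is a common textbook example chosen for this sketch, not one taken from the slides, and the function names are hypothetical. The condition is true for every integer, but because `x` is only known at run time, constant folding alone cannot decide the branch, so the junk path survives optimization.

```python
def opaque_predicate(x):
    # The product of two consecutive integers is always even.
    # Masking to 32 bits models register wraparound and preserves parity.
    return ((x * (x + 1)) & 0xFFFFFFFF) % 2 == 0

def obfuscated_abs_diff(a, b, x):
    """Computes abs(a - b), hidden behind an always-true branch."""
    if opaque_predicate(x):
        return abs(a - b)   # real code
    return a * 31 + b       # junk code, never executed

print(obfuscated_abs_diff(3, 10, 12345))
```

Removing the junk branch requires proving a property of `x` for all inputs, which is a different kind of reasoning from evaluating constants.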
  8. Future work
     ▪ Continue the research for TERM
       - How can we deobfuscate malware?
     ▪ Establish an evaluation method
     ▪ Bring in semantics
       - Abstract interpretation
       - Predicate logic
       - There is little existing research...
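As a rough idea of what abstract interpretation could contribute, the sketch below tracks only the parity of values and uses it to show that `x * (x + 1)` is always even, the kind of fact needed to resolve an always-true branch condition. Everything here is an assumption made for illustration: the parity domain, the case split on `x`, and all names are hypothetical and do not come from the slides or from any existing tool.

```python
# Abstract parity domain: a value is known even, known odd, or unknown (top).
EVEN, ODD, TOP = "even", "odd", "top"

def abs_add(a, b):
    """Parity of a sum: equal parities give even, different give odd."""
    if TOP in (a, b):
        return TOP
    return EVEN if a == b else ODD

def abs_mul(a, b):
    """Parity of a product: any even factor makes the product even."""
    if EVEN in (a, b):
        return EVEN
    if TOP in (a, b):
        return TOP
    return ODD

def parity_of_product(x_parity):
    """Abstract value of x * (x + 1); the constant 1 is odd."""
    return abs_mul(x_parity, abs_add(x_parity, ODD))

def product_is_always_even():
    # Case split on the parity of x: both cases must yield EVEN.
    return all(parity_of_product(p) == EVEN for p in (EVEN, ODD))

print(parity_of_product(TOP))      # no case split: parity stays unknown
print(product_is_always_even())    # with case split: the fact is proved
```

Without the case split the analysis loses the relation between `x` and `x + 1`, which hints at why precision is the hard part of this direction.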