Defeating APT10 Compiler-level Obfuscations

2019, Virus Bulletin, REcon, hack.lu, AVAR
https://www.virusbulletin.com/blog/2020/03/vb2019-paper-defeating-apt10-compiler-level-obfuscations/
https://www.youtube.com/watch?v=0DvRAP9VhJA (REcon)
https://www.youtube.com/watch?v=e_uLcgHRs1Y (hack.lu)
https://github.com/vmware-archive/HexRaysDeob

Compiler-level obfuscations, like opaque predicates and control flow flattening, are starting to be observed in the wild and are likely to become a challenge for malware analysts and researchers. Both methods hinder malware analysis by adding unused logic, performing needless calculations, and altering the control flow so that it is no longer linear. Manual analysis of malware that uses these obfuscations is painful and time-consuming.

ANEL (also referred to as UpperCut) is a RAT used by APT10, typically targeting Japan. All recent ANEL samples are obfuscated with opaque predicates and control flow flattening. In this presentation I will explain how to de-obfuscate the ANEL code automatically by modifying the existing IDA Pro plugin HexRaysDeob.

Specifically, the following topics will be included:
- Disassembler tool internals (IDA Pro IL microcode)
- How to define and track opaque predicate patterns for their elimination
- How to break control flow flattening while handling various conditional/unconditional jump cases, even when the flattened code depends heavily on opaque predicate conditions and has multiple switch dispatchers

The modified tool is available publicly and this implementation has been found to deobfuscate approximately 92% of encountered functions in the tested sample. Additionally, most of the failed functions can be properly deobfuscated in IDA Pro 7.3. This provides researchers with an approach with which to attack such obfuscations, which could be adopted by other families and other threat groups.

Takahiro Haruyama

October 01, 2019

Transcript

  1. Who am I?
     • Takahiro Haruyama (@cci_forensics)
       – Principal Threat Researcher
         – Carbon Black’s Threat Analysis Unit (TAU)
       – Reverse-engineering cyber espionage malware
         – linked to PRC/Russia/DPRK
       – Past public research presentations
         – binary diffing, Winnti/PlugX malware research
         – forensic software exploitation, memory forensics
     Virus Bulletin 2019
  2. Overview
     • Motivation and Approach
     • Microcode
     • Opaque Predicates
     • Control Flow Flattening
     • IDA 7.2 Issues and 7.3 Improvements
     • Wrap-up
  3. APT10 ANEL [1][2]
     • RAT program used by APT10
       – observed only in Japan
     • ANEL versions 5.3.0 and later are obfuscated with
       – opaque predicates
       – control flow flattening
  4. Motivation and Approach
     • automate ANEL code de-obfuscations
       – The obfuscations looked similar to the ones described in Hex-Rays blog [3]
       – The IDA plugin HexRaysDeob [4] didn’t work
         – It was made for another variant of the obfuscations
       – I investigated the causes then modified HexRaysDeob to work for ANEL samples [8]
  5. Microcode
     • intermediate representation (IR) used by IDA Pro decompiler
     • optimized in 9 maturity levels
       – transformed from low-level to high-level IRs [3]
     [Figure: maturity levels ordered from low to high]
  6. Key Structures [5]
     • mbl_array_t holds a list of mblock_t
     • each mblock_t holds a list of minsn_t
     • each minsn_t has three mop_t operands (left, right, dest)
     • HexRaysDeob installs two optimizer callbacks: optblock_t and optinsn_t
  7. CFG and Instructions in Microcode Explorer
     [Screenshot labels: CFG (mblock_t), block number, nested instructions (minsn_t), top-level instruction, sub instructions]
  8. Opaque Predicates Summary
     • optinsn_t::func replaces an opaque predicate pattern with another expression
       – called from MMAT_ZERO to MMAT_GLBOPT2
     • ANEL samples require 2 more patterns and data-flow tracking
  9. Pattern1: ~(x * (x - 1)) | -2
     • In the example below,
       – dword_745BB58C = either even or odd
       – dword_745BB58C * (dword_745BB58C - 1) = always even
       – the lowest bit of the negated value becomes 1
       – OR by -2 (0xFFFFFFFE) will always produce the value -1
     • The pattern x * (x-1) will be replaced with 2
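The reasoning on this slide can be checked numerically. The following is a minimal sketch, not code from HexRaysDeob: the function names are hypothetical, and 32-bit wraparound is emulated with a mask because Python integers are unbounded.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit registers


def pattern1(x: int) -> int:
    """Evaluate ~(x * (x - 1)) | -2 with 32-bit wraparound."""
    product = (x * (x - 1)) & MASK32   # product of two consecutive integers: always even
    return (~product | 0xFFFFFFFE) & MASK32


def pattern1_simplified() -> int:
    """Same expression after x * (x - 1) is replaced with the even constant 2."""
    return (~2 | 0xFFFFFFFE) & MASK32


# 0xFFFFFFFF is -1 when read as a signed 32-bit value
results = {pattern1(x) for x in range(-1000, 1000)}
```

Because the product is always even, any even constant gives the same result, which is why substituting 2 preserves the value while removing the dependency on the global variable.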
  10. Pattern2: read-only global variable >= 10 or < 10
      • dword_72DBB588 is always 0
        – it has no initial value (it will be initialized with 0)
        – only read accesses
      • the pattern matching function replaces the global variable with 0
      • other variants
        – the variable - 10 < 0
        – the immediate value can be different, not 10 (e.g., 9)
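As an illustration of the idea, here is a small sketch that finds globals which are only read and never initialized, then folds a comparison against them. The access-list representation and both function names are assumptions made for this example; the real plugin works on microcode operands, not on tuples.

```python
from typing import Iterable, Optional, Set, Tuple


def zero_globals(accesses: Iterable[Tuple[str, str]], initialized: Set[str]) -> Set[str]:
    """Return globals that are read, never written, and have no initial value.

    accesses: (variable name, "r" or "w") pairs collected from the binary (hypothetical input).
    """
    accesses = list(accesses)
    written = {var for var, kind in accesses if kind == "w"}
    read = {var for var, kind in accesses if kind == "r"}
    return read - written - initialized


def fold_compare(var: str, op: str, imm: int, zeros: Set[str]) -> Optional[bool]:
    """Fold `var op imm` to a constant when var is known to be 0, else return None."""
    if var not in zeros:
        return None
    if op == ">=":
        return 0 >= imm
    if op == "<":
        return 0 < imm
    raise ValueError(f"unsupported operator: {op}")


zeros = zero_globals(
    [("dword_72DBB588", "r"), ("counter", "r"), ("counter", "w")],
    initialized=set(),
)
```

The variant `variable - 10 < 0` reduces to the same check, since with the variable fixed at 0 it is equivalent to `variable < 10`.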
  11. Data-flow tracking for the patterns
      • trace back the minsn_t / mblock_t linked lists
      [Figure label: = x * (x - 1) ?]
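A simplified sketch of this kind of backward trace is shown below. The `Insn` and `Block` classes are stand-ins invented for the example (they are not the Hex-Rays SDK types), and the walk assumes a single linear chain of previous blocks rather than a full CFG.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Insn:
    dest: str   # destination operand
    expr: str   # textual form of the source expression


@dataclass
class Block:
    insns: List[Insn]
    prev: Optional["Block"] = None   # previous block in the linked list


def find_definition(block: Block, index: int, var: str) -> Optional[Insn]:
    """Walk backwards from instruction `index`, then through previous blocks,
    and return the most recent instruction that writes `var`."""
    cur: Optional[Block] = block
    i = index - 1
    while cur is not None:
        while i >= 0:
            if cur.insns[i].dest == var:
                return cur.insns[i]
            i -= 1
        cur = cur.prev
        i = len(cur.insns) - 1 if cur is not None else -1
    return None


b0 = Block([Insn("eax", "x * (x - 1)")])
b1 = Block([Insn("ecx", "1"), Insn("edx", "~eax | -2")], prev=b0)
definition = find_definition(b1, 1, "eax")
```

Once the defining instruction is found, a pattern matcher can check whether its expression has the form x * (x - 1).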
  12. Data-flow tracking for the patterns (Cont.)
      • optinsn_t::func passes a null mblock_t pointer if an instruction is not top-level
        – Additional code traces from jnz, then passes the pointer to setl
      [Figure label: = read-only global variable ?]
  13. Control Flow Flattening: block comparison variable
      • The unflattening code translates block comparison variables into block numbers (mblock_t::serial)
      [Figure labels: block comparison variable assignment, block comparison variable comparison]
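The translation step can be sketched as two lookups. This is an illustrative model only: the dictionaries and the function name are assumptions, and every constant except 0x4624F47C (which appears on a later slide as mapping to block #9) is made up.

```python
from typing import Dict


def unflatten(dispatcher_cases: Dict[int, int], assignments: Dict[int, int]) -> Dict[int, int]:
    """Turn dispatcher-mediated jumps into direct edges.

    dispatcher_cases: comparison constant -> serial of the block the dispatcher jumps to
    assignments:      serial of a flattened block -> constant it stores in the
                      block comparison variable before returning to the dispatcher
    Returns: flattened block serial -> serial of its real successor.
    """
    edges: Dict[int, int] = {}
    for serial, constant in assignments.items():
        target = dispatcher_cases.get(constant)
        if target is not None:       # skip constants the dispatcher never compares
            edges[serial] = target
    return edges


cases = {0x4624F47C: 9, 0x11111111: 4}          # second constant is hypothetical
stores = {3: 0x4624F47C, 9: 0x11111111, 4: 0x22222222}
edges = unflatten(cases, stores)
```

Block 4 stores a constant that no dispatcher case matches, so it gets no direct edge; a real implementation would have to decide how to handle such a case.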
  14. Control Flow Flattening: Modifications
      • three main modifications
        – Unflattening in multiple maturity levels
        – Control flow handling with multiple dispatchers
        – Implementation for various jump cases
  15. Unflattening in Multiple Maturity Levels
      • The original implementation works in MMAT_LOCOPT
        – due to the "Odd Stack Manipulations" obfuscation
      • I had to unflatten the ANEL code in later maturity levels
        – The block comparison variable heavily depends on opaque predicate conditions
  16. Unflattening in Multiple Maturity Levels (Cont.)
      • The loop becomes simpler once opaque predicates are broken
      • Unflattening in later maturity levels creates another problem
      [Figure caption: In MMAT_LOCOPT, the block comparison variable 0x4624F47C is translated into block #9]
  17. Unflattening in Multiple Maturity Levels (Cont.)
      • The block will be eliminated in later maturity levels
      • The modified code
        – Links block comparison variables to block addresses in MMAT_LOCOPT
        – Guesses the block numbers in later maturity levels by using each block's address and its instruction addresses
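One way to picture the address-based guessing is the sketch below. It is a simplified assumption about how such a lookup could work, not the plugin's actual algorithm: the `Block` class, the address ranges, and the recorded address 0x401230 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Block:
    serial: int   # block number at the current maturity level
    start: int    # address of the first instruction
    end: int      # address just past the last instruction


def guess_serial(target_ea: int, blocks: List[Block]) -> Optional[int]:
    """Return the serial of the block whose address range contains target_ea."""
    for block in blocks:
        if block.start <= target_ea < block.end:
            return block.serial
    return None


def remap(value_to_ea: Dict[int, int], later_blocks: List[Block]) -> Dict[int, Optional[int]]:
    """Map each block comparison value (recorded with a block address at an early
    maturity level) to a block serial at a later maturity level."""
    return {value: guess_serial(ea, later_blocks) for value, ea in value_to_ea.items()}


recorded = {0x4624F47C: 0x401230}   # value -> address noted in MMAT_LOCOPT (hypothetical)
later_blocks = [Block(1, 0x401000, 0x401200), Block(5, 0x401200, 0x401300)]
mapping = remap(recorded, later_blocks)
```

Addresses survive renumbering because they refer to the original binary, whereas block serials change whenever the optimizer merges or removes blocks.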
  18. Control Flow Handling with Multiple Dispatchers
      • The original implementation assumes an obfuscated function has only one control flow dispatcher
      • Some functions in the ANEL sample have multiple dispatchers
        – up to seven dispatchers in one function
  19. Control Flow Handling with Multiple Dispatchers (Cont.)
      • The modified code
        – catches the hxe_prealloc event then calls the optblock_t::func
          – This event occurs several times in MMAT_GLBOPT1 and MMAT_GLBOPT2
        – utilizes different algorithms
          – control flow dispatcher / first block detection
          – block comparison variable validation
  20. Control Flow Handling with Multiple Dispatchers (Cont.)
      • The modified code detects block comparison variable duplications and applies the most likely variable
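The slide does not say how "most likely" is decided. As a sketch under that caveat, one plausible heuristic is to pick the candidate variable that receives constant assignments in the largest number of flattened blocks; the function name and the variable names below are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def most_likely_variable(assignments: Iterable[Tuple[int, str]]) -> Optional[str]:
    """Pick the candidate that is assigned a constant in the most blocks.

    assignments: (block serial, variable name) for each constant assignment
    observed in flattened blocks. The frequency heuristic is an assumption.
    """
    counts = Counter(var for _, var in assignments)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


candidates = [(3, "var_1C"), (4, "var_1C"), (5, "var_20"), (6, "var_1C")]
chosen = most_likely_variable(candidates)
```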
  21. Implementation for Various Jump Cases: The Originals
      • (1) goto case for normal block
      • (2) conditional jump case for flattened if-statement block
      [Diagram labels: flattened block(s) (dispatcher predecessor), from conditional block, to control flow dispatcher, dispatcher predecessor, nonJcc, endsWithJCC, false, true, flattened blocks]
  22. Implementation for Various Jump Cases: The Additions
      • (3) goto N predecessors case
      • (4) (2)+(3) combination case
      [Diagram labels: dispatcher predecessor, pred 0, pred 1, ..., pred N, nonJcc, endsWithJCC, false, true]
  23. Implementation for Various Jump Cases: The Additions (Cont.)
      • (5) Block comparison variables are assigned in the first blocks
        – The modified code reconnects first blocks as successors of the flattened block
      • I saw up to three assignments of the case in one function
      [Figure caption: block #1 will be the successor of block #7]
  24. Evaluation on IDA 7.2
      • Tested ANEL samples
        – 5.4.1 payload [1]
          – 3d2b3c9f50ed36bef90139e6dd250f140c373664984b97a97a5a70333387d18d
        – 5.5.0 rev1 loader DLL [6]
          – f333358850d641653ea2d6b58b921870125af1fe77268a6fdfeda3e7e0fb636d
      • The modified tool could deobfuscate 92% of the obfuscated functions that we encountered in the 5.4.1 payload
  25. Evaluation on IDA 7.2 (Cont.)
      • The causes of the failures
        – The next block number guessing algorithm failed
        – Propagations of opaque predicates deobfuscation failed
        – No method to handle a conditional jump of a dispatcher predecessor with multiple predecessors
      [Slide annotations: resolved in IDA 7.3; resolved in this case]
  26. IDA 7.3: Propagation of Opaque Predicates Deobfuscation
      [Figure labels: 7.2 and 7.3 output compared; aliased stack slots; always 0xC1A18C30 (signed)]
  27. IDA 7.3: Handling a Conditional Jump of a Dispatcher Predecessor
      • All jump cases (1)-(5) can be conditional
        – (2)-(4) cases require an mblock_t duplication
      • IDA 7.3 provides the option
        – clear the flag MBA2_NO_DUP_CALLS
        – use mbl_array_t::insert_block API then copy instructions and other information
        – adjust destinations of the blocks passing control to the exit block, whose block type is BLT_STOP
  28. Conditional Jump Case (4)
      • not seen in the tested samples :-)
      [Diagram label: preds can be conditional too]
  29. Workaround for Control Flow Unflattening Failure
      • Running the plugin with 0xdead deobfuscates only opaque predicates in the currently selected function
      idc.load_and_run_plugin("HexRaysDeob", 0xdead)
      idc.load_and_run_plugin("HexRaysDeob", 0xf001)
  30. Wrap-up
      • The compiler-level obfuscations are starting to be observed in the wild
        – Automated deobfuscation is needed
      • The modified code is available publicly [7]
        – 1570 insertions(+), 450 deletions(-)
        – It works for almost every obfuscated function of APT10 ANEL on IDA 7.3
  31. Acknowledgement
      • Hex-Rays
      • Rolf Rolles
      • TAU members
        – especially Jared Myers and Brian Baskin
  32. References
      • [1] https://www.fireeye.com/blog/threat-research/2018/09/apt10-targeting-japanese-corporations-using-updated-ttps.html
      • [2] https://jsac.jpcert.or.jp/archive/2019/pdf/JSAC2019_6_tamada_jp.pdf
      • [3] http://www.hexblog.com/?p=1248
      • [4] https://github.com/RolfRolles/HexRaysDeob
      • [5] https://www.hexblog.com/?p=1232
      • [6] https://www.secureworks.jp/resources/at-bronze-riverside-updates-anel-malware
      • [7] https://github.com/carbonblack/HexRaysDeob
      • [8] https://www.carbonblack.com/2019/02/25/defeating-compiler-level-obfuscations-used-in-apt10-malware/
  33. Questions?
      • [Q1] What’s the obfuscating compiler?
        – [A1] Not sure, but it may be Obfuscator-LLVM
      • [Q2] Does this tool work for other samples with similar obfuscations?
        – [A2] Yes, but only if
          – Q1 is resolved
          – the compiler algorithm and implementation have been thoroughly investigated