The XZ Backdoor Story

Thomas Roccia
September 08, 2024

This talk was presented at Defcon 32.

https://defcon.org/html/defcon-32/dc-32-speakers.html

On Friday, 29 March 2024, at exactly 08:51:26 -0700, the oss-security mailing list received a message from Andres Freund, a software engineer at Microsoft, stating that he had discovered a backdoor in upstream xz/liblzma that could compromise SSH servers. The open-source project XZ, specifically its liblzma library, had been compromised by a mysterious maintainer named Jia Tan, putting the entire internet at risk. Fortunately, this discovery helped us avoid the worst.

But what happened? How long has this rogue maintainer been part of the project? Who is Jia Tan? Was he involved in other projects? How does the backdoor work? And what should we learn from this?

These are questions we will attempt to answer. First, we will discuss the discovery, which is so riddled with coincidences and chance that it's hard not to think about all the ones we've missed. Then, we'll examine the process itself, from gaining trust within the project to deploying the backdoor, dissecting the operating methods and the main protagonists. We will also dive into the technical details, explaining how the backdoor is deployed and how it can be exploited.

The XZ backdoor is not just an incredible undercover operation but also a gigantic puzzle to solve. Beyond the technical background, there is a story to tell here, so that we can learn from what went wrong and identify what we could improve.

Transcript

  1. What we will cover

    How the backdoor was found
    The long-term operation and timeline
    The technical details and assumptions
    What it means for the security industry
  2. Date: Fri, 29 Mar 2024 08:51:26 -0700

    From: Andres Freund <[email protected]>
    To: [email protected]
    Subject: backdoor in upstream xz/liblzma leading to ssh server compromise
  3. SSH logins consuming substantial CPU

    500 ms delay identified with the new package.
    CPU time traced to liblzma (part of the XZ package).
    [Figure: SSH login timing without the package vs. with the package]
  4. A Bunch of Coincidences

    "Saw sshd processes were using a surprising amount of CPU, despite immediately failing because of wrong usernames etc."
    "Profiled sshd, showing lots of cpu time in liblzma, with perf unable to attribute it to a symbol."
    "I chose to make all of them use -fno-omit-frame-pointer."
    "Valgrind would not have complained about the payload without -fno-omit-frame-pointer."
    "Additionally, I chose to use Debian unstable to find possible portability problems earlier."
    "Without having seen the odd complaints in valgrind, I don't think I would have looked deeply enough when seeing the high cpu in sshd below getcpuid()."
  5. XZ Utils Package

    Free software for lossless data compression (lzma, xz), maintained by Lasse Collin.
    Backdoor found in liblzma (xz 5.6.0, 5.6.1), introduced by user "Jia Tan" in Feb 2024.
    Mostly affected development versions, which were not widely deployed.
    Assigned CVE-2024-3094.
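
The two affected releases named above lend themselves to a quick version check. The following is a minimal sketch only: the function names are hypothetical, and it assumes the input is an upstream-style version banner such as `xz (XZ Utils) 5.6.1`. Distribution package versions can carry extra suffixes or epochs, so a hit here is a prompt to investigate, not a verdict.

```python
import re

# Versions named on the slide as carrying the backdoor (CVE-2024-3094).
AFFECTED_VERSIONS = {(5, 6, 0), (5, 6, 1)}


def parse_xz_version(text):
    """Extract the first X.Y.Z version number found in text, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_affected(text):
    """Return True if text reports xz/liblzma 5.6.0 or 5.6.1."""
    return parse_xz_version(text) in AFFECTED_VERSIONS
```

The banner text could be obtained by running `xz --version` and passing its output to `is_affected`.
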
  6. Why Target XZ?

    XZ Utils is widely used for compression in many Linux distributions and open-source projects.
    XZ Utils has been a trusted component in the open-source ecosystem for years.
    OpenSSH does not use liblzma directly, but liblzma is linked into sshd via libsystemd.
  7. Timeline (2021 to 2024)

    Phase 1, initial involvement of Jia Tan: October 29, 2021 - June 29, 2022
    Phase 2, transition of maintainership: September 27, 2022 - March 18, 2023
    Phase 3, preparation for the attack: March 20, 2023 - January 19, 2024
    Phase 4, backdoor insertion and distribution: February 23, 2024 - March 28, 2024
    Phase 5, discovery and response: March 28, 2024 - March 30, 2024

    2021-10-29: Jia Tan sends first patch to the xz-devel mailing list.
    2022-04-22/2022-06-22: Multiple pressure emails from Jigar Kumar and Dennis Ens for changing the main maintainer of XZ.
    2022-06-29: Lasse Collin mentions Jia Tan as a co-maintainer already.
    2022-09-27: Jia Tan gives release summary for 5.4.0 version.
    2022-10-28: Jia Tan added to the Tukaani organization on GitHub.
    2022-11-30: Lasse Collin adds Jia Tan in the bug report email.
    2022-12-30: Jia Tan merges first batch of commits directly into the xz repo.
    2023-03-20: Jia Tan updates Google oss-fuzz to send bugs to his email.
    2023-06-22: Hans Jansen sends patches for GNU indirect function feature.
    2023-07-07: Jia Tan disables ifunc support during oss-fuzz builds.
    2024-01-19: Jia Tan moves the project website to GitHub pages.
    2024-02-24: Jia Tan merges hidden backdoor binary code in test files.
    2024-02-24: Jia Tan tags and builds v5.6.0 with the backdoor.
    2024-02-28: Jia Tan breaks landlock detection.
    2024-03-09: Jia Tan commits updated backdoor (Valgrind fix) files and tags v5.6.1.
    2024-03-28: Andres Freund discovers the bug and privately notifies Debian and distros@openwall.
    2024-03-29: Andres Freund posts backdoor warning to oss-security@openwall list. Internet is on fire!
  8. Exploiting the human weakness

    [Email excerpts: 2022-04-28 - Jigar Kumar; 2022-05-19 - Dennis Ens; 2022-05-19 - Lasse Collin; 2022-05-27 - Jigar Kumar; 2022-06-07 - Jigar Kumar]
  9. Exploiting the human weakness

    [Email excerpts: 2022-06-08 - Lasse Collin; 2022-06-14 - Jigar Kumar; 2022-06-21 - Dennis Ens; 2022-06-22 - Jigar Kumar; 2022-06-29 - Lasse Collin]
  10. The Setup

    Commit of 2 binary blobs by JiaT75: tests/files/bad-3-corrupt_lzma2.xz and tests/files/good-large_compressed.lzma.
    24 Feb 2024: xz v5.6.0 is tagged and built with an added file, ‘build-to-host.m4’, in the tarball; it delivers the backdoor when the package is built.
  11. Stage 2 - Injected Script

    86 F9 5A F7 2E 68 6A BC
    Manipulate the input data stream.
    Stage 3 and backdoor extraction.
  12. Stage 2 - Injected Script

    Decompress the second malicious file (good-large_compressed.lzma), which contains stage 3 and the binary backdoor.
    Remove junk data and unused bytes.
    Apply a custom substitution cipher.
    The deciphered data is decompressed to obtain the stage 3 script.
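
To make the "custom substitution cipher" step concrete, here is a minimal sketch of a byte-wise substitution and its inverse. The table below is a hypothetical rotation chosen purely for illustration; it is not the mapping used by the injected script, and the function names are assumptions.

```python
def build_table(shift):
    """Build a 256-entry table mapping each byte b to (b + shift) mod 256.

    Hypothetical table for illustration, not the backdoor's real mapping.
    """
    return bytes((b + shift) % 256 for b in range(256))


def invert_table(table):
    """Return the table that undoes the given substitution."""
    inverse = bytearray(256)
    for plain, cipher in enumerate(table):
        inverse[cipher] = plain
    return bytes(inverse)


def substitute(data, table):
    """Replace every byte of data by its entry in the 256-byte table."""
    return data.translate(table)


table = build_table(0x33)
scrambled = substitute(b"stage 3 script", table)
recovered = substitute(scrambled, invert_table(table))
```

A substitution like this hides the payload from casual inspection and from naive signature scans of the test files, but it offers no cryptographic protection once the table is known.
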
  13. Stage 3 - Setup for compilation

    Check IFUNC support (remember the previous commit): verifies the CRC source files (crc64_fast.c and crc32_fast.c) for the required IFUNC code.
    Multiple checks: ensures an x86-64 Linux system; confirms shared library support; validates the GCC compiler and GNU ld; checks for the two previous test files; ensures a Position Independent Code (PIC) build.
    Extracting the backdoor from good-large_compressed.lzma: embeds the backdoor into the object file liblzma_la-crc64_fast.o.
    Modifies the is_arch_extension_supported function in crc_x86_clmul.h: replaces __get_cpuid with _get_cpuid, removing one underscore.
  14. Stage 3 - Backdoor Extraction

    Decompress good-large_compressed.lzma.
    m=256: defines the modulus value as 256.
    t array: creates a translation table mapping characters to their corresponding ASCII values.
    c array: initializes the c array with values transformed by a simple linear formula.
    Permutes the c array using a method similar to the key scheduling algorithm (KSA) in RC4, where elements of c are swapped based on their values.
    Each character from the input is transformed using the permuted c array, similar to how RC4 generates pseudo-random bytes (PRGA) and uses them to decrypt the data.
    Decompress the decrypted part, parse the data, and store the output in liblzma_la-crc64_fast.o.
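
Since the extraction step is described as similar to RC4, a sketch of textbook RC4 may help in following the KSA and PRGA terminology. This is the standard algorithm, not the backdoor's variant (which, per the slide, initializes its array with a linear formula), and the key used here is a placeholder.

```python
def rc4_ksa(key):
    """Key scheduling: permute the 256-entry state array using the key."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state


def rc4_crypt(key, data):
    """Pseudo-random generation: XOR each data byte with a keystream byte.

    Encryption and decryption are the same operation.
    """
    state = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


ciphertext = rc4_crypt(b"Key", b"Plaintext")
plaintext = rc4_crypt(b"Key", ciphertext)
```

The point of the comparison is structural: a swap-based permutation of a 256-entry array followed by a per-byte transformation of the input.
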
  15. Compilation Process

    ./configure: build-to-host.m4 and the subsequent scripts are executed.
    make
    Run all the checks.
    Create the Makefile if the previous conditions are met.
    Back up the original file .libs/liblzma_la-crc64_fast.o.
    Specific compilation flags: -Wl,-z,now. The ifunc resolvers run at startup, and the backdoor gets called during this.
    Modification of the files crc64_fast.c and crc32_fast.c.
    Compilation, linking-stage manipulations, and cleanup.
    The malicious liblzma_la-crc64_fast.o is incorporated into the compiled liblzma.
  16. The Backdoor - IFUNC

    IFUNC allows for dynamic resolution of function implementations at runtime.
    The backdoor hijacks the ifunc resolver to allow the modification of the GOT and PLT.
  17. The Backdoor - dl_audit

    dl_audit is a feature in dynamic linkers that notifies custom libraries when events like symbol resolution occur. The backdoor uses it to bypass RELRO and to intercept and redirect function calls to its malicious code.
    Check the loaded target process (sshd) and its shared libraries (liblzma, libcrypto) via the dynamic linker (ld).
    Set up the dl_audit structure and activate the audit mechanism.
    During dynamic linking, resolve and bind symbols (function addresses).
    The _dl_audit_symbind function is called by the linker whenever a symbol needs binding.
    If conditions are met, _dl_audit_symbind calls the symbind function, which is set to the backdoor's install_hook function.
  18. The Backdoor - GOT Redirect

    The Global Offset Table (GOT) is a section of a program's memory that lets code compiled as an ELF file run correctly regardless of where it is loaded.
    It maps symbols to their corresponding absolute memory addresses to support Position Independent Code (PIC).
    Within the GOT, symbol offsets, including the one for the cpuid wrapper, are stored.
    At runtime, the backdoor manipulates these pointers, redirecting calls to the main malware function while making it appear as though cpuid is being called.
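
The effect of a GOT redirect can be pictured with a toy model. The sketch below is a conceptual illustration in plain Python and involves no real ELF structures: the dictionary stands in for the table, the class and function names are hypothetical, and overwriting an entry stands in for the pointer manipulation.

```python
def cpuid_wrapper():
    """Stand-in for the legitimate function the caller expects."""
    return "cpuid"


def hooked_entry():
    """Stand-in for the code that calls end up in after the redirect."""
    return "backdoor entry"


class ToyProcess:
    """Toy model: every call goes through a symbol-to-target table."""

    def __init__(self):
        self.got = {"cpuid_wrapper": cpuid_wrapper}

    def call(self, symbol):
        # The call site only names the symbol; the table decides the target.
        return self.got[symbol]()


proc = ToyProcess()
before = proc.call("cpuid_wrapper")
proc.got["cpuid_wrapper"] = hooked_entry  # redirect: same symbol, new target
after = proc.call("cpuid_wrapper")
```

The call site never changes, which is why such a redirect is hard to spot by reading the calling code alone.
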
  19. The Backdoor - Strings Obfuscation

    A trie is a data structure that stores strings as keys.
    Instead of comparing strings directly, the backdoor uses trie lookups.
    Matches return constant numbers: 0x1d0 for RSA_public_decrypt, 0x300 for the ELF header.
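
A trie lookup that returns constants can be sketched as follows. This is an illustrative structure, not the backdoor's encoding: the class names are hypothetical, and using the ELF magic bytes `\x7fELF` as the key for "ELF header" is an assumption made for the example. Only the two constants come from the slide.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # next byte -> TrieNode
        self.value = None   # constant returned when a key ends here


class StringTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        """Store value at the node reached by walking the bytes of key."""
        node = self.root
        for byte in key:
            node = node.children.setdefault(byte, TrieNode())
        node.value = value

    def lookup(self, key):
        """Return the constant stored for key, or None if key is unknown."""
        node = self.root
        for byte in key:
            node = node.children.get(byte)
            if node is None:
                return None
        return node.value


trie = StringTrie()
trie.insert(b"RSA_public_decrypt", 0x1D0)
trie.insert(b"\x7fELF", 0x300)  # assumed key for the "ELF header" entry
```

Because the binary only needs the branching structure and the constants, the strings themselves never have to appear in readable form, which defeats simple string searches.
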
  20. The Backdoor - Hooked Functions

    Retrieve obfuscated strings from the trie.
    RSA_public_decrypt and RSA_get0_key are used in the latest SSH server version and are called when an RSA certificate is configured as the SSH authentication method.
    EVP_PKEY_set1_RSA is not used in the latest SSH version.
  21. Other Checks & Evasion

    Verifies the current process: /usr/bin/sshd
    Checks for a potential kill switch: yolAbejyiejuvnup=Evjtgvsh5okmkAvj
    Abuses the legitimate lzma_alloc for memory allocation.
    Hooks the logging function to hide unauthorized SSH connections.
    Includes several other advanced functionalities (encryption, anti-replay, a tiny embedded disassembler...).
  22. How does it really work?

    The backdoor embeds 3 main functions that can be triggered. The command is calculated with magic numbers: (magic1 * magic2) + magic3 = CMD.
    SSH auth bypass, root login (cmd 0 ‘mm_answer_keyallowed’ and cmd 1 ‘mm_answer_authpassword’)
    RCE via systemd (cmd 2)
    Closes the pre-auth session (cmd 3)
    Hardcoded ED448 public key for signature validation and decrypting the payload: 0a 31 fd 3b 2f 1f c6 92 92 68 32 52 c8 c1 ac 28 34 d1 f2 c9 75 c4 76 5e b1 f6 88 58 88 93 3e 48 10 0c b0 6c 3a be 14 ee 89 55 d2 45 00 c7 7f 6e 20 d3 2c 60 2b 2c 6d 31 00
    The backdoor is activated by connecting with an SSH certificate containing a payload in the Certificate Authority signing key's modulus (N) value.
    Flow: liblzma loaded, then CMD=0/1 (SSH auth bypass), CMD=2 (remote code execution), or CMD=3 (close session).
  23. Certificate Payload

    Flow: initiate an SSH connection to the backdoored SSH server and get the server public key. A malicious RSA key is used for SSH auth; the attacker's ED448 public key is hidden in the backdoor.
    Fields: HEADER, EXPONENT E, MODULUS N, PAYLOAD, SIGNATURE.
    Header, IV (A, B, C): 4 bytes for magic1 (uint32), 4 bytes for magic2 (uint32), 8 bytes for magic3 (uint64). Example: uint32(0x1234 * 0x5678) + uint64(0xfffffffff9d9ffa2) = 2 (RCE command).
    Signature payload: CMD, 5-byte flag, certificate payload, pub key SHA256 for the ED448 signature.
    ED448 public key: 0a 31 fd 3b 2f 1f c6 92 92 68 32 52 c8 c1 ac 28 34 d1 f2 c9 75 c4 76 5e b1 f6 88 58 88 93 3e 48 10 0c b0 6c 3a be 14 ee 89 55 d2 45 00 c7 7f 6e 20 d3 2c 60 2b 2c 6d 31 00
    Decrypt the payload with ChaCha20, using the first 32 bytes of the ED448 key and the IV from the payload.
    Verify the payload signature, which is generated with the CMD value, the 5-byte flag, the payload, and the SHA256.
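
The command arithmetic can be checked with a few lines of Python. This sketch makes several assumptions: the 16-byte header is read as little-endian, the sum wraps modulo 2^64, and magic2 is 0x5678, the value for which the sum wraps around to 2 with the magic3 shown on the slide. The function names are hypothetical.

```python
import struct

MASK64 = (1 << 64) - 1


def compute_cmd(magic1, magic2, magic3):
    """(magic1 * magic2) + magic3, wrapped to 64 bits (assumed width)."""
    return (magic1 * magic2 + magic3) & MASK64


def parse_header(header):
    """Read magic1 (uint32), magic2 (uint32), magic3 (uint64) and derive CMD.

    Little-endian byte order is an assumption of this sketch.
    """
    magic1, magic2, magic3 = struct.unpack("<IIQ", header[:16])
    return compute_cmd(magic1, magic2, magic3)


header = struct.pack("<IIQ", 0x1234, 0x5678, 0xFFFFFFFFF9D9FFA2)
cmd = parse_header(header)  # 0x1234 * 0x5678 = 0x06260060, so the sum wraps to 2
```

Here magic3 is the 64-bit two's complement of 0x0626005E, so adding it to the product 0x06260060 leaves 2, the RCE command.
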
  24. Build-to-host.m4 Differences

    xz/liblzma v5.6.0: bytes comment 86 F9 5A F7 2E 68 6A BC; no OS version check; no extension mechanism; unused variables present in stage 3.
    xz/liblzma v5.6.1: bytes comment E5 55 89 B7 24 04 D8 17; OS version check ‘[ ! $(uname) = "Linux" ] && exit 0’; extension mechanism? Signature ‘filename:offset:sig’; unused variables present in stage 3.
    Versioning? Debugging? Modularity? Targeting a specific environment?
  25. Who is behind it?

    Jia (Cheong) Tan: JiaT75 | [email protected]
    Jigar Kumar: [email protected]
    Dennis Ens: [email protected]
    Hans Jansen: [email protected]
    krygorin4545: [email protected]
    [email protected]
    Annotations: "One day before the first email sent on 2022-04-28"; "Email to Debian bug 2024-03-26".
  26. What can we do?

    Strict contributor verification: implement stringent verification processes for contributors, especially for those with maintainer privileges.
    Ensure builds are reproducible: the same source code produces the same binary every time. Ensure that the built binaries (tarballs) match the source code in the repository.
    Audit hooks and RELRO protections: review and restrict the use of audit hooks and dynamic linking features.
    Improve dependency management: regularly audit dependencies, especially those that are maintained by hobbyists and critical to infrastructure.
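
Checking that a release tarball matches the repository can be sketched as a file-by-file hash comparison. This is a minimal sketch under stated assumptions: the function name is hypothetical, the tarball is assumed to wrap its files in one top-level directory (hence `strip_components=1`), and the repository is assumed to be checked out at the matching tag.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_bytes(data):
    return hashlib.sha256(data).hexdigest()


def diff_tarball_against_tree(tarball_path, tree_path, strip_components=1):
    """Return (extra, changed): files only in the tarball, and files whose
    content differs from the checked-out source tree."""
    tree = Path(tree_path)
    extra, changed = [], []
    with tarfile.open(tarball_path) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            # Drop the leading "name-version/" directory of the tarball.
            parts = Path(member.name).parts[strip_components:]
            if not parts:
                continue
            relative = Path(*parts)
            data = archive.extractfile(member).read()
            candidate = tree / relative
            if not candidate.is_file():
                extra.append(relative.as_posix())
            elif sha256_bytes(candidate.read_bytes()) != sha256_bytes(data):
                changed.append(relative.as_posix())
    return sorted(extra), sorted(changed)
```

Release tarballs often legitimately contain generated files that are absent from the repository, so the `extra` list is a starting point for review, not an automatic verdict. In this case, build-to-host.m4 was exactly such a tarball-only file.
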
  27. What does it mean for the security community and beyond?

    The entire industry relies on open-source tools, often maintained by hobbyists.
    The discovery of the XZ backdoor was a mix of luck, coincidence, and expertise.
    It redefines what we call a sophisticated attacker.
    There is no easy solution or protection, except knowledge.
  28. https://www.openwall.com/lists/oss-security/2024/03/29/4

    https://tukaani.org/xz-backdoor/
    https://gynvael.coldwind.pl/?lang=en&id=782
    https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504
    https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b
    https://github.com/amlweems/xzbot
    https://research.swtch.com/xz-script
    https://github.com/0xlane/xz-cve-2024-3094
    https://www.wiz.io/blog/cve-2024-3094-critical-rce-vulnerability-found-in-xz-utils
    https://isc.sans.edu/diary/The+amazingly+scary+xz+sshd+backdoor/30802
    https://medium.com/@knownsec404team/techniques-learned-from-the-xz-backdoor-74b0a8d45c30
    https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01
    https://lwn.net/Articles/967192/
    https://research.meekolab.com/dissecting-the-xz-utils-backdoor
    https://securelist.com/xz-backdoor-story-part-1/112354/
    https://securelist.com/xz-backdoor-story-part-2-social-engineering/112476/
    https://securelist.com/xz-backdoor-part-3-hooking-ssh/113007/
    https://github.com/blasty/JiaTansSSHAgent
    https://github.com/binarly-io/binary-risk-intelligence/tree/master/xz-backdoor
    https://boehs.org/node/everything-i-know-about-the-xz-backdoor
    https://github.com/karcherm/xz-malware
    https://x.com/bl4sty/status/1776727891910299888
    https://github.com/ald3ns/xz-backdoor-github-analysis
    https://x.com/fr0gger_/status/1774342248437813525
    https://x.com/fr0gger_/status/1775759514249445565

    Acknowledgments: @AndresFreundTec, @gynvael, smx-smx, @FiloSottile, @amlweems, Russ Cox, 0xlane, @AmitaiCo, @bl4sty, Sam James, Lasse Collin, @birchb0y, @nugxperience. And all the others I forget...