Gist with XZ Backdoor analysis
quality 7/10 · good
0 net
[WIP] XZ Backdoor Analysis and symbol mapping · GitHub Skip to content --> Search Gists Search Gists Sign in Sign up You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert {{ message }} Instantly share code, notes, and snippets. smx-smx / XZ Backdoor Analysis Last active February 24, 2026 09:30 Show Gist options Download ZIP Star 221 ( 221 ) You must be signed in to star a gist Fork 10 ( 10 ) You must be signed in to fork a gist Embed Select an option Embed Embed this gist in your website. Share Copy sharable link for this gist. Clone via HTTPS Clone using the web URL. No results found Learn more about clone URLs Clone this repository at <script src="https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504.js"></script> Save smx-smx/a6112d54777845d389bd7126d6e9f504 to your computer and use it in GitHub Desktop. Embed Select an option Embed Embed this gist in your website. Share Copy sharable link for this gist. Clone via HTTPS Clone using the web URL. No results found Learn more about clone URLs Clone this repository at <script src="https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504.js"></script> Save smx-smx/a6112d54777845d389bd7126d6e9f504 to your computer and use it in GitHub Desktop. Download ZIP [WIP] XZ Backdoor Analysis and symbol mapping Raw backdoor_analysis.md Discord Room for discussion https://discord.com/invite/maFYmgQkYH Github repository: https://github.com/smx-smx/xzre Init routines Llzma_delta_props_decoder -> backdoor_ctx_save Llzma_block_param_encoder_0 -> backdoor_init Llzma_delta_props_encoder -> backdoor_init_stage2 Prefix Trie ( https://social.hackerspace.pl/@q3k/112184695043115759 ) Llzip_decode_1 -> table1 Lcrc64_clmul_1 -> table2 Llz_stream_decode -> count_1_bits Lsimple_coder_update_0 -> table_get Retrieves the index of the encoded string given the plaintext string in memory Lcrc_init_0 -> import_lookup .Lcrc64_generic.0 -> import_lookup_ex Anti RE and x64 code Dasm Llzma_block_buffer_encode_0 -> check_software_breakpoint Lx86_code_part_0 -> code_dasm Llzma_index_iter_rewind_cold -> check_return_address Checks if the return address has been tampered with. This function is called at the beginning of a "protected" function. If the check fails, the function returns early without doing anything Llzma_delta_decoder_init_part_0 -> backdoor_vtbl_init It sets up a vtable with core functions used by the backdoor Lstream_decoder_memconfig_part_1 -> get_lzma_allocator Llzma_simple_props_encode_1 -> j_tls_get_addr Llzma_block_uncomp_encode_0 -> rodata_ptr_offset Llzma12_coder_1 -> global_ctx ELF parsing Llzma_filter_decoder_is_supported.part.0 -> parse_elf_invoke Lmicrolzma_encoder_init_1 -> parse_elf_init Lget_literal_price_part_0 -> parse_elf Llzma_stream_header_encode_part_0 -> get_ehdr_address Lparse_bcj_0 -> process_elf_seg Llzma_simple_props_size_part_0 -> is_gnu_relro Stealthy ELF magic verification // locate elf header while ( 1 ) { if ( ( unsigned int ) table_get ( ehdr , 0LL ) == STR__ELF ) // 0x300 break ; // found ehdr -= 64 ; // backtrack and try again if ( ehdr == start_pointer ) goto not_found ; } Llzma_stream_flags_compare_1 -> get_rodata_ptr Verified or Suspected function hooking Llzma_index_memusage_0 -> apply_entries Llzma_check_init_part_0 -> apply_one_entry Lrc_read_init_part_0 -> apply_one_entry_internal Llzma_lzma_optimum_fast_0 -> install_entries Llzip_decoder_memconfig_part_0 -> installed_func_0 Llzma_index_prealloc_0 -> RSA_public_decrypt GOT hook/detour Llzma_index_stream_size_1 -> check_special_rsa_key -> (thanks q3k ) Called from Llzma_index_prealloc_0 , it checks if the supplied RSA key is the special key to bypass the normal authentication flow Lindex_decode_1 -> installed_func_2 Lindex_encode_1 -> installed_func_3 Llzma2_decoder_end_1 -> apply_one_entry_ex Llzma2_encoder_init.1 -> apply_method_1 Llzma_memlimit_get_1 -> apply_method_2 lzma allocator / call hiding Lstream_decoder_mt_end_0 -> get_lzma_allocator_addr Linit_pric_table_part_1 -> fake_lzma_allocator Lstream_decode_1 -> fake_lzma_free core functionality Llzma_delta_props_encode_part_0 -> resolve_imports (including system() ) Llzma_index_stream_flags_0 -> process_shared_libraries Reads the list of loaded libraries through _r_debug->r_map , and calls process_shared_libraries_map to traverse it Llzma_index_encoder_init_1 -> process_shared_libraries_map Traverses the list of loaded libraries, looking for specific libraries func @0x7620 : It does indirect calls on the vtable configured by backdoor_vtbl_init , and is called by the RSA_public_decrypt hook (func#1) upon certain conditions are met Software Breakpoint check, method 1 This method checks if the instruction endbr64 , which is always present at the beginning of every function in the malware, is overwritten. GDB would typically do this when inserting a software breakpoint /*** address: 0xAB0 ***/ __int64 check_software_breakpoint ( _DWORD * code_addr , __int64 a2 , int a3 ) { unsigned int v4 ; v4 = 0 ; // [for a3=0xe230], true when *v = 0xfa1e0ff3 (aka endbr64) if ( a2 - code_addr > 3 ) return * code_addr + ( a3 | 0x5E20000 ) == 0xF223 ; // 5E2E230 return v4 ; } Function backdoor_init (0xA784) __int64 backdoor_init ( rootkit_ctx * ctx , DWORD * prev_got_ptr ) { _DWORD * v2 ; __int64 runtime_offset ; bool is_cpuid_got_zero ; void * cpuid_got_ptr ; __int64 got_value ; _QWORD * cpuid_got_ptr_1 ; ctx -> self = ctx ; // store data before overwrite backdoor_ctx_save ( ctx ); ctx -> prev_got_ptr = ctx -> got_ptr ; runtime_offset = ctx -> head - ctx -> self ; ctx -> runtime_offset = runtime_offset ; is_cpuid_got_zero = ( char * ) * ( & Llzma_block_buffer_decode_0 + 1 ) + runtime_offset == 0LL ; cpuid_got_ptr = ( char * ) * ( & Llzma_block_buffer_decode_0 + 1 ) + runtime_offset ; ctx -> got_ptr = cpuid_got_ptr ; if ( ! is_cpuid_got_zero ) { cpuid_got_ptr_1 = cpuid_got_ptr ; got_value = * ( QWORD * ) cpuid_got_ptr ; // replace with Llzma_delta_props_encoder (backdoor_init_stage2) * ( QWORD * ) cpuid_got_ptr = ( char * ) * ( & Llzma_block_buffer_decode_0 + 2 ) + runtime_offset ; // this calls Llzma_delta_props_encoder due to the GOT overwrite runtime_offset = cpuid (( unsigned int ) ctx , prev_got_ptr , cpuid_got_ptr , & Llzma_block_buffer_decode_0 , v2 ); // restore original * cpuid_got_ptr_1 = got_value ; } return runtime_offset ; } Function Name matching (function 0x28C0) str_id = table_get ( a6 , 0LL ); ... if ( str_id == STR_RSA_public_decrypt_ && v11 ) ... else if ( v13 && str_id == STR_EVP_PKEY_set__RSA_ ) ... else if ( str_id != STR_RSA_get__key_ || ! v17 ) Hidden calls (via lzma_alloc ) lzma_alloc has the following prototype: extern void * lzma_alloc ( size_t size , const lzma_allocator * allocator ) The malware implements a custom allocator, which is obtained from get_lzma_allocator @ 0x4050 void * get_lzma_allocator () { return get_lzma_allocator_addr () + 8 ; } char * get_lzma_allocator_addr () { unsigned int i ; char * mem ; // Llookup_filter_part_0 holds the relative offset of `_Ldecoder_1` - 180h (0xC930) // by adding 0x180, it gets to 0xCAB0 (Lx86_coder_destroy), Since the caller adds +8, we get to 0xCAB8, which is the lzma_allocator itself mem = ( char * ) Llookup_filter_part_0 ; for ( i = 0 ; i <= 0xB ; ++ i ) mem += 32 ; return mem ; } The interface for lzma_allocator can be viewed for example here: https://github.com/frida/xz/blob/e70f5800ab5001c9509d374dbf3e7e6b866c43fe/src/liblzma/api/lzma/base.h#L378-L440 Therefore, the allocator is Linit_pric_table_part_1 and free is Lstream_decode_1 NOTE: the function used for alloc is very likely import_lookup_ex , which turns lzma_alloc into an import resolution function. this is used a lot in resolve_imports , e.g.: system_func = lzma_alloc ( STR_system_ , lzma_allocator ); ctx -> system = system_func ; if ( system_func ) ++ ctx -> num_imports ; shutdown_func = lzma_alloc ( STR_shutdown_ , lzma_allocator ); ctx -> shutdown = shutdown_func ; if ( shutdown_func ) ++ ctx -> num_imports ; The third lzma_allocator field, opaque , is abused to pass information about the loaded ELF file to the "fake allocator" function. This is highlighted quite well by function Llzma_index_buffer_encode_0 : __int64 Llzma_index_buffer_encode_0 ( Elf64_Ehdr * * p_elf , struct_elf_info * elf_info , struct_ctx * ctx ) { _QWORD * lzma_allocator ; __int64 result ; __int64 fn_read ; __int64 fn_errno_location ; lzma_allocator = get_lzma_allocator (); result = parse_elf ( * p_elf , elf_info ); // reads elf into elf_info if ( ( _DWORD ) result ) { lzma_allocator [ 2 ] = elf_info ; // set opaque field to the parsed elf info fn_read = lzma_alloc ( STR_read_ , lzma_allocator ); ctx -> fn_read = fn_read ; if ( fn_read ) ++ ctx -> num_imports ; fn_errno_location = lzma_alloc ( STR___errno_location_ , lzma_allocator ); ctx -> fn_errno_location = fn_errno_location ; if ( fn_errno_location ) ++ ctx -> num_imports ; return ctx -> num_imports == 2 ; // true if we found both imports } return result ; } Note how, instead of size , the malware passes an EncodedStringID instead Dynamic analysis Analyzing the initialization routine Replace the endbr64 in get_cpuid with a jmp . ("\xeb\xfe") root@debian: ~ # cat /usr/lib/x86_64-linux-gnu/liblzma.so.5.6.1 > liblzma.so.5.6.1 root@debian: ~ # perl -pe 's/\xF3\x0F\x1E\xFA\x55\x48\x89\xF5\x4C\x89\xCE/\xEB\xFE\x90\x90\x55\x48\x89\xF5\x4C\x89\xCE/g' -i liblzma.so.5.6.1 Force sshd to use the modified library with LD_PRELOAD # env -i LC_LANG=C LD_PRELOAD=$PWD/liblzma.so.5.6.1 /usr/sbin/sshd -h NOTE: anarazel recommends using LD_LIBRARY_PATH with a symlink instead, since LD_PRELOAD changes the initialization order and could interfere with the normal flow of the malware 2b. or use this gdbinit file to do it all at once # cat gdbinit set confirm off unset env # # comment this out if you don't want to debug the initialization code # # (or use LD_LIBRARY_PATH instead) set env LD_PRELOAD=/root/sshd/liblzma.so.5.6.1 set env LANG=C file /usr/sbin/sshd # # start sshd on port 2022 set args -p 2022 set disassembly-flavor intel set confirm on set startup-with-shell off show env show args # gdb -x gdbinit (gdb) r Starting program: /usr/sbin/sshd -p 222 ^C < -- send CTRL-C Program received signal SIGINT, Interrupt. 0x00007ffff7f8a7f0 in ?? () Attach to the frozen process with your favourite debugger ( gdb attach pid ) (gdb) bt #0 0x00007f8cb3b067f0 in ?? () from /root/sshd/liblzma.so.5.6.1 #1 0x00007f8cb3b08c29 in lzma_crc32 () from /root/sshd/liblzma.so.5.6.1 #2 0x00007f8cb3b4ffab in elf_machine_rela (skip_ifunc=, reloc_addr_arg=0x7f8cb3b3dda0 , version=, sym=0x7f8cb3b03018, reloc=0x7f8cb3b04fc8, scope=0x7f8cb3b3f4f8, map=0x7f8cb3b3f170) at ../sysdeps/x86_64/dl-machine.h:300 #3 elf_dynamic_do_Rela (skip_ifunc=, lazy=, nrelative=, relsize=, reladdr=, scope=, map=0x7f8cb3b3f170) at ./elf/do-rel.h:147 #4 _dl_relocate_object (l=l@entry=0x7f8cb3b3f170, scope=, reloc_mode=, consider_profiling=, consider_profiling@entry=0) at ./elf/dl-reloc.c:301 #5 0x00007f8cb3b5e6e9 in dl_main (phdr=, phnum=, user_entry=, auxv=) at ./elf/rtld.c:2318 #6 0x00007f8cb3b5af0f in _dl_sysdep_start ( start_argptr=start_argptr@entry=0x7ffe17e402e0, dl_main=dl_main@entry=0x7f8cb3b5c900 ) at ../sysdeps/unix/sysv/linux/dl-sysdep.c:140 #7 0x00007f8cb3b5c60c in _dl_start_final (arg=0x7ffe17e402e0) at ./elf/rtld.c:498 #8 _dl_start (arg=0x7ffe17e402e0) at ./elf/rtld.c:585 #9 0x00007f8cb3b5b4d8 in _start () from /lib64/ld-li nux-x86-64.so.2 #10 0x0000000000000002 in ?? () #11 0x00007ffe17e40fa1 in ?? () #12 0x00007ffe17e40fb0 in ?? () #13 0x0000000000000000 in ?? () NOTE: _get_cpuid will call function 0xA710, whose purpose is to detect if we're at the right point to initialize the backdoor Why? Because elf_machine_rela will call _get_cpuid for both lzma_crc32 and lzma_crc64 . Since the modified code is part of lzma_crc64 , 0xA710 has a simple call counter in it to trace how many times it has been called, and make sure the modification doesn't trigger for lzma_crc32 . first call (0): -> lzma_crc32 second call (1): -> lzma_crc64 if ( call_counter == 1 ) { /** NOTE: some of these fields are unverified and guessed **/ rootkit_ctx . head = 1LL ; memset ( & rootkit_ctx . runtime_offset , 0 , 32 ); rootkit_ctx . prev_got_ptr = prev_got_ptr ; backdoor_init ( & rootkit_ctx , prev_got_ptr ); // replace cpuid got entry } ++ call_counter ; cpuid ( a1 , & v5 , & v6 , & v7 , & rootkit_ctx ); return v5 ; } At this point, you can issue detach and attach with other debuggers if needed. Once attached, set relevant breakpoints and restore the original bytes ("\xF3\x0F\x1E\xFA") breakpoint on RSA_public_decrypt hook Run this gdb script on the sshd listener process (this new gdbinit script should account for eventual differences in library load address - it didn't happen for me in the first tests but it did later on) set pagination off set follow-fork-mode child catch load # now we forked, wait for lzma catch load liblzma c # now we have lzma # 0x12750: offset from base hbreak *(lzma_crc32 - 0x2640 + 0x12750) set disassembly-flavor intel set pagination on c Now connect via https://gist.github.com/keeganryan/a6c22e1045e67c17e88a606dfdf95ae4 ... Thread 3.1 "sshd" hit Breakpoint 1, 0x00007ffff73d1d00 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5 (gdb) bt #0 0x00007ffff73d1d00 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5 #1 0x00007ffff73d1ae7 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5 <-- Llzma_index_prealloc_0 (offset 0x48 in vtable) #2 0x00005555556bdd00 in ?? () #3 0x0000000100000004 in ?? () #4 0x00007fffffffdeb0 in ?? () #5 0x00000001f74b5d7a in ?? () #6 0x0000000000000000 in ?? () RSA_public_decrypt GOT hook (Llzma_index_prealloc_0) /** the following happens during pubkey login **/ params [ 0 ] = 1 ; // should we call original? // this call checks if the supplied RSA key is special result = installed_func_1 ( rsa_key , global_ctx , params ); // if still 1, the payload didn't trigger, call the original function // if 0, bypass validation if ( params [ 0 ] ) return real_RSA_public_decrypt ( flen , from , to , rsa_key ); return result ; Binary patch for sshd to disable seccomp and chroot (allows Frida tracing of [net] processes) > fc /b sshd sshd_patched Comparing files sshd sshd_patched 0001332A: 75 90 0001332B: 6D 90 ---- 0004FC24: 41 C3 0004FC25: 54 90 ---- 00109010: 01 00 0001332A: changes the following JMP to not be taken: https://github.com/openssh/openssh-portable/blob/43e7c1c07cf6aae7f4394ca8ae91a3efc46514e2/sshd.c#L448-L449 0004FC24: changes the ssh_sandbox_child function to be a no-op: https://github.com/openssh/openssh-portable/blob/43e7c1c07cf6aae7f4394ca8ae91a3efc46514e2/sandbox-seccomp-filter.c#L490 00109010: changes the default value of privsep_chroot from 1 to 0 (probably redundant, since it gets overwritten) Raw XZ Backdoor Analysis This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters XZ Backdoor symbol deobfuscation. Updated as i make progress Copy link Copy Markdown q3k commented Mar 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . Llzma_index_stream_size_1 / installed_func_1 is the RSA_public_decrypt hook/detour. --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown lockness-Ko commented Mar 31, 2024 Currently working on making a honeypot to detect who is poking this backdoor! Your analysis has been amazingly helpful! If you haven't already, take a look at bpftrace's uprobes and uretprobes for hooking functions. https://github.com/bpftrace/bpftrace/blob/master/man/adoc/bpftrace.adoc#probes https://github.com/lockness-Ko/xz-vulnerable-honeypot --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown Klotzi111 commented Mar 31, 2024 I have not looked in the binary myself yet. I am wondering are the symbol names actually backdoor_ctx_save and backdoor_init ? Or are these name from you? Because if those are the original names: Why would he call them backdoor? Thats not very clever. Could have been easily found by dumping all symbols. --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown Trimester6 commented Mar 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . Currently working on making a honeypot to detect who is poking this backdoor! It's easy, CN and SG, simple as, they will route to SG if CN is geofenced --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown Author smx-smx commented Mar 31, 2024 I have not looked in the binary myself yet. I am wondering are the symbol names actually backdoor_ctx_save and backdoor_init ? Or are these name from you? Because if those are the original names: Why would he call them backdoor? Thats not very clever. Could have been easily found by dumping all symbols. Nobody has the original names (if they do, they are in Jia Tan 's crew). I am reconstructing/guessing them by looking at the disassembled code (statically/dynamically). --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown NuLL3rr0r commented Mar 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . No doubt the malicious actor is smart enough to steal someone else's identity. --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown JPeisach commented Mar 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . Taking a look now, thanks for documenting this so I can find a place to start. .Llzma_index_init.0 seems to be the thing seeing if installed function 1 is not there yet. - proposed name: apply_rsa_decrypt? void . Llzma_index_init . 0 ( long param_1 , undefined8 param_2 , undefined8 param_3 , undefined8 param_4 ) { code * UNRECOVERED_JUMPTABLE ; undefined4 local_1c ; if ((( global_ctx != ( rootkit_ctx * ) 0x0 ) && ( global_ctx -> runtime_offset != 0 )) && ( UNRECOVERED_JUMPTABLE = * ( code * * )( global_ctx -> runtime_offset + 0x10 ), UNRECOVERED_JUMPTABLE != ( code * ) 0x0 )) { if ( param_1 != 0 ) { RSA_public_decrypt ( param_1 ,( long ) global_ctx , & local_1c ); } /* WARNING: Could not recover jumptable at 0x0010a3b4. Too many branches */ /* WARNING: Treating indirect jump as call */ ( * UNRECOVERED_JUMPTABLE )( param_1 , param_2 , param_3 , param_4 ); return ; } return ; } --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown JPeisach commented Mar 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . .text.parse_optiona , at 00107050, and in fake function .Llzma12_mode_map.part1 , before calling table_get in a while loop, it calls what I've named table_get_first_empty_index: long table_get_first_empty_index ( char * key ) { long index ; if ( * key != '\0' ) { index = 0 ; do { index = index + 1 ; } while ( key [ index ] != '\0' ); return index ; } return 0 ; } .Llzma_index_iter_rewind.cold just applies an entry ("internal") uint . Llzma_index_iter_rewind . cold ( uint param_1 , uint param_2 , uint param_3 , uint param_4 ) { undefined8 uVar1 ; int * unaff_retaddr ; uVar1 = apply_one_entry_internal ( 0 , unaff_retaddr , param_1 , param_2 , param_3 ); return ( uint ) uVar1 | param_4 ; } --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown oogali commented Mar 31, 2024 _get_cpuid is designed to look similar to __get_cpuid . The latter takes 5 arguments, as seen at https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/cpuid.h#L67 . The former, which comes from the injected object, file takes 6 arguments, with the latter being an address of a function frame offset. static inline bool _is_arch_extension_supported(void) { int success = 1; uint32_t r[4]; success = _get_cpuid(1, &r[0], &r[1], &r[2], &r[3], ((char*) __builtin_frame_address(0))-16); const uint32_t ecx_mask = (1 << 1) | (1 << 9) | (1 << 19); return success && (r[2] & ecx_mask) == ecx_mask; } --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown luvletter2333 commented Mar 31, 2024 @smx-smx Hi, I am also working on this and maybe there is an error in your report. void *get_lzma_allocator() { return get_lzma_allocator_addr() + 8; } char *get_lzma_allocator_addr() { unsigned int i; // [rsp+1Ch] [rbp-Ch] __int64 v2; // [rsp+20h] [rbp-8h] // Llookup_filter_part_0 holds the relative offset of `_Ldecoder_1` - 180h (0xC930) // by adding 0x160, it gets to 0xCA90 (Lx86_coder_destroy), which is subsequently used as scratch space // for creating the `lzma_allocator` struct (data starts after 8 bytes, at 0xCA98, which is the beginning of a .data segment) v2 = (__int64)Llookup_filter_part_0; for ( i = 0; i <= 11; ++i ) v2 += 32LL; return (char *)v2; } Actually the loop runs 12 times, so the offset is 12*32 = 0x180. So this function simply returns the address of _Ldecoder_1 (0xCAF0) a.rel.ro.decoders0:000000000000CAF0 _Ldecoder_1 The lzma_allocator is at 0xCAF8, so: alloc is Linit_pric_table_part_1 free is _Lstream_decode_1 --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown oogali commented Mar 31, 2024 The symbol names extracted from the object file that are not present in the actual project: _get_cpuid crc_init init_pric_table lzma_block_encoder_update That comes from me manually walking through the symbol reference output of bingrep. for sym in $(bingrep trojan.o | grep SHT | grep '\.text\.' | grep -v rela | awk '{ print $2 }' | sed 's/\.text\.//; s/.$//' | sort | uniq) ; do echo "==> ${sym}" grep -r "${sym}" . echo done | less --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown sivizius commented Mar 31, 2024 Did you mean 0xA794 (5.6.1, not sure where it is located in 5.6.0) instead of 0xA7849 ? --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown Author smx-smx commented Mar 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . @luvletter2333 @smx-smx Hi, I am also working on this and maybe there is an error in your report. void *get_lzma_allocator() { return get_lzma_allocator_addr() + 8; } char *get_lzma_allocator_addr() { unsigned int i; // [rsp+1Ch] [rbp-Ch] __int64 v2; // [rsp+20h] [rbp-8h] // Llookup_filter_part_0 holds the relative offset of `_Ldecoder_1` - 180h (0xC930) // by adding 0x160, it gets to 0xCA90 (Lx86_coder_destroy), which is subsequently used as scratch space // for creating the `lzma_allocator` struct (data starts after 8 bytes, at 0xCA98, which is the beginning of a .data segment) v2 = (__int64)Llookup_filter_part_0; for ( i = 0; i <= 11; ++i ) v2 += 32LL; return (char *)v2; } Actually the loop runs 12 times, so the offset is 12*32 = 0x180. So this function simply returns the address of _Ldecoder_1 (0xCAF0) a.rel.ro.decoders0:000000000000CAF0 _Ldecoder_1 The lzma_allocator is at 0xCAF8, so: alloc is Linit_pric_table_part_1 free is _Lstream_decode_1 Oops, you're right, i was off by one. That's what happens when you've been staring at the thing for too long 🙂 . Thanks for the correction --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown Author smx-smx commented Mar 31, 2024 Did you mean 0xA794 (5.6.1, not sure where it is located in 5.6.0) instead of 0xA7849 ? 0xA784, the '9' was spurious. Fixed --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown anarazel commented Mar 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . Force sshd to use the modified library with LD_PRELOAD env -i LC_LANG=C LD_PRELOAD=$PWD/liblzma.so.5.6.1 /usr/sbin/sshd -h FWIW, I may have observed some minor behavioral differences between the normal initialization order and when liblzma.so.5.6.1 is loaded earlier due to LD_PRELOAD. Probably just noise, but may be worth to instead use LD_LIBRARY_PATH. Requires a symlink from liblzma.so.5 to to the modified liblzma.so.5.6.1 in the directory with the modified liblzma, of course. --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown tux3 commented Mar 31, 2024 To remove the env var and argv0 check, patch function 0x3A10 (in the 5.6.1.o) to always return 1 E.g. an EB F7 on top of xor eax, eax does the trick. (Note I work directly on a liblzma.so instead of the crc64_fast object, so I'm not 100% sure on the function offsets, but it's the function that does cmp eax, 0x0108 ) --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown sivizius commented Mar 31, 2024 Could you somehow keep track of re-renamings of labels or state the offsets? 👉 👈 --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown badactress commented Apr 1, 2024 .text.parse_optiona , at 00107050, and in fake function .Llzma12_mode_map.part1 , before calling table_get in a while loop, it calls what I've named table_get_first_empty_index: That's a name based on the context? Otherwise the behaviour looks like strlen() . long table_get_first_empty_index ( char * key ) { long index ; if ( * key != '\0' ) { index = 0 ; do { index = index + 1 ; } while ( key [ index ] != '\0' ); return index ; } return 0 ; } --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown goabout2 commented Apr 1, 2024 the Dynamic analysis should in debian? --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown Author smx-smx commented Apr 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . the Dynamic analysis should in debian? Yes, everything was done on Debian Testing --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown luvletter2333 commented Apr 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page . I post my analysis at https://github.com/luvletter2333/xz-backdoor-analysis . My analysis is currently limited to the structure and prototype recovering and has not reached the hook part. --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown fkirc commented Apr 2, 2024 Thank you for your analysis. I have a question regarding the write-execute memory protection: I understand that xz was mapped into the address-space of sshd. However, I still don’t understand how it is possible to overwrite a function of sshd. Until now, I thought that all modern Linux-systems are enforcing a write-xor-execute policy: https://en.m.wikipedia.org/wiki/W%5EX This means that every page of memory could be writable or executable, but not both at the same time. So how on earth it is possible that some malicious library is overwriting code of sshd? I thought that every attempt to overwrite code of sshd should lead to an immediate segfault due to the write-xor-execute-policy of the operating system? --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown PluMGMK commented Apr 2, 2024 So how on earth it is possible that some malicious library is overwriting code of sshd? I thought that every attempt to overwrite code of sshd should lead to an immediate segfault due to the write-xor-execute-policy of the operating system? It doesn't overwrite code per se, just function pointers. The pointers themselves aren't executable code, they just provide the address of the executable code. As for why the function pointers aren't read-only, it's because this attack works at the ifunc-resolving stage, where glibc has got.plt mapped read-write so that it can replace the pointers to ifunc resolvers with the corresponding resolved function pointers (at least, if I've understood this correctly…). --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown anarazel commented Apr 2, 2024 So how on earth it is possible that some malicious library is overwriting code of sshd? I thought that every attempt to overwrite code of sshd should lead to an immediate segfault due to the write-xor-execute-policy of the operating system? It doesn't overwrite code per se, just function pointers. The pointers themselves aren't executable code, they just provide the address of the executable code. As for why the function pointers aren't read-only, it's because this attack works at the ifunc-resolving stage, where glibc has got.plt mapped read-write so that it can replace the pointers to ifunc resolvers with the corresponding resolved function pointers (at least, if I've understood this correctly…). It doesn't even overwrite them during the ifunc resolution itself - it can't, because when liblzma is loaded the main binary isn't yet mapped. And they wanted to redirect the main binary's use of RSA_public_decrypt, and the GOT for that is mapped alongside the main binary. That's why they install the dl-audit hook, it gets called later when the relocations in the main binary are being processed. --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown fkirc commented Apr 3, 2024 So how on earth it is possible that some malicious library is overwriting code of sshd? I thought that every attempt to overwrite code of sshd should lead to an immediate segfault due to the write-xor-execute-policy of the operating system? It doesn't overwrite code per se, just function pointers. The pointers themselves aren't executable code, they just provide the address of the executable code. As for why the function pointers aren't read-only, it's because this attack works at the ifunc-resolving stage, where glibc has got.plt mapped read-write so that it can replace the pointers to ifunc resolvers with the corresponding resolved function pointers (at least, if I've understood this correctly…). It doesn't even overwrite them during the ifunc resolution itself - it can't, because when liblzma is loaded the main binary isn't yet mapped. And they wanted to redirect the main binary's use of RSA_public_decrypt, and the GOT for that is mapped alongside the main binary. That's why they install the dl-audit hook, it gets called later when the relocations in the main binary are being processed. Can you elaborate more about what is a “dl-audit hook“ and how is it possible to overwrite function pointers in the GOT? How does it even find the GOT of sshd if ASLR is enabled? I thought that every library has its own GOT and doesn’t know where the GOTs of other libraries are located in the virtual memory? --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown step21 commented Apr 3, 2024 AFAIK pointers are not ‘overwritten’. This ifunc resolution is ordinarily used to select different functions f e based on the architecture it is run on. So this is a glibc functionality and it doesn’t need to disable aslr or similar. --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Copy link Copy Markdown Xiphoseer commented Apr 10, 2024 https://discord.gg/TSD7H8Ww Is expired --> Sorry, something went wrong. Uh oh! There was an error while loading. Please reload this page . Sign up for free to join this conversation on GitHub . Already have an account? Sign in to comment You can’t perform that action at this time.