This particular binary was what I spent most of my time on. HackIM used an adaptive scoring engine this year, and this challenge ended up being worth 497 points out of a possible 500. It was a strong-scoring challenge with only 11 total solves; I came in 6th on it.
Challenge text:
I opened this in GHIDRA but it crashed. halp pls
Basic info:
obfusc8_much: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, stripped
Giving it a run:
./obfusc8_much
test
0
Trying other inputs produced no change in the output. Running it under ltrace, we can find a couple of notable things:
ltrace ./obfusc8_much
[pid 27946] [0x7f9cf593eec9] SYS_brk(0) = 0x17d8000
[pid 27946] [0x7f9cf59327de] SYS_access("/etc/ld.so.nohwcap", 00) = -2
[pid 27946] [0x7f9cf593fe27] SYS_access("/etc/ld.so.preload", 04) = -2
[pid 27946] [0x7f9cf593fcdd] SYS_openat(0xffffff9c, 0x7f9cf5944428, 0x80000, 0) = 3
[pid 27946] [0x7f9cf593fc43] SYS_fstat(3, 0x7ffd94620e20) = 0
[pid 27946] [0x7f9cf593ff43] SYS_mmap(0, 0x100f7, 1, 2) = 0x7f9cf5b39000
[pid 27946] [0x7f9cf593fed7] SYS_close(3) = 0
[pid 27946] [0x7f9cf593b139] SYS_access("/etc/ld.so.nohwcap", 00) = -2
[pid 27946] [0x7f9cf593fcdd] SYS_openat(0xffffff9c, 0x7f9cf5b4cdd0, 0x80000, 0) = 3
[pid 27946] [0x7f9cf593fda4] SYS_read(3, "\177ELF\002\001\001\003", 832) = 832
[pid 27946] [0x7f9cf593fc43] SYS_fstat(3, 0x7ffd94620e80) = 0
[pid 27946] [0x7f9cf593ff43] SYS_mmap(0, 8192, 3, 34) = 0x7f9cf5b37000
[pid 27946] [0x7f9cf593ff43] SYS_mmap(0, 0x3f0ae0, 5, 2050) = 0x7f9cf5532000
[pid 27946] [0x7f9cf593fff7] SYS_mprotect(0x7f9cf5719000, 2097152, 0) = 0
[pid 27946] [0x7f9cf593ff43] SYS_mmap(0x7f9cf5919000, 0x6000, 3, 2066) = 0x7f9cf5919000
[pid 27946] [0x7f9cf593ff43] SYS_mmap(0x7f9cf591f000, 0x3ae0, 3, 50) = 0x7f9cf591f000
[pid 27946] [0x7f9cf593fed7] SYS_close(3) = 0
[pid 27946] [0x7f9cf5924024] SYS_arch_prctl(4098, 0x7f9cf5b384c0, 0x7f9cf5b38e10, 0x7f9cf5b37998) = 0
[pid 27946] [0x7f9cf593fff7] SYS_mprotect(0x7f9cf5919000, 16384, 1) = 0
[pid 27946] [0x7f9cf593fff7] SYS_mprotect(0x7c2000, 4096, 1) = 0
[pid 27946] [0x7f9cf593fff7] SYS_mprotect(0x7f9cf5b4a000, 4096, 1) = 0
[pid 27946] [0x7f9cf593ffd7] SYS_munmap(0x7f9cf5b39000, 65783) = 0
[pid 27946] [0x4005c9] memset(0x7ffd946217a0, '\0', 71) = 0x7ffd946217a0
[pid 27946] [0x4005de] __isoc99_scanf(0x5c2b74, 0x7ffd946217a0, 71, 0 <unfinished ...>
[pid 27946] [0x7f9cf56417c3] SYS_fstat(0, 0x7ffd945fbbb0) = 0
[pid 27946] [0x7f9cf56484b9] SYS_brk(0) = 0x17d8000
[pid 27946] [0x7f9cf56484b9] SYS_brk(0x17f9000) = 0x17f9000
[pid 27946] [0x7f9cf5642081] SYS_read(0test
, "test\n", 1024) = 5
[pid 27946] [0x4005de] <... __isoc99_scanf resumed> ) = 1
[pid 27946] [0x5c2add] printf("%d", 0 <unfinished ...>
[pid 27946] [0x7f9cf56417c3] SYS_fstat(1, 0x7ffd945f8530) = 0
[pid 27946] [0x5c2add] <... printf resumed> ) = 1
[pid 27946] [0x7f9cf5642154] SYS_write(1, "0", 10) = 1
[pid 27946] [0x7f9cf5642217] SYS_lseek(0, -1, 1) = -29
[pid 27946] [0x7f9cf5616e06] SYS_exit_group(0 <no return ...>
[pid 27946] [0xffffffffffffffff] +++ exited (status 0) +++
Specifically:
- 0x4005c9 == Memset'ing a buffer (getting it ready for use)
- 0x4005de == Scanf'ing into that buffer (you can tell via the use of the same memory address as memset)
- 0x5c2add == Printing our failure. Note it's a %d, so we probably should not be expecting anything other than an int at that point.
At this point, I decided to open it up in radare2 (r2 -A ./obfusc8_much). After seeking over to main, I was greeted with a slew of identified variables. For those unaware, one of the analysis passes in radare2 (and most major disassemblers) identifies variables on the stack. It does this by emulating execution (in r2's case, using an intermediate language called ESIL) and watching for stack-based reads and writes. When it discovers one, it marks that stack offset as a possible variable. It's not uncommon to see upwards of 20 of these for longer functions. For the main function here, however, r2 discovered 729 local variables (afi~locals).
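For reference, that session looks roughly like this (a sketch, not a capture; the prompt address and exact afi output format vary by r2 version):
$ r2 -A ./obfusc8_much
[0x00400590]> s main
[0x00400590]> afi ~locals
locals: 729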
In fact, the visual control flow graph for this function is so large that IDA Pro refused to render it at all. Radare2, on the other hand, puts on its "no fear" hat and boldly builds the graph for you. Surprisingly, after about 30 seconds, it does pop up the graph and you can somewhat navigate around. This turns out to be unimportant, however, as the graph is far too large to actually see or understand. This is because we keep running into things called opaque predicates.
Without getting into the details of opaque predicates too much, the concept is that to the compiler (and disassembler), the action being performed seems like it could go either way, when in reality it will always end up going the same direction. Think of it like when your boss asks you to come in on the weekend. Technically, it looks like you have an option to not come in, but really it's not conditional, it's determined ahead of time.
Here's an example that repeats throughout the binary:
│ ╭─> 0x004007b5 b001 mov al, 1
│ ╎ 0x004007b7 84c0 test al, al
│ ╭──< 0x004007b9 0f8536000000 jne 0x4007f5 ;[3]
│ ╭───< 0x004007bf e900000000 jmp 0x4007c4 ;[4]
In this case, the binary is moving the static value 1 into the register al. It is then testing whether register al is equal to zero. Well of course it's not, we just moved 1 in there! Finally, it conditionally jumps to 0x4007f5. This type of predicate jump is actually pretty easy for disassemblers to optimize away, as the value is used immediately after it is set. However, there are other tricks that are far more complicated for disassemblers and decompilers to identify and optimize. For more information on the predicates used in this binary, see the author's writeup (link below).
At this point, I briefly attempted a basic run of angr through the application with the hope that magic would ensue. Of course it did not: angr hit the huge state-explosion problem and (surprisingly slowly) filled up my RAM. I will return to angr in a second, but first I took a brief detour.
In my frustration with the predicates, I decided to try to simply stamp them out. While this effort didn't actually have a meaningful effect, it was nice to see just how easy mass binary editing can be with radare2. To find the locations I wanted to kill, objdump was actually easiest:
objdump -M intel -d ./obfusc8_much | grep -A 1 "test al,al" | grep jne | cut -f 1 -d ":" | tr -d ' ' | tr '\n' ','
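That pipeline emits a comma-separated list of bare hex addresses, while the patch script below reads a JSON list. A small glue step bridges the two; something like this works (addrs.txt is just an illustrative filename for the redirected pipeline output):
#!/usr/bin/env python
# Hypothetical glue step: convert the comma-separated hex addresses emitted by
# the objdump pipeline (saved to addrs.txt) into the JSON list the patcher reads.
import json

with open('addrs.txt') as f:
    # The pipeline leaves a trailing comma, so strip it before splitting
    addrs = [int(a, 16) for a in f.read().strip().rstrip(',').split(',')]

with open('addrs_to_fix.json', 'w') as f:
    json.dump(addrs, f)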
Since the jmp target addresses change from site to site, radare2's assembly patch option is super handy: you write the assembly you want stamped in, and radare2 assembles it and writes the corresponding bytes for you. With this in mind, patching 3600 locations simply becomes about 18 lines of Python.
#!/usr/bin/env python
import r2pipe
import json
import shutil

with open('addrs_to_fix.json', 'r') as f:
    addrs_to_fix = json.loads(f.read())

# Patch a copy so the original binary stays intact
shutil.copy('./dist', './dist.dork')

r2 = r2pipe.open('./dist.dork')
r2.cmd('oo+')  # reopen the file in read-write mode

for addr in addrs_to_fix:
    print('Fixing ' + hex(addr))
    # Read the conditional jump's target...
    old_inst = r2.cmdj('pij 1 @ {}'.format(addr))
    old_jmp = old_inst[0]['jump']
    # ...and stamp an unconditional jmp to that target over it
    r2.cmd('wa jmp {} @ {}'.format(old_jmp, addr))
Unfortunately, as stated up front, that actually didn't help angr at all, since angr is smart enough to not take the wrong branch. It did, however, change up the control flow graph that radare2 created, but not enough to really help me understand the flow any better.
I also checked for instruction-counting leaks. When people write software, oftentimes they want it to be efficient and fast. This, as it turns out, can be a serious security flaw, as it can allow an iterative approach to breaking the protection. This type of fail-fast behavior has previously been abused in login systems to recover the correct password character by character. To check for it, I used a core Linux tool called perf. If you're unfamiliar with perf, it exposes performance tracking information such as instruction counts, branches taken, cache hits/misses, etc. It comes in handy for these types of attacks since it does not impose any significant performance cost on the running binary, and, unlike valgrind, it does not even need to modify the binary.
$ sudo perf stat -e instructions:u ./obfusc8_much
test
0
Performance counter stats for './obfusc8_much':
181,631 instructions:u
0.504198802 seconds time elapsed
$ sudo perf stat -e instructions:u ./obfusc8_much
hest
0
Performance counter stats for './obfusc8_much':
181,952 instructions:u
2.496059931 seconds time elapsed
Note here that 'test' as an input caused 181,631 instructions to be run, while 'hest' caused 181,952. This is a bit of a dirty trick, since we happen to know the flag format and that the flag should start with 'hackim19{'. During the CTF, I actually switched back to working with angr at this point, though I should have followed this path further. After the CTF ended, I wrote up a dynamic solver using perf as above. It gets you close, and likely with some modifications to the script you could recover the real flag; however, instruction counting in this manner isn't perfect.
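The core of that idea looks something like the following sketch (this isn't my actual solver; the helper names are mine, and it assumes perf counters are readable, e.g. running as root or with kernel.perf_event_paranoid relaxed):
#!/usr/bin/env python3
# Hypothetical instruction-counting brute forcer. Assumes the binary reads the
# guess from stdin and that a correct next character executes measurably more
# instructions, as the 'test' vs 'hest' runs above suggest.
import string
import subprocess

BINARY = './obfusc8_much'

def count_instructions(guess):
    # '-x ,' makes perf stat emit CSV on stderr; the count is the first field.
    p = subprocess.run(
        ['perf', 'stat', '-x', ',', '-e', 'instructions:u', BINARY],
        input=guess + '\n', capture_output=True, text=True)
    return int(p.stderr.strip().splitlines()[-1].split(',')[0])

flag = 'hackim19{'  # known flag format
while not flag.endswith('}'):
    # Greedily keep whichever next character drives the count highest
    best = max(string.ascii_letters + string.digits + '_}',
               key=lambda c: count_instructions(flag + c))
    flag += best
    print(flag)
In practice the counts are a little noisy, so you would likely want to average several runs per guess; as noted, this gets close but isn't perfect.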
Circling back to angr, I decided I needed to provide angr more information for it to be able to find the solution. The main issue with how this program executes is that the 'wrong answer' path still has a bunch of branches after it. To find a way around this, I wanted to see how the binary stores its success-or-failure variable. In the ltrace run above, we discovered our print statement at 0x5c2add. Let's look just before it to identify what variable it's using.
[0x005c2add]> pdi 20 @ 0x5c2add-0x31
0x005c2aac 8bb5b8adfdff mov esi, dword [rbp - 0x25248]
0x005c2ab2 8916 mov dword [rsi], edx
0x005c2ab4 488b85b8adfdff mov rax, qword [rbp - 0x25248]
0x005c2abb 8b08 mov ecx, dword [rax]
0x005c2abd 894dec mov dword [rbp - 0x14], ecx
0x005c2ac0 837dec00 cmp dword [rbp - 0x14], 0
0x005c2ac4 0f94c0 sete al
0x005c2ac7 2401 and al, 1
0x005c2ac9 0fb6f0 movzx esi, al
0x005c2acc 48bf772b5c0000000000 movabs rdi, 0x5c2b77
0x005c2ad6 b000 mov al, 0
0x005c2ad8 e893d9e3ff call sym.imp.printf
0x005c2add 31f6 xor esi, esi
0x005c2adf 8985bcacfdff mov dword [rbp - 0x25344], eax
0x005c2ae5 89f0 mov eax, esi
0x005c2ae7 4889ec mov rsp, rbp
0x005c2aea 5d pop rbp
0x005c2aeb c3 ret
Looking back from the printf call, we can see the value being printed comes from [rbp - 0x14]. Let's validate this by looking for the other places this variable is accessed.
[0x00400590]> pdi 0x10000| grep -C 3 -i '\[rbp - 0x14\]'
0x004005a1 c745fc00000000 mov dword [rbp - 4], 0
0x004005a8 897df8 mov dword [rbp - 8], edi
0x004005ab 488975f0 mov qword [rbp - 0x10], rsi
0x004005af c745ec00000000 mov dword [rbp - 0x14], 0
0x004005b6 4889c7 mov rdi, rax
0x004005b9 89ce mov esi, ecx
0x004005bb ba47000000 mov edx, 0x47
--
0x0040a760 488b85f8f1ffff mov rax, qword [rbp - 0xe08]
0x0040a767 833800 cmp dword [rax], 0
0x0040a76a 0f84d0330000 je 0x40db40
0x0040a770 8b45ec mov eax, dword [rbp - 0x14]
0x0040a773 4889e1 mov rcx, rsp
0x0040a776 4883c1f0 add rcx, 0xfffffffffffffff0
0x0040a77a 4889cc mov rsp, rcx
--
0x0040db2c 418911 mov dword [r9], edx
0x0040db2f 488b85b8f1ffff mov rax, qword [rbp - 0xe48]
0x0040db36 8b08 mov ecx, dword [rax]
0x0040db38 894dec mov dword [rbp - 0x14], ecx
0x0040db3b e96a340000 jmp 0x410faa
0x0040db40 8b45ec mov eax, dword [rbp - 0x14]
0x0040db43 4889e1 mov rcx, rsp
0x0040db46 4883c1f0 add rcx, 0xfffffffffffffff0
0x0040db4a 4889cc mov rsp, rcx
--
0x00410f9c 890a mov dword [rdx], ecx
0x00410f9e 488b85d8edffff mov rax, qword [rbp - 0x1228]
0x00410fa5 8b08 mov ecx, dword [rax]
0x00410fa7 894dec mov dword [rbp - 0x14], ecx
0x00410faa 0fbe45a1 movsx eax, byte [rbp - 0x5f]
0x00410fae 89c1 mov ecx, eax
0x00410fb0 0fafc9 imul ecx, ecx
--
<clipped>
We can see there are a bunch of places that end up setting that variable, all using the same pattern of comparing a value at some location and then storing into [rbp - 0x14]. Now that we know where the variable lives, we can tell angr to explicitly avoid any path that causes it to be set to failure. This will let us avoid the state explosion we were seeing before.
In angr, while you normally feed instruction addresses into the find and avoid params of explore, you can also provide a callable. When you do, that function is fed individual state objects and must return a boolean indicating whether that state should be considered found/avoided. In our case, it looks something like this:
def avoid(s):
    global failed_addr

    # Haven't found the stack address yet
    if failed_addr is None:
        return False

    # Avoid states we cannot satisfy
    if not s.satisfiable(extra_constraints=(s.mem[failed_addr].dword.resolved == 0,)):
        print("Pruning state at: {}".format(s.ip))
        return True

    return False
The 'failed_addr' we discover dynamically during execution. Strictly speaking, I believe that address will be static across angr runs, but for my own sanity I discover it at run time. The meat of this is the s.satisfiable call, where I ask angr whether the failed variable can still be equal to zero. Once the variable has been set to a failure value, that is no longer possible, and avoid returns true.
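One way to do that discovery (not necessarily how my final script does it) is a zero-length hook on the instruction that initializes the variable; 0x4005af is the 'mov dword [rbp - 0x14], 0' from the disassembly above:
import angr

proj = angr.Project('./obfusc8_much', auto_load_libs=False)
failed_addr = None

# A zero-length hook runs our callback and then still executes the original
# instruction at 0x4005af, letting us record the concrete stack address.
def find_failed_addr(state):
    global failed_addr
    if failed_addr is None:
        failed_addr = state.solver.eval(state.regs.rbp - 0x14)

proj.hook(0x4005af, find_failed_addr, length=0)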
That was the major hurdle to overcome here. One other thing: you will want to keep your memory use down, as it can still creep up and crash the run by exhausting RAM. I've found two nice features for this: Spiller and auto_drop.
Spiller is an exploration technique that keeps only a given number of states in memory at any point in time. If you accumulate too many states, it automatically 'spills' them to disk and drops them from memory. This gives you a tradeoff between in-memory states (faster) and states on disk (slower, but you probably have more disk space).
Auto_drop is a feature of the SimulationManager that will, as the name implies, automatically drop paths in the given stashes. I have found that memory management in the angr/z3/python combination works better if you drop paths early rather than later. I have no hard evidence why, so take that as opinion. Regardless, setting this to auto-drop the avoid states (of which there will be many) helped alleviate memory consumption.
Finally, since I always end up worrying about my memory getting blown out, especially if I leave angr running in the background for a while, I created my own exploration technique, now merged into angr, called MemoryWatcher. MemoryWatcher lets you set the minimum amount of free memory your system can have before angr stops itself; by default, it triggers when 95% of your system's memory is in use. This way, even if the other mitigations aren't enough, MemoryWatcher can stop execution before you run out of memory. Note that it is by no means guaranteed to stop in time: it only checks at the beginning of a simgr step, so if a state eats a significant amount of memory within a single step, it's quite possible MemoryWatcher will fail. But hey, it's free and you get what you pay for...
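Wiring all three together looks roughly like this (a sketch; the find address is illustrative, and constructor arguments may differ between angr versions):
import angr
from angr.exploration_techniques import Spiller, MemoryWatcher

proj = angr.Project('./obfusc8_much', auto_load_libs=False)

# auto_drop discards states as soon as they land in the listed stashes
simgr = proj.factory.simulation_manager(proj.factory.entry_state(),
                                        auto_drop={'avoid'})

simgr.use_technique(Spiller())        # spill excess states to disk
simgr.use_technique(MemoryWatcher())  # stop stepping when free memory runs low

# avoid() is the callable defined earlier; 0x5c2add is the final printf
simgr.explore(find=0x5c2add, avoid=avoid)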
My solution took a few hours to run on my i7. It was certainly not the fastest, and, given the flag, it was not the intended approach. It seems the author assumed a dynamic approach would be taken. Oh well.
My solution: https://github.com/bannsec/CTF/blob/master/2019/hackim/obfusc8_much_win.py
Author's writeup: https://sudhackar.github.io/blog/writing-an-llvm-pass-for-ctfs