Harekaze 2019: Admin's Product Key

Details: Written by Michael Bann

Description: I forgot admin’s product key…

product_key

First off, this challenge was listed as 200pts. I was the 7th to solve it with about 45 minutes left in the contest. Other 200 point challenges had many more solves (for instance Baby ROP was 200pt and had 80+ solves). Unfortunately, I got sucked into this and needed to finish it for my own sanity. That said, it was good challenge, just not realistically 200 points.

An initial inspection of the binary shows that it's a 64-bit ELF. Running it we find that we're supposed to put the key in as the first argument to the binary:

$ ./product_key
Usage: ./product_key <product key>

We're not really given any more information about the key. For instance, what is the format of this key? I decided to jump right into the code. Here's the Ghidra decompilation of the main function. The first thing you'll probably notice is the following:

if (sVar5 == 0x27)

So we now know that our input needs to be 39 bytes in length. Is all of that the license? Directly below it, you will see this code:

__s = strtok(__s,"-");
pcVar2 = table;
while (__s != (char *)0x0) {
    table = pcVar2;
    __strncat_chk(&local_48,__s,4);
    __s = strtok((char *)0x0,"-");
    pcVar2 = table;
}

This is a bit to unpack. There's a reference to some table symbol. Then, we start looping through the tokenization of our key. Specifically, we're tokenizing on '-', which means our input probably looks like 'something-something-something'. We also have strncat being called, which means we're copying some of the key somewhere else. The hint there is that we're copying up to 4 chars at a time. Put this together and we can guess our key is supposed to look like 'AAAA-AAAA-AAAA-AAAA-AAAA-AAAA-AAAA-AAAA'. Running it again with this as our input we can confirm this is right:

./product_key AAAA-AAAA-AAAA-AAAA-AAAA-AAAA-AAAA-AAAA
invalid product key

Skimming further down, the next thing that caught my attention was this:

do {
inv_table[(long)pcVar2[lVar6]] = (char)lVar6;
lVar6 = lVar6 + 1;
} while ((ulong)(~uVar4 - 2) + 1 != lVar6);

So we had reference to a table previously, and now inv_table. The interesting (and later a bit irritating) thing is that this is being built at run-time. There's really no need for that. It could just as easily be built in at compile time, but i guess the authors wanted to be cute here. Regardless, it was worth keeping in mind that there's both a forward and reverse lookup table, hinting that the key is going through a lookup table transformation.

Next we hit the decode function. The function is taking in a pointer to my consolidated key buffer as well as a pointer to nothing in particular. Ghidra tells us that the return value from this function is not being used, therefore either it's a dud, or the function is going to be changing the state of the program when run. Also, given it's taking in a otherwise unused pointer that appears to be used after, it's a good bet that memory pointed to by this pointer will be updated by the function.

To be honest, I spent a while trying to understand this code manually. Obvious references to the inversion table, and taking characters physically next to each other and doing things. I was able to confirm that it was building an output (or 'decoding' i guess) into the second pointer that was passed in. After a while I just got irritated and bored with it. For reference, here's the Ghidra decompilation.

Since I knew enough about the program at this time, I went ahead and gave angr a run at it. Effectively just using r2 and symbion to jump angr's execution after the strok was taken care of. However, angr just basically state exploded on me so i killed it shortly after.

Seeing as I didn't want to look at decoded anymore, i started looking at what happens after decoded. The next few lines made more sense when stepping through with a debugger:

iVar3 = 0;
piVar7 = &local_68;
do {
    piVar8 = (int *)((long)piVar7 + 1);
    iVar3 = (int)*(char *)piVar7 + iVar3 * 0x1f;
    piVar7 = piVar8;
} while (&local_58 != piVar8);
if (local_58 == iVar3)

So stepping through this in a debugger, we begin to understand what's going on. That buffer that the decode function filled up is 20 bytes long. This loop is performing some form of integrity check on the first 16 bytes of the decoded output, and comparing what it calculated with the last 4 bytes of your decoded input. This means that we will not only have to get the right user, but we will also have to make sure the checksum works and calculates correctly.

Finally, we run into the last check.

if (local_58 == iVar3) {
    __printf_chk(1,"Hello, %.16s!\n",&local_68);
    uVar4 = strncmp((char *)&local_68,"i-am-misakiakeno",0x10);
    uVar9 = (ulong)uVar4;
    if (uVar4 == 0) {
        __printf_chk(1,"You are admin! The flag is: HarekazeCTF{%s}\n",&local_48);
    }
    else {
        uVar9 = 0;
        puts("You are not admin.");
    }

This part is at least strait forward. We're being told that it's possible to validate but validate as the wrong user. We have to ensure that our key is for 'i-am-misakiakeno'. At this point, I didn't feel like statically attacking that decode anymore. However, it was time to dynamically attack it. Time to break out radare2.

Out of all the things to use r2 for, it's python bindings are pretty helpful. Of course, it requires you're already familiar with r2, but if you are it can make creating dynamic and even code driven static harnesses a lot easier. In this case, I wanted to try different inputs into decode and see if there was any glaring weaknesses. I wrote some harness functions in python that allowed me to try input and only grab the result of decode. I did this by programmatically setting a breakpoint at the end of decode and pulling the buffer out of memory.

If you're interested in the code, i've posted it here.

After trying different input, I quickly noticed that changes in one area were not propagating to other areas. Only those characters within 2-3 (or later I'd discover 4) spaces of each other affected each other. This means that I could effectively brute force only a couple characters at a time, making the discovery of the key more reasonable. Before I could start on that, I still needed to know the proper checksum to put at the end of the user 'i-am-misakiakeno'.

I considered reversing how the checksum works, but i already had a radare2 python framework up, so it was just easier to do basically the same thing i did was the decode function, but this time with the checksum. In my python file, you'll see it called "calculate_checksum". It's pretty strait forward, simply setting a breakpoint right before the checksum, writing into memory what I want decoded to be, and then pulling out of eax what the correct checksum is supposed to be.

With all this information, i was now able to start attempting to brute force what the key was. Initially, I tried writing the brute forcer using radare2 commands. Unfortunately, while it would work, it would take FAR too long. I attempted to keep the program running and simply jump back to the start of decode each time (saving on process startup time), but that ended up segfaulting. Remembering how Frida had saved the day previously in situations like these where speed is important, I started developing a solution for Frida.

I'm still fairly novice with Frida, but I was able to write a nested for loop that would try to force 3 characters (and at one point needed 4) at a time. The nice thing with Frida is it will generally be faster than python or other such solution since it's executing inside the binary itself. It also is stable and generally doesn't cause issues with the binary. One problem I ran into with this, however, was that inverse table. Remember that from above? That table is built at run time. Initially, I wanted to just pull the table out of memory and write it into the binary, but that sounded like too much work. Because the table gets set up in main, there's no good way to simply call that as a function prior to brute forcing. My solution was quick and dirty since it was for a CTF. Effectively, I just wrote a jump command into the binary after the binary setup it's inverse table. The jump command just went back one instruction, and so it effectively stalled the program (and pegged my CPU). However, this meant that the binary was running and waiting to be hooked into by Frida, with all the decoding tables and everything ready to go.

Since my JavaScript foo isn't very good, I basically used just enough JavaScript to have Frida print out the correct key three characters at a time. Starting at the very end of the key and working towards the beginning. Each cluster of 3-4 characters probably took about 1-2 minutes on average. While not super fast, it was fast enough to break the key.

If you haven't played with Frida, you really should check it out. This just scratches the surface of things it can do. Here's my frida script.

$ ./product_key H6AA-NWCH-HK7V-0JII-HU4A-A3IQ-HBWT-\V514
Hello, i-am-misakiakeno!
You are admin! The flag is: HarekazeCTF{H6AANWCHHK7V0JIIHU4AA3IQHBWTV514}