OpenCTF 2016: neophyte

Details: Written by Michael Bann

This reversing challenge is a good example of how you can solve a problem a few different ways. I initially solved this challenge symbolically (which i believe is the easiest way, actually). However, the challenge can also be solved dynamically which is what the authors intended. I will go over both solutions here.

$ file neophyte_reversing_ccabcc8f0b9900638a75017f2d6dc029 
neophyte_reversing_ccabcc8f0b9900638a75017f2d6dc029: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.24, BuildID[sha1]=f382dd94583c7310bc8b3dd538e9e604f5a6ee38, stripped

Another x86 binary. This time it's stripped which means we won't get to see the developer's function names. Let's run it:

$ ./neophyte_reversing_ccabcc8f0b9900638a75017f2d6dc029 
Usage:    
 ./neophyte_reversing <41 byte flag>

If we are to believe the help output, the flag is 41 bytes and is the first command line argument. Let's give it a shot:

$ ./neophyte_reversing_ccabcc8f0b9900638a75017f2d6dc029 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
no
$ ./neophyte_reversing_ccabcc8f0b9900638a75017f2d6dc029 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
size differs 40 vs 41!
no

Looks legit. If we give it 41 chars it says no. 40 chars it tells us our length is off. Let's check out the control flow.

Starting at the top, we see the check for if the user put in a flag at all (compare argc with 1). It then creates some space, sets initial variables, and starts up a loop. This is a key part to notice. Let's walk through it.

0x80492ba -- Here we're at the conditional part of the loop. This is roughly equivalent to "if index <= 40". Note that our key is 41 and indexes start at 0, so here we're looping through every character of our input one at a time.

0x804927e -- The program is doing some sort of analysis at this point. I didn't dive into this part too much actually. However, this block makes a call to a function, then changes execution based on if the output is true or false. If false, if sets the "bIsCorrect" variable to False. Then it increments on to the next character.

0x80492c1 -- AFTER performing the check to see if your input is correct, it checks the length of your input. Kinda a strange place to put that code, but regardless that's where it will tell you if your length is incorrect.

0x8049306 -- Here we see the comparison to check if your input was correct. Note in 0x8049267, we set this variable to True, then set it to false later if we deem the input to actually be wrong. 0x804931b == no, 0x804930d == yes.

At this point, the main unknown is how the character check function works. However, we don't actually need to understand it. This is because we get feedback immediately after that function call, and the key check is incremental. This means we can actually just brute force it. Let's look at the symbolic approach first.

Symbolic Approach

This is the approach I took initially. As usual, my tool of choice for symbolic execution is angr. To be honest, the script is so simple there's not much to say about it here. The main thing is that we create a symbolic buffer for the input, and tell angr to find our winning location while avoiding the location we know to be failure (where it sets success variable to false).

import angr
import simuvex
import claripy

# Open up an angr project on this file
proj = angr.Project("./neophyte_reversing_ccabcc8f0b9900638a75017f2d6dc029")

# Using claripy to generate a symbolic buffer. We know this needs to be 41
# bytes in length
arg1 = claripy.BVS('arg1',8*41)

# Create the argv array which is just the binary name and our arg
args = ["./neophyte_reversing_ccabcc8f0b9900638a75017f2d6dc029",arg1]

# Create our start state. I remove lazy solves because i like less state
# splitting. You could likely run with without that option removed. Give the
# new state our args we created
state = proj.factory.entry_state(remove_options={simuvex.o.LAZY_SOLVES},args=args)

# Create the path group for exploration
pg = proj.factory.path_group(state)

# Explore. Find the puts("yes") location. Avoid the locations where we know we
# have failed (such as setting the success variable to False)
pg.explore(find=0x804930d,avoid=(0x804931b,0x80492ad))

# Print out our winning flag.
print("Flag: {0}".format(pg.found[0].state.se.any_str(arg1)))

It really is that easy with angr. Running it gives us the flag:

$ python exploit.py 
Flag: OpenCF{IhopeYOUdidThisInADebuggerScript}

Dynamic Approach

As you could see in the flag, the author intended this to be done by a debugger script. The symbolic approach is actually easier and faster, but let's do the dynamic one for kicks. For this, I initially tried to use radare2, however I ran into some bugs that caused me not to be able to complete it. I switched to using gdb scripting instead.

The general idea is this:

Set breakpoint at our failure location (0x80492ad)
Run the binary with some initial input (maybe all "A" values)
If we hit a failure, check what our index was at failure (esp+0x18) and increment that variable
Go to 2. Repeat until we don't hit failure anymore.

We can do this because the binary is checking the flag one character at a time and updating it's memory as it goes. So we know immediately when we have failed, and also at what point we've failed. That reduces the search space drastically. If we use lower case, upper case, numbers, and some symbols for our search space, we have a total search space of 78*41 == 3198 at worst case. That size search space is very doable for any modern machine.

Gdb exposes a gdb module for python. It has a lot of functionality that allows you to more programmatically interact with gdb. I've honestly never taken the time to learn most of it. The one command I use is called gdb.execute(). All it does is takes in your input and runs it as if you typed it into gdb directly. This way you don't really need to learn the gdb python bindings if you don't want to.

There seems to have been a little bug in this program. Not sure if it was intentional, but the "T" in OpenCTF didn't pass validation. The rest of the string works fine, so in this script I just bypass this index. Without this bug the dynamic script would be cleaner.

#!/usr/bin/env python

import gdb

keySpace = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!#$%&()+,-.:;_{|}~ "
#keySpace = [chr(x) for x in range(1,256) if x not in [ord("\""),ord("`")]]

# Gdb will make you press enter every page-full of data by default. Since we
# will be creating a bunch of text, let's turn this off
gdb.execute("set pagination off")

# Initial key
key = []
for i in range(41):
    # I set the value as the index to our keySpace, not an input value itself.
    # This simplifies incrementing through the search space a little
    key.append(0)

# Failure and win addresses
fail = 0x080492ad
win = 0x804930d

# Set the breakpoints
gdb.execute("break *{0}".format(hex(win)),True,True)
gdb.execute("break *{0}".format(hex(fail)),True,True)

# Need to keep track of which index we're working on
failIndex = 0
retry = False

# Loop!
while True:
    print("Trying {0}".format(''.join([keySpace[x] for x in key])))

    # This is a hack due to that problematic T character.
    if not retry:
        # (re)Start it up
        gdb.execute("r \"{0}\"".format(''.join([keySpace[x] for x in key])),True,True)
    else:
        # We're dynamically skipping a char. We need to sometimes just continue
        retry = False
        gdb.execute("c",True,True)

    # Grab eip
    eip = int(gdb.execute("p/d $eip",True,True).split(" ")[-1])
    # Did we win?
    if eip == win:
        print("Found winning key: {0}".format(''.join([keySpace[x] for x in key])))
        break

    # If it's not win or fail, assume we're failing on the same index
    if eip != fail:
        key[failIndex] += 1
        continue

    # What index failed? This looks up the index value the program is using at
    # this point in execution. It is read off of the stack using gdb.
    failIndex = int(gdb.execute("x/1d $esp+0x18",True,True).split("\t")[1])

    # There's a bug in this program. Let's push past it.
    if failIndex == 5:
        # Basically just skipping this and assuming success
        gdb.execute("set $eip = 0x80492b5",True,True)
        retry = True
        continue

    # At this point, we know our current character is wrong. Try the next.
    key[failIndex] += 1

There are two ways to run a script like this. In gdb you can run the command "python-interactive" and you will get a python shell with the gdb module exposed. Perhaps the better way to run a script like this is as an argument to gdb.

$ gdb -x solve_dynamic_gdb.py ./neophyte_reversing_ccabcc8f0b9900638a75017f2d6dc029

Sit back and watch text scroll :-)

OpenCTF 2016: neophyte_reversing

Symbolic Approach

Dynamic Approach