This challenge was lovingly called "Program Interactive Tracing as a Symbolic Service" (PITASS). They clearly came up with the acronym first... Regardless, we're given the following four files:

Dockerfile, headerquery, and pitass.py

The Dockerfile was mostly not useful given we were missing a bunch of it. Headerquery is an elf that we can run and pitass the python script that we get dropped into when we connect.


PITASS.py

It took a bit to get used to what pitass.py is doing, but in the end, most of the code in that file (or really all of it for this part) is just a harness that forces specific allowed interactions with angr. Here's an example run:

 

###############################################################################
###############################################################################
###############################################################################
####################################################################### WELCOME
############################################################################ TO
########################################################################### THE
####################################################################### PROGRAM
################################################################### INTERACTIVE
####################################################################### TRACING
########################################################################## AS A
###################################################################### SYMBOLIC
####################################################################### SERVICE
###############################################################################
###############################################################################
###############################################################################

        %%%%%%    %%%   %%%%%%%    %     %%%%%   %%%%%
        %     %    %       %      % %   %     % %     %
        %     %    %       %     %   %  %       %
        %%%%%%     %       %    %     %  %%%%%   %%%%%
        %          %       %    %%%%%%%       %       %
        %          %       %    %     % %     % %     %
        %         %%%      %    %     %  %%%%%   %%%%%

This service empowers the everyhacker to utilize the cutting-edge angr
binary analysis framework! Fear not: though angr is daunting, this service
scopes the challenge to the use of angr purely for symbolic tracing.
Don't be scared!

# Greetz

It is important to stress that we stand on the shoulders of giants [1,2,3,4].
Much of angr's design, implementation, and optimization was inspired by
the insights in those papers. Anyone interested in the functionality,
implications, and failure modes of symbolic execution should read those
papers.

[1] KLEE: Unassisted and Automatic Generation of High-Coverage Tests
    for Complex Systems Programs
[3] Billions and Billions of Constraints: Whitebox Fuzz Testing in Production
[4] Unleashing MAYHEM on Binary Code
[5] Enhancing Symbolic Execution with Veritesting

# Are you ready for the PITA?

Which binary do you want to trace?
1 <- false
2 <- headerquery
0 <- Done.
Choice: 2
Loading binary /bins/headerquery...
Traces:
What would you like to do?
1 <- Start a trace.
2 <- Resume a trace.
3 <- Delete a trace.
0 <- Done.
Choice: 1
Current size of input: 0
Input specification.
1 <- Add unconstrained symbolic variable.
2 <- Add constrained symbolic variable.
3 <- Add concrete value.
0 <- Done.
Choice: 0
What do?
1 <- Step
2 <- Dump stdin
3 <- Dump stdout
4 <- Dump stderr
5 <- Concretize register
6 <- Symbolize register
7 <- Print constraints
0 <- Done.
Choice:

 

So we initially have an option of what to load (headerquery being the obvious choice). Next, we have an option to start a trace, resume, or delete. To be honest, I didn't bother using that management feature since you can connect again and run from the beginning anyways, it was easier to just do that.

 

When setting up your trace for the first time, you're given the option of adding input. From a harness perspective, this boiled down to populating stdin for your program. You could either give your program concrete values (literal values), constrained symbolic values (these will be created as BVS symbolic variables, but constrained to the exact input), or fully symbolic variables.

 

Finally, once you're done setting up your stdin, you get to start tracing the program. One thing you will notice right away is that, say you give the input as 4 symbolic bytes, the harness will error out on you with "Assertion violation: This is a tracing interface, not a general symbolic exploration client!!!". Looking at the source, what is happening is that pitass.py is explicitly NOT allowing state splitting. Normally, when you have a jump instruction that could go either way, angr will actually take both paths, splitting the state. However, for this challenge, the authors have explicitly checked for this and will not allow you to do anything that would cause this to happen. This becomes a big pitass...

 

From a running perspective, you can step, which from angr's definition of step will execute one angr defined code block. In theory (and in these elf examples) the block size is well defined. In practice, an angr 'step' might not place you quite where you expect. Dump stdin/out/error is self explanatory. The next two give us the ability to control execution part way through the program.

  • Concretize register -- This actually reads in the current value (symbolic or otherwise) from the register specified, passes it to the solver, and returns a single possibility for that value. I don't believe that one value is deterministic (maybe wrong on that), because it actually gets passed to Z3 (or can be passed to z3) if the register contains symbolic expressions, and z3 isn't guaranteeing the same result back each time. The now concrete value then gets stored back in the register.
  • Symbolize register -- This one confused me initially until i read the pitass.py code more thoroughly. What is actually happening here is that angr is reading in the current value from the register, creating a new variable for that register, storing that new variable in the register, and finally adding a constraint to the solver to say what that variable is. The net result is effectively the same as changing the variable to a constrained symbolic expression. Constrained being the key.

 

Finally, the print constraints simply dumps out the angr solver's constraints.


Headerquery

Headerquery turned out to be pretty simple. Here's the disassembly in a nutshell

 

HeaderQuery

 

The first block just opens the flag and reads it into a buffer. It then reads in 4 bytes from the user into a separate buffer. Immediately after, it compares the user's input against 0xff, with the size of input and comparison indicating that we're actually feeding in an integer. It then loads up the char from the array, checks if the integer is greater than two, and below (not in picture) either errors out if the index is greater than two or prints out your character.

 

Presumably, this means that we should only be able to leak out index 0,1 and 2. However, after some playing with the binary, it occurred to me that at any point in execution I could effectively leak the register value by simply making that register symbolic (due to how that function works as described above). With that in mind, i could leak any character i wanted out by simply setting the index, stepping until i got to the "checking input" puts call, making the eax register symbolic, then checking my constraints list. This is what I did, iterating through the indexes until I got the flag out.

 

win.py

 

OOO{memory_objects_get_you_every_time}