Write-Up: Foxy Fox from Newbie CTF 2019

Challenge

Look at that cute fox over there!!!

http://www.talesshop.com/?page=product&query=fox

Please Answer My Question!

Executing the program only outputs this rather useless message:

Welcome To Kornewbie CTF, CTF Challengers!
Before You Solve This Problem, Please Watch This Video.
Video Link : https://www.youtube.com/watch?v=jofNR_WkoCE
Please Answer My Question, What Does Fox Say?

Binary analysis

The binary contains multiple code paths in the main function which are selected at random. However, only one of those code paths is relevant, because it’s the only code path which checks for a flag and prints the KorNewbie{%s} string:

/* somewhere in main */
buf[0] = 0;
*(_DWORD *)&buf[1] = 0;
*(_DWORD *)&buf[5] = 0;
*(_DWORD *)&buf[9] = 0;
*(_DWORD *)&buf[13] = 0;
scanf("%s", buf);
if(strlen(buf) == 16 && compute_hash(buf, 0x7C35D9A3) == 0xF92AC34) {
	if(buf[3] == '_' && buf[7] == '_' && buf[12] == '_') {
		if(buf[0] >= 'A'
				&& buf[0] <= 'Z'
				&& buf[1] >= 'A'
				&& buf[1] <= 'Z'
				&& buf[2] >= 'A'
				&& buf[2] <= 'Z'
				&& buf[4] >= 'A'
				&& buf[4] <= 'Z'
				&& buf[5] >= 'A'
				&& buf[5] <= 'Z'
				&& buf[6] >= 'A'
				&& buf[6] <= 'Z'
				&& buf[8] >= 'A'
				&& buf[8] <= 'Z'
				&& buf[9] >= 'A'
				&& buf[9] <= 'Z'
				&& buf[10] >= 'A'
				&& buf[10] <= 'Z'
				&& buf[11] >= 'A'
				&& buf[11] <= 'Z'
				&& buf[13] >= 'A'
				&& buf[13] <= 'Z'
				&& buf[14] >= 'A'
				&& buf[14] <= 'Z'
				&& buf[15] >= 'A'
				&& buf[15] <= 'Z') {
			printf("Good! Check The Flag!\n");
			printf("KorNewbie{%s}\n", buf);
		}
	}
}

There is an input for a single word (scanf). Afterwards, the length is checked (16 characters) and a hash is computed. If the hash matches the expected hash, the structure of the string is checked:

only uppercase letters
_ at positions 3, 7, and 12

Therefore, the string we are looking for is supposed to have this structure:

AAA_AAA_AAAA_AAA

The hash function itself looks like this:

int compute_hash(const char *str, int magic)
{
	unsigned int i;
	int hash = 0;

	for(i = 0; i < strlen(str); i++)
		hash = magic ^ ((str[i] | 0x20) + __ROR4__(hash, 8));
	return crt_check(1, hash);
}

(The crt_check function in the hash function just performs some runtime checks and then returns the second argument, this function is not interesting for us.)

This function computes a hash using the magic constant. Interestingly enough, this hash function was taken from a real world malware sample. Unfortunately the published analysis of that malware sample does not contain a useful analysis of the hash function itself.

Basic Idea

The only way to solve this challenge is by breaking the hash function. A simple brute force attack is way too slow and even with a dictionary the search space is still too big. The solution is to use SMT with a powerful SMT solver, which deals with all the complexity for us. We only have to encode the problem in SMT.

Encode the problem in SMT

Encoding the problem in SMT is rather simple: just unroll the loop of the hash algorithm and add the constraints on the input from the main function. We use the theory of bit vectors to encode our 32bit variables and operations.

Definitions

First define constants for the lower and upper limit of the input:

(declare-const lower (_ BitVec 32))
(assert (= lower #x00000041))	; A
(declare-const upper (_ BitVec 32))
(assert (= upper #x0000005A))	; Z

We also need the magic constant from the main function as well as the constant that is or’d to the input:

(declare-const magic (_ BitVec 32))
(assert (= magic #x7C35D9A3))

(declare-const cons (_ BitVec 32))
(assert (= cons #x00000020))

Loop Unrolling: First Iteration

Now we need to translate the hash function itself. In SMT everything has to be unrolled, and variables from the original program have to be in SSA form. This means we need one set of variables for every iteration of the hash function.

Let’s start with the first iteration. The hash variable is initialized with 0. We call this out0:

(declare-const out0 (_ BitVec 32))
(assert (= out0 #x00000000))

Now apply the constraints from the main function on the first character of the input (only upper case letters):

(declare-const i0 (_ BitVec 32))
(assert (and (bvule i0 upper) (bvuge i0 lower)))

Then, we define a temporary variable t0 = str[0] | 0x20:

(declare-const t0 (_ BitVec 32))
(assert (= t0 (bvor i0 cons)))

This variable is XOR’d with the magic constant, which results in the hash value after the first iteration. Since the previous hash value is 0, we can ignore the ROT part here:

(declare-const out1 (_ BitVec 32))
(assert (= out1 (bvxor t0 magic)))

Loop Unrolling: Second Iteration

Now we have to unroll the second iteration. Again, we start with the constraints on the input:

(declare-const i1 (_ BitVec 32))
(assert (and (bvule i1 upper) (bvuge i1 lower)))

Then, we define the temporary variable it1 = str[1] | 0x20:

(declare-const it1 (_ BitVec 32))
(assert (= it1 (bvor i1 cons)))

This time we have to add the rotated old hash value:

(declare-const rot1 (_ BitVec 32))
(assert (= rot1 ((_ rotate_right 8) out1)))
(declare-const prexor1 (_ BitVec 32))
(assert (= prexor1 (bvadd it1 rot1)))

The XOR with the magic constants results in the hash value after the second iteration:

(declare-const out2 (_ BitVec 32))
(assert (= out2 (bvxor prexor1 magic)))

Loop Unrolling: Third .. Sixteenth Iteration

Unrolling the remaining loop iterations works exactly in the same way. However, there is one special case with the checks for _ characters. At those positions, the input is asserted to be _ instead of the constraint for upper case letters:

(declare-const i3 (_ BitVec 32))
(assert (= i3 #x0000005f)) ; _

Check the Hash

After unrolling all loop iterations, the final hash value out16 is checked by comparing it to our expected hash:

(assert (= out16 #x0F92AC34))

Compute the Flag

A solution can now be computed by calling an SMT solver:

boolector -m -x fox.smt2

The SMT solver now checks if there is a “satisfying assignment”, which means it tries to find an input which produces the hash we’re interested in. It then outputs the variable assignment, and we get this result:

KorNewbie{RRV_NQN_UAAO_KLE}

Unfortunately, we cannot submit this flag because the online system tells us it’s “incorrect”.

Guessing the Flag

So … we computed a valid flag, but we can’t submit it because the system tells us it’s invalid. This is where the real fun starts. We now have to guess the flag.

We can choose letters and e.g. specify that the word FOX has to appear. Unfortunately we can still get lots of “incorrect” flags, even if we force words. After a few hours of playing this “guessing game”, I guessed that the flag might start with THE_FOX_SAYS_, since the program asks us What Does Fox Say?.

Encoding such a constraint is rather simple, just force the input to specific values:

; guess the first few words
(assert (and (= i0 #x00000054) (= i1 #x00000048) (= i2 #x00000045))) ; THE
(assert (and (= i4 #x00000046) (= i5 #x0000004F) (= i6 #x00000058))) ; FOX
(assert (and (= i8 #x00000053) (= i9 #x00000041) (= i10 #x00000059) (= i11 #x00000053))) ; SAYS

Now, with this guess we finally get our flag that can be submitted:

KorNewbie{THE_FOX_SAYS_SUA}

Final SMT file

Unique Solution

Challenges without unique solutions suck. After we already guessed the solution, the challenge was “fixed” with the following change:

ref[0] = 0;
ref[1] = 0x1C;
ref[2] = 0x11;
ref[3] = 0xB;
ref[4] = 0x12;
ref[5] = 0x1B;
ref[6] = 0xC;
ref[7] = 0xB;
ref[8] = 7;
ref[9] = 0x15;
ref[10] = 0xD;
ref[11] = 7;
ref[12] = 0xB;
ref[13] = 7;
ref[14] = 1;
ref[15] = 0x15;

if(...) { /* this is the long if from before */
	ok = 1;
	for(i = 0; i < strlen(buf); i++) {
		if((buf[i] ^ buf[0]) != ref[i])
			ok = 0;
	}
	if(ok == 1) {
		printf("Good! Check The Flag!\n");
		printf("KorNewbie{%s}\n", buf);
	}
}

This change is supposed to make the solution unique.

In SMT, this contstraint looks like this:

(assert (= (bvxor i0 i0) #x00000000))
(assert (= (bvxor i1 i0) #x0000001C))
(assert (= (bvxor i2 i0) #x00000011))
(assert (= (bvxor i3 i0) #x0000000B))
(assert (= (bvxor i4 i0) #x00000012))
(assert (= (bvxor i5 i0) #x0000001B))
(assert (= (bvxor i6 i0) #x0000000C))
(assert (= (bvxor i7 i0) #x0000000B))
(assert (= (bvxor i8 i0) #x00000007))
(assert (= (bvxor i9 i0) #x00000015))
(assert (= (bvxor i10 i0) #x0000000D))
(assert (= (bvxor i11 i0) #x00000007))
(assert (= (bvxor i12 i0) #x0000000B))
(assert (= (bvxor i13 i0) #x00000007))
(assert (= (bvxor i14 i0) #x00000001))
(assert (= (bvxor i15 i0) #x00000015))

If we ask the SMT solver for a model, we directly get the “correct” flag.

To convince ourselves that the solution is now really unique, we can check if there is another solution by “blacklisting” the previous solution:

; blacklist 'THE_FOX_SAYS_SUA'
(assert (not (and
	(= i0 #x00000054)
	(= i1 #x00000048)
	(= i2 #x00000045)
	(= i3 #x0000005f)
	(= i4 #x00000046)
	(= i5 #x0000004f)
	(= i6 #x00000058)
	(= i7 #x0000005f)
	(= i8 #x00000053)
	(= i9 #x00000041)
	(= i10 #x00000059)
	(= i11 #x00000053)
	(= i12 #x0000005f)
	(= i13 #x00000053)
	(= i14 #x00000055)
	(= i15 #x00000041)
)))

Now the SMT solver tells us unsat, which means there is indeed only one solution.

Useful Links

boolector: efficient SMT solver
smtlib2: description of the smtlib2 language