Defenit CTF 2020 - HashCungDol WriteUp

By 4cad @ TheAdditionalPayphones, 2020-06-11

This was one of the trickier Crypto challenges of the CTF. Given a custom hash function, you are given 10 hashes one at a time. For each of these hashes, you need to find 5 collisions. If you can find at least 5 collisions for all 10 of the hashes then you will be given the flag.

The actual code of the challenge can be found here, but below is just the hash function:

In [1]:
from Crypto.Random.random import getrandbits
from base64 import b64decode, b64encode
from Crypto.Util.number import long_to_bytes, bytes_to_long
flag = 'flag{1234}'

BITS = 220

def shift(n,i):
    n1=(n<<i)%0x800
    n2=(n>>(11-i))
    return n1^n2


def PIE(i, A, B, C):
    if i<16:
        return (A&B)|((~A)&C)
    return (A&B)|(B&C)|(C&A)

def pie(j):
    if(j<16): return j
    elif(j==16): return 0
    elif(j==17): return 4
    elif(j==18): return 8
    elif(j==19): return 12
    elif(j==20): return 1
    elif(j==21): return 5
    elif(j==22): return 9
    elif(j==23): return 13
    elif(j==24): return 2
    elif(j==25): return 6
    elif(j==26): return 10
    elif(j==27): return 14
    elif(j==28): return 3
    elif(j==29): return 7
    elif(j==30): return 11
    else: return 15

def HashGenerate(value):
    n = 3
    s = [3]*16 + [7]*16
    Q = [0]*36
    m = [0]*16
    X = []
    for i in range(0,20):
        X.insert(0, value&0x7ff)
        value >>=11
    
    print('X =',X)
        
    for i in range(-3, 1, 1):
        Q[n+i]=X[i+3] # Q[0:4] = X[0:4]
    for i in range(0,16):
        m[i]=X[i+4] # m = X[4:20]
    for i in range(0, 32):
        Q[n+i+1]=shift(( (Q[n+i-3] + PIE(i,Q[n+i],Q[n+i-1],Q[n+i-2]) + m[pie(i)])%0x800 ), s[i]) 
 
    Y = []
    for i in range(0,4):
        Y.append((Q[n+i-3] + Q[32+i])%0x800)

    Y[1]=(Y[0]<<33)^(Y[1]<<22)
    Y[0]=Y[2]<<11
    Y[1]^=Y[0]^Y[3]

    return Y[1]

I am only going to give a summary of the attack here. I was writing it in C++ because I thought I was going to need some extra speed, but in the end it worked well enough that I solved the challenge before the performance benefits of C++ became necessary, so my attack consists of the C++ hash collision finder Attack.cpp and the python coordinator.py.

Analysis of the Hash Function

The hash function takes as input up to 20 blocks of 11 bits each, and returns a 44 bit number. The hash function has two parts, the first part starts with the first 4 blocks of input as a seed and then computes 32 rounds of some operation. Each round outputs an 11 bit block which is consumed by the following 4 rounds, and the output of the final 4 rounds is added to the 4 seed blocks in a block-wise fashion to produce the final hash result.

So while the first 4 blocks of input are used as a seed, the remaining 16 blocks are used by the main rounds of the hash function to stir things up. In the code, this is represented by the m array such that m[i] = input[4+i]. The 32 rounds are split into two halves which each use every element from m exactly once, albeit in a different order.

The most important properties for finding collisions are the facts that:

  • If the last four rounds have a result of 0, then the hash result is simply the first 4 blocks of input.
  • If the previous 3 rounds have a result of zero, then the result of the current round is the result of the 4th last round added to an input value
  • The elements of `m` can be chosen to cancel out the value of 4 consecutive rounds to 0, which results in 3 of the last 4 rounds having a result of 0.

There is a lot of nuance and detail, but to understand it in more depth you would have to read the code yourself and do some thinking.

The Collision Finding Attack

Again, for details take a look at Attack.cpp as I am only going to give a summary here. The attack is probabilistic in nature, and there are two additional modifications on top of the base attack that were needed to get the probability high enough to pass the challenge.

Base Attack

The base attack yields around 4 collisions on average per target hash, and involves constructing an input that uses the zeroing trick mentioned earlier to give the attacker control over 3 of the 4 blocks of the resulting hash. It just so happens that one block has an effectively random value, and we have exactly one block (11 bits) of input that we can change without losing control over the other 3 blocks. So we brute force this block of input, since we have $2^{11}$ chances to get the correct block which has a probability of $\frac{1}{2^{11}}$.

This gives us 1 expected collision, and because we can repeat this attack for each of the 4 output blocks of the hash that turns into 4 expected collisions.

Variant 1 - Slightly More Brute Forcing

It just so happens that a minor variant of the base attack can be used in which we can only control the first 2 out of the 4 eleven bit output blocks. Luckily this variant also gives us $2^{22}$ chances to get the remaining $\frac{1}{2^{22}}$ probability block right.

Thus on the surface this gives us an expected 5 collisions per hash. But that is not good enough, since we need to get at least 5 collisions 10 times in a row.

Variant 2 - Magic Jitter

The final variant which was enough to get the solve had to do with what happens when the last block of input as exactly one bit set, which was usually 0 in most of the attacks mentioned above. Essentially, when most of the m values are 0 parts of the hash reduce to a series of rotations and bitwise operations (see the PIE function). By setting exactly one of the bits in this specific block of input, the bit will be rotated a few times but as long as it always corresponds to a 0 bit in the other intermediate value it will end up winding its way through the last 16 rounds of the hash without impacting anything other than the final round of the hash function.

Also, we know the impact it will have on the final round - essentially it gets rotated by a known amount and added into the result. So for each of the 11 bits in the 20th block of input, we try to get another collision by taking the inputs from the base attack, adding this jitter and accounting for the impact it will have on the last block of the hash.

I am not going to pretend to know how this impacts the expected number of collisions, but it did result in real collisions often enough that I was able to solve the challenge and get the flag in 3-5 attempts.

Defenit{St0p_the_c0llision_n0w}