The Chabaud-Joux Attack Against SHA-0
Posted on June 30, 2008
Filed Under Computer Science |
Look [here ] if you want to learn about cryptographic hash functions, their properties, or their applications.
History of SHA-0
In 1993, SHA-0 descended from heaven. The NSA published SHA-0 in FIPS-180, which contained the algorithm with no explanation of its security. Concerns were thrown aside as the algorithm became widely adopted. The stated collision resistance [the resistance most pertinent to the attack listed below] was the theoretical maximum of 2^80.
In 1995, the NiST released an improvement on the algorithm in FIPS 180-1 . The new algorithm (called SHA-1) had just an extra shift operation in the message expansion. Other than this detail, the new algorithm was identical to SHA-0. According to the NSA, this corrected a weakness in the message expansion.
In 1998, researchers Chabard and Joux released an elementary attack on SHA-0, described below. It exploited a fundamental weakness in the simple round structure of SHA-1, combined with an exceptionally simple message expansion. Their attack employs a search algorithm that requires only examining 2^61 messages instead of 2^80 messages to find a collision.
This was only a theoretical break in SHA-0: the collision resistance was still improbably high (though breakable through a distributed computing project), and the only collisions produced were gibberish.
More than one weakness in SHA-0 was found in the meantime! Biham and Chen discovered another route to finding collisions and near-collisions, and Chinese researchers Yang, Yu, and Yin have effectively pwnt [to use a technical term] SHA-0. In their most recent attack, they found a search space that requires examining only 2^39 different messages. This is possible on a beefy desktop computer. They have only released the attack vector and a high-level overview of how they arrived at the vector. They have not released the details for how they developed their attack in an English language journal.
The Biham and Chen attack is actually cooler than the Chibaud-Joux attack, but much more complicated. If this essay is well-received, I will write a post on that.
Description of SHA-0
SHA-0 is a block hash with an input of 512 bits. This means that any input message is broken into 512-bit blocks. The output from processing the first block is the input for processing the second block, and so on and so forth. The output of the final processing step is your hash value.
Writing the description is far less clear than just reading the pseudocode, so I urge you to read the pseudocode listed on Wikipedia or a reference implementation I wrote in Python [note: Python is not my primary language, and I wrote it so that a Math major with no coding experience could understand it, so I make no guarantees about it.]
The overall structure of SHA is as follows:
1) Message padding: Because SHA-0 operates on blocks, we need to make sure that we can evenly break the message into 512-bit blocks. The length of the message must also be appended on the end of the message. If the message + the length is not 0 (mod 512), then in between the message and the appended length, the string "100…..000" is added to make the message the right length.
2) Message Expansion: The message is then divided into 512-bit blocks. Each 512-bit block can be broken down into 16 32-bit words. However, just plugging these 16 words into the round functions (described below) would give attackers too much control over the internal state of the round function, the 16 words are expanded into 80 words using the recurrence W[i] = W[i-3] ^ W[i-8] ^ W[i-14] ^ W[i-16] for 17<i<80. Note that the first 16 words of the expansion are the 16 words from the block.
3) 80 rounds : The input of the hash function, in addition to the message, is also a 5 word magic constant. These 5 words are transformed using the results of the message expansion, and then added to the original magic word to produce the hash.
After that, the resulting 5 registers are added with the initial state of the 5 registers. For the first block, this means the result is added to the "magic number".
Two messages side-by-side
If we take two messages, m1 and m2, and let m1=m2, obviously SHA-0(m1) = SHA-0(m2). Cryptographic hash functions MUST be deterministic by definition.
The ultimate goal is to find two messages that hash to the same value: Hash(some_message) = Hash(other_message). This, by itself, is overwhelming. So Chabaud and Joux decided to cheat: They would find collisions for successively stronger versions of SHA-0 until they found a collision attack against the full SHA-0.
Local Collisions
Chabaud and Joux needed better tools to attack SHA-0, so they decided to find necessary conditions for finding collisions for just a few rounds of SHA-0. They decided to see what happens if they flip a single bit in a round, and ignore the message expansion. So they are allowed to put whatever they want into each round of the hash function by the rules of this simplified attack.
Let’s go back to identical messages m1 and m2. Let’s alter m2 a little: we will flip a single bit. If we make a change to a bit in some round r , the register A will be the only register affected in round r . In round r+1 , the registers A and B will be affected. So on and so forth, so it will take until round r+4 for the bit difference in m2 to propagate all the way to the last register, E .
Chabaud and Joux decided that they wanted to find the minimum amount of rounds necessary to createa local collision from r . It turns out that if you have full control over the input to the hash function, you can create a local collision at round r+5 : The hash at round r+5 of m1 and m2 is the same even though m1 and m2 are different functions. This is called a 6-round local collision (Rounds: r, r+1, r+2, r+3, r+4, r+5 ).
How?
Let’s look at round r+1 , the round after the first bit-change occurred. The changes that happen to the A register are defined as:
A = (A <<< 5) + f (B, C, D) + E + round_constant + input[r+1 ]
where f is one of the 4 round functions defined in SHA-0. We have full control over the term "input[r+1]".
and the changes that happen to the B register are:
B = A
We can’t correct the difference in B that propagates from A (because we have no additive term in this equation), but we CAN correct the difference that occurs in A. After all, we know what bit we changed, so getting A to match up in m1 and m2 is easy: we just have to do the math.
Chabaud and Joux define specific masks for r+1, r+2, r+3, r+4, and r+5 , and voila! A local collision.
Why does this work?
This works because of a special fact about the A register: it is the only register that is directly modified by the message, AND it is the only register whose value is produced by the values of other registers. Since the other registers aren’t sharing bit differences, we only have to fight the changes in the A register in order to succeed.
Problems!
If it were really this easy to find a local collision, SHA-0 would have been completely unusable. Fortunately for its strength, there are a few complicating factors:
Message expansion : The message expansion produces an additional constraint: this produces the need to chain together a bunch of local collisions inside of the 80 rounds in order to come up with one full collision. The constraints of the message expansion need to be satisfied by your chosen starting points for all of the local collisions (as your bit differences that you introduce will be scattered by the message expansion in the exact same way).
Round-specific functions : To make matters worse, there are round-specific functions that change every 20 rounds. Each of the functions are functions of the registers B , C , and D , and can therefore be viewed as functions of past values of A . Two of the functions are non-linear : they can’t be modeled by an XOR, and they have their own specific constraints for when a local collision can be produced. The round functions prevent the local collision from being automatic: depending on the round we start the collision, we need to search for which bits we need to flip. The Chabaud-Joux attack on SHA-0 has a search space of 2^61 for bits that need to be flipped in order to find the correct hash.
The weak message expansion
However, there is still an extra foothold that Chabaud and Joux were able to use: the message expansion of SHA-0 is just an XOR. When they were satisfying the constraints of the bits they were flipping, they were able to keep flipping the same bit in each message word instead of needing to come up with complicated arrangements of bits that needed to be flipped. The extra rotation function added to SHA-1 caused this particular attack to be impractical: the search space for bits to be flipped rose dramatically, and new approaches had to be developed.
Popularity: 15% [?]
Comments
Leave a Reply
