Introduction
------------
The ZIP format is one of the most widely used compresion/archival
programs on computers systems, its use is even more extended on Windows
plataform, with WinZIP program.
Known Attacks
-------------
The PKZIP encryption scheme have been proved to be weak in a lot of
papers that are listed at the end of this paper. We found an other way to
attack the encryption scheme using reversing enginiering in WinZIP IBDL32.dll.
A known problem is to get valid plain text in order to do an attack
using the "know plain text technic" [1]. Mikel Stay published at 2001 "ZIP
Attack with Reduced Known-Plaintext" [2], that improved the firt attack, but
the problem to get plaintext stil existed.
Getting plaintext is so dificult because plaintext is on compression
form and a minimum change on data would represent a great alteration, so if
we didn't know a full file content included on zip file we couldn't do
anything. We will discuse if deflating system is used the know plaintext
attack, turns to be fearly easy.
The encrytion scheme
--------------------
Status change functions:
void update_keys(char p) {
key0=crc32(key0, p);
key1=key1 + (key0 & 0xff);
key1=key1 * 134775813 + 1;
key2=crc32(key2, key1>>24);
}
char decrypt_byte(char b) {
unsigned short tmp;
tmp=key2 | 2;
return(tmp * (tmp^1)>>8);
}
To inizializate the keys:
..
key0=305419896;
key1=591751049;
key2=878082192;
for(i=0 ; i<strlen(pass) ; i++) {
update_keys(pass[i]);
}
..
To code a byte of data:
..
tmp=byte^decrypt_byte(byte);
update_keys(tmp);
..
Besides ,12 random bytes are prepended to the compressed data in order
to make plaintext attack more dificult. This bytes are very important to
initializate the state of the stream code. Because if we know the state of the
stream code on any time we can reverse until we get to the beginning.
The attack
----------
Most ZIP coders use rand function to generate this 12 random bytes, but
other zippers use own function, this is the case of WinZIP.
Attacking this encryption scheme using this 12 prepended random bytes
is not a new idea , [2] describes this kind of attack but a minimum of 5
files in a zip are needed in order to succeed.
If we reverse engineering WinZIP rand(m generation code, we find the
following code.
46c5a0 push ebp
46c5a1 mov ebp, esp
46c5a3 mov eax, [IBDL32.dll:Seed]
46c5a8 mov edx, eax
46c5aa add edx, edx
46c5ac add edx, eax
46c5ae shl edx, 2
46c5b1 add edx, eax
46c5b3 mov ecx, edx
46c5b5 shl ecx, 4
46c5b8 add ecx, eax
46c5ba mov edx, ecx
46c5bc shl edx, 8
46c5bf sub edx, eax
46c5c1 shl edx, 2
46c5c4 lea ecx, [eax+edx]
46c5c7 mov [IBDL32.dll:Seed], ecx
46c5cd mov eax, [IBDL32.dll:Seed]
46c5c2 sar eax, 10h
46c5c5 mov ecx, eax
46c5c7 and ch, 7fh
46c5ca movzx edx, cx
46c5cd mov eax, edx
46c5cf ret
It seems to be an ofuscated code, but if we analize this we can see
that C code may be like.
unsigned short rand()
{
seed=0x343fD * seed;
return ((seed >> 16)&0x7fff);
}
A normal rand implementation looks like:
unsigned short rand()
{
seed=0x343fD * seed + 0x269ec3;
return ((seed >> 16)&0x7fff);
}
Initialy seems that the person who wrote this code forgot to add
0x269ec3 to seed, but this can leed to a security problem, because the posible
secuences are reduced from 2^(12*8) to 2^(3*8). We will discuss the concrete
mathematical apects of rand on a future paper, now we will show how to
crack zip file using this miss.
Reducing the secuences makes more easy to do a bruteforce attack and
then gess the state of the stream coder (key0, key1, key2). We can return to
initial state using this known formules form pkcrack source code (stage2 line
175):
/* The equation from section 3.3 is used twice here:
* (1) key1_{n-1} + LSB(key0_n) = rhs = (key1_n - 1) * INVCONST
* and
* (2) key1_{n-2} + LSB(key0_{n-1}) = (key1_{n-1} - 1) * INVCONST
*
* At this point we know key1_n, MSB(key1_{n-1}) and MSB(key1_{n-2}).
*
* From (2) follows
* MSB(key1_{n-2}) = MSB((key1_{n-1} - 1) * INVCONST - LSB(key0_{n-1}))
* Inserting (1) yields
* MSB(key1_{n-2}) = MSB((rhs - 1)*INVCONST -
* LSB(key0_n)*INVCONST - LSB(key0_{n-1}))
* which means that either
* (a) MSB(key1_{n-2}) = MSB((rhs - 1)*INVCONST) -
* MSB(LSB(key0_n)*INVCONST - LSB(key0_{n-1}))
* or
* (b) MSB(key1_{n-2}) = MSB((rhs - 1)*INVCONST) -
* MSB(LSB(key0_n)*INVCONST - LSB(key0_{n-1})) - 1
*
* It can easily be verified that for any two bytes b1, b2:
* MSB( b1*INVCONST + b2 ) = MSB( b1*INVCONST )
* (simple exhaustive test on 2^16 combinations)
*
* We have computed diff = MSB((rhs - 1)*INVCONST) - MSB(key1_{n-2}).
* Now all we have to do is find values for key0_n so that
* (following from (1))
* MSB(key1_{n-1}) = MSB(rhs-LSB(key0_n))
* and (following from (a) and (b)) either
* diff = MSB(LSB(key0_n)*INVCONST)
* or
* diff = MSB(LSB(key0_n)*INVCONST) + 1
*
* Candidate values are selected using the precomputed lookup table mTab2.
*/
Proof of concept
---------------
We have been able to exploit this weak random generation on ZIP files
with more that 3 files (12*3 => 36 bytes of known plaintext), but with less
than two hours on a Pentium 500Mhz with 128Mb.
Attacks to ZIPs with less than 3 files might be also posible because
in WinZIP the two CRC most important bytes are estored within the 12
"random" numbers.
tocrack.zip is a file created with WinZIP 8.0, the password is
"&THPOL101%ISLAME@|1" whis is a 19 length password , ( wich impossible to crack
if we bruteforce the password ) , but what we are bruteforcing is the 3 keys
that are generated with the password.
[rmn@mightsoft ~]$ zipinfo tocrack.zip
Archive: tocrack.zip 679 bytes 3 files
-rw-rw-rw- 2.0 fat 138 T- defX 8-Feb-03 01:31 file3.txt
-rw-rw-rw- 2.0 fat 163 T- defX 8-Feb-03 01:31 file2.txt
-rw-rw-rw- 2.0 fat 270 T- defX 8-Feb-03 01:31 file1.txt
3 files, 571 bytes uncompressed, 339 bytes compressed: 40.6%
[rmn@mightsoft ~]$ ./zipproof -p tocrack.zip
[*] generating posible secuences.. DONE.
[*] reducing number of posible Key2.. DONE.
[*] Bruteforcing:
[-] Key2 => 0x6a54f21e
[*] Generating initial keys:
[-] Key0 => 0xaca8571c
[-] Key1 => 0x439e8759
[-] Key2 => 0x508d8f22
[*] Tryng to get password.. Not found!
[E] Is not posible to find a password for these keys, you can use
findkeys tool (from pkcrack) to get it.
[rmn@mightsoft ~]$
It was not posible to recover the password because it was too large,
but with that keys is very easy to extract data files from ZIP (using pkcrack
zipdecrypt).
[rmn@mightsoft ~]$ ./zipdecrypt 0xaca8571c 0x439e8759 0x508d8f22 tocrack2.zip result.zip
Decrypting file1.txt (bb6c9531638e70d425ebe60b)... OK!
Decrypting file2.txt (0f57542dbf0e18517a5cee0b)... OK!
Decrypting file3.txt (2d2bc008f19607809298f90b)... OK!
Affected targets
----------------
This problem is in IBDL32.dll, that is used in some ZIP compresors like
the afore mentioned WinZIP.
Recomendations
--------------
Even if this problem is not present in your ZIP compressor the ZIP
encryption scheme is very weak and should not be used to encrypt sensitive
data. Use zip and then encrypt with your favorite encryptation program.
Credits
-------
Mike Stevens - <mikestevens (at) softhome (dot) net [email concealed]>
Elisa Flanders - <eflanders (at) springfield (dot) com [email concealed]>
Greetz
------
Eli Biham, Paul C. Kocher, Mikel Stay, for the papers.
Peter Conrad, for pkcrack.
Cocacola, for keeping us awake.
Bibliography
------------
[1] - "Known Plaintext Attack on the PKZIP Stream Cipher".
Eli Biham and Paul C. Kocher
[2] - "ZIP Atack with reduced Known-Plaintext".
Mikel Stay
[3] - PKZIP Specs: APPNOTE.txt.