INTRODUCTION
============
According to QuickTime's specification, the sample description atom
(STSD) stores information that allows QuickTime to decode samples in
the media.
It has the following structure:
0 DWORD Size
4 DWORD Type
8 BYTE Version
9 BYTE[3] Flags
12 DWORD Number of entries
16 DWORD Sample description table
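As a rough illustration of the layout above, the fixed part of the atom can be decoded as follows. This is only a sketch: parse_stsd_header is a hypothetical helper, not part of QuickTime, and it relies on QuickTime atoms storing their fields in big-endian byte order.

```python
import struct

# Fixed STSD header: Size, Type, Version, Flags, Number of entries.
STSD_HEADER = struct.Struct(">I4sB3sI")

def parse_stsd_header(atom: bytes) -> dict:
    """Decode the fixed header that precedes the sample description table."""
    if len(atom) < STSD_HEADER.size:
        raise ValueError("atom is shorter than the fixed STSD header")
    size, atom_type, version, flags, entry_count = STSD_HEADER.unpack_from(atom)
    return {
        "size": size,
        "type": atom_type,
        "version": version,
        "flags": flags,
        "entry_count": entry_count,
        # The sample description table starts right after the header.
        "table_offset": STSD_HEADER.size,
    }
```

The returned table_offset is where the first sample description entry would be read from.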
The structure of each entry in the sample description table varies by
media type; however, the first four fields are the same for all
media types:
0 DWORD Sample description size
4 DWORD Data format
8 BYTE[6] Reserved
14 WORD Data reference index
These four fields may be followed by additional data specific to the
media type and data format.
For video media, the general sample description format is extended by
the following structure:
16 WORD Version
18 WORD Revision level
20 DWORD Vendor
24 DWORD Temporal quality
28 DWORD Spatial quality
32 WORD Width
34 WORD Height
36 DWORD Horizontal resolution
40 DWORD Vertical resolution
44 DWORD Data size
48 WORD Frame count
50 BYTE[32] Compressor name
82 WORD Depth
84 WORD Color table ID
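The two tables above can be combined into a single record layout. The sketch below is a hypothetical helper (parse_video_entry and its key names are ours) that unpacks one video sample description entry, using the field order and sizes listed above and assuming big-endian storage and tightly packed fields. It is mainly useful for inspecting the Width field of a sample file.

```python
import struct

# General header (4 fields) followed by the video extension (14 fields).
VIDEO_ENTRY = struct.Struct(">I4s6sH" "HH4sIIHHIIIH32sHH")

FIELD_NAMES = (
    "size", "data_format", "reserved", "data_ref_index",
    "version", "revision", "vendor", "temporal_quality",
    "spatial_quality", "width", "height", "h_res", "v_res",
    "data_size", "frame_count", "compressor_name", "depth",
    "color_table_id",
)

def parse_video_entry(entry: bytes) -> dict:
    """Unpack one video sample description entry into named fields."""
    if len(entry) < VIDEO_ENTRY.size:
        raise ValueError("entry is shorter than a video sample description")
    return dict(zip(FIELD_NAMES, VIDEO_ENTRY.unpack_from(entry)))
```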
VULNERABILITY DETAILS
=====================
When the data format field (offset 4 of the sample description entry)
is 'RVZA' (Apple Video), it is possible to trigger a sign extension
vulnerability that leads to a buffer underflow.
The faulty sign-extending move is:
MOVSX ECX,WORD PTR SS:[ESP+4C]
[ESP+4C] contains a user-controlled value equal to
"((width + (4 - width % 4)) * 4) & 0xFFFF", where 'width' is the Width
field of the RVZA sample description entry.
With width = 0x5FFD, for instance, the expression evaluates to 0x8000.
Any width for which [ESP+4C] >= 0x8000 triggers the problem:
sign-extending such a value sets the most significant word to 0xFFFF
(so 0x8000 is sign-extended to 0xFFFF8000), which is a very large
number when treated as unsigned.
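The arithmetic can be checked with a few lines of Python. This is a sketch only: stride_word and movsx_word are hypothetical names, and the expression is read with balanced parentheses, which is our interpretation of the formula quoted above.

```python
def stride_word(width: int) -> int:
    """16-bit value stored at [ESP+4C], per the expression quoted above."""
    return ((width + (4 - width % 4)) * 4) & 0xFFFF

def movsx_word(value: int) -> int:
    """Emulate MOVSX r32, r/m16: sign-extend a 16-bit value to 32 bits."""
    value &= 0xFFFF
    if value & 0x8000:
        # Bit 15 is set, so the upper word is filled with ones.
        return value | 0xFFFF0000
    return value

def as_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value

width = 0x5FFD
word = stride_word(width)
ecx = movsx_word(word)
print(hex(word), hex(ecx), as_signed32(ecx))
```

Read as a signed quantity, the sign-extended value is negative, which is why adding it to a pointer moves the pointer backwards.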
Deeper in the code, the user-controlled sign-extended value is
treated as the size of a structure.
A vector of such structures is then walked:
[1] At each iteration the base pointer is incremented by the
user-controlled sign-extended value. Because adding a value of the
form 0xFFFFxxxx wraps around 32 bits, the pointer effectively moves
backwards, so it can be forced to reference memory regions below the
vector's VA:
ADD EAX,EDX ; EAX = vector, EDX = sign extended value
[2] At each iteration values are written to an element in the vector
(a single structure) which is referenced by the incremented pointer.
This means that it is possible to write to memory regions below the
buffer's VA.
MOV DWORD PTR DS:[EAX],EBX
MOV DWORD PTR DS:[EAX+4],EBX
MOV DWORD PTR DS:[EAX+4],EBX
MOV DWORD PTR DS:[EAX],EBX
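The effect of steps [1] and [2] can be simulated as below. This is a sketch under stated assumptions: walk_addresses is a hypothetical helper, the base address is made up, and the order of the write and the ADD inside the loop is assumed rather than taken from the disassembly.

```python
MASK32 = 0xFFFFFFFF

def walk_addresses(base: int, stride: int, iterations: int) -> list:
    """Return the addresses written to while walking the vector."""
    addresses = []
    eax = base
    for _ in range(iterations):
        # MOV DWORD PTR DS:[EAX],EBX would write here.
        addresses.append(eax)
        # ADD EAX,EDX wraps modulo 2**32, like a 32-bit register.
        eax = (eax + stride) & MASK32
    return addresses

# Hypothetical vector address, with the sign-extended stride 0xFFFF8000.
for address in walk_addresses(0x00150000, 0xFFFF8000, 3):
    print(hex(address))
```

Each successive address is 0x8000 bytes lower than the previous one, so every write after the first lands below the start of the vector.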
IMPACT
======
By writing to memory regions below the buffer's VA, an attacker may
overwrite crucial data such as function pointers, flags, heap
structures and so forth. Doing so may allow the attacker to alter the
normal control flow of the application and execute arbitrary code.
A simple attack vector would be to lure the victim to browse to a web
site controlled by the attacker, which serves a malicious QuickTime
file that exploits this vulnerability.
TEST ENVIRONMENT
================
Windows XP Service Pack 3
QuickTime 7.6 (472)
REMEDIATION
===========
A new version of QuickTime (7.6.2) has been released in order to
address this issue.
IDENTIFIERS
===========
1. CVE-ID: CVE-2009-0955
2. BID: 35166
REFERENCES
==========
1. Apple's advisory: http://support.apple.com/kb/HT3591
2. The original blog post:
http://roeehay.blogspot.com/2009/06/apple-quicktime-image-description-atom.html