AMD Bulldozer Linux ASLR weakness: Reducing entropy by 87.5%

2015.03.27
Risk: Medium
Local: Yes
Remote: No
CVE: N/A
CWE: N/A

A bug has been found in the Linux ASLR implementation that affects some AMD processors. The issue affects all Linux processes, even those that do not use shared libraries (statically compiled). The problem arises because some mmapped objects (VDSO, libraries, etc.) are poorly randomized in an attempt to avoid cache aliasing penalties on AMD Bulldozer (Family 15h) processors. On affected systems, the entropy of mmapped files is reduced by a factor of eight. After we found (and fixed) this weakness, we found a detailed white paper about the issue ("Shared Level 1 instruction cache performance on AMD family 15h CPUs"). Several workarounds were proposed, but none of them solves the problem the way our proposal does. Our solution (see below) is not a workaround: it effectively avoids cache conflicts and does not jeopardize ASLR entropy. The complexity of our proposed solution, both in lines of code and in timing overhead, is negligible; that is, as far as we know it has no trade-offs.

This vulnerability is similar to:

1) Linux ASLR mmap weakness: Reducing entropy by half, published on January 17, 2015.
2) CVE-2015-1593 - Linux ASLR integer overflow: Reducing stack entropy by four, published on January 7, 2015.
The following output is from a run on an AMD Opteron 62xx class CPU under x86_64 Linux 4.0.0:

$ for i in `seq 1 10`; do cat /proc/self/maps | grep "r-xp.*libc" ; done
7fbcdb388000-7fbcdb545000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7f4c88a18000-7f4c88bd5000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7f8d7e7b8000-7f8d7e975000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7f6c314d8000-7f6c31695000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7fccad6b0000-7fccad86d000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7f53bcc50000-7f53bce0d000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7f0c3c838000-7f0c3c9f5000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7ffecb3c8000-7ffecb585000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7f87d7500000-7f87d76bd000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6
7f1a725a0000-7f1a7275d000 r-xp 00000000 00:01 1387 /lib/x86_64-linux-gnu/libc.so.6

Grsecurity/PaX is also affected. To check it, we patched Linux kernel 3.14.27 with Grsecurity/PaX. Some of the kernel configuration options selected were:

- Security as highest priority (Security vs Performance)
- Server usage type
- Automatic configuration method

The following output is from a run on the same processor (Opteron 62xx class CPU) under an i386 Linux 3.14.27 patched with PaX 3.1.
The result is:

$ for i in `seq 1 10`; do cat /proc/self/maps | grep "r-xp.*libc" ; done
b7588000-b7736000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b7570000-b771e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b75d0000-b777e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b75b0000-b775e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b7578000-b7726000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b7598000-b7746000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b7528000-b76d6000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b75b0000-b775e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b7560000-b770e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6
b75d0000-b777e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6

As the previous outputs show, on both the non-patched kernel and the PaX-patched one, bits 12, 13 and 14 are always 0. To gain more confidence in this hypothesis, we can run additional tests:

Test 1

Non-PaX kernel:
$ for i in `seq 1 1000`; do cat /proc/self/maps |grep vvar |grep "[^08]000-"; done |wc -l
0

Grsecurity/PaX kernel:
$ for i in `seq 1 1000`; do cat /proc/self/maps |grep "r-xp.*libc" |grep "[^08]000-"; done |wc -l
0

In both cases the result is 0, which indicates that after 1000 executions bits 12, 13 and 14 are always 0.

Test 2

Non-PaX kernel:
$ for i in `seq 1 1000`; do cat /proc/self/maps |grep vvar |grep "[08]000-"; done |wc -l
1000

Grsecurity/PaX kernel:
$ for i in `seq 1 1000`; do cat /proc/self/maps |grep "r-xp.*libc" |grep "[08]000-"; done |wc -l
1000

At this point, we are fairly sure that the system is vulnerable.

Impact

The total entropy for the VVAR/VDSO, mmapped files and libraries of a process is reduced by a factor of eight. The number of possible locations where the mapped areas can be placed is reduced by 87.5%: only 5 random bits remain on i386.
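The arithmetic behind these figures can be sketched in a few lines. This is an illustrative calculation only; the mask value 0x7000 is the one the kernel uses on AMD Family 15h:

```python
# va_align.mask as set on AMD Family 15h CPUs: bits [14:12].
ALIGN_MASK = 0x7000

lost_bits = bin(ALIGN_MASK).count("1")   # 3 address bits forced to zero
factor = 2 ** lost_bits                  # 8x fewer possible locations
reduction = 1 - 1 / factor               # fraction of locations lost

# On i386, mmap ASLR provides 8 random bits; losing three leaves:
i386_positions = 2 ** (8 - lost_bits)

print(lost_bits, factor, reduction, i386_positions)  # 3 8 0.875 32
```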
On 32-bit systems, for example, the entropy for libraries is reduced from 2^8 to 2^5, which means that libraries can be loaded at only 32 different places. Under this scenario, the advanced techniques used by PaX to thwart brute-force attacks (for example, forcing a delay on process creation when a crash occurs) are no longer effective. Attackers need on average only 16 trials.

32-bit applications running on 64-bit systems (x86_32) are also affected. The entropy is approximately the same or a little higher, depending on the personality flags. In practice, these applications are at a similar risk to when they run on 32-bit systems (i386).

On 64-bit systems, the entropy for libraries is reduced from 2^28 to 2^25. The number of different places where the libraries can be mapped falls approximately from 268435456 to 33554432; the number of trials an attacker needs on average is reduced to about 16 million.

The number of potentially vulnerable users could be high because both standard and Grsecurity/PaX Linux kernels are affected.

The bug

The bug is caused by a hack that tries to improve performance by avoiding cache aliasing penalties on the Family 15h AMD Bulldozer processors. The hack affecting mmapped files is in the file arch/x86/kernel/sys_x86_64.c. There, the arch_get_unmapped_area_topdown() function is defined; it later calls vm_unmapped_area() with a pointer to a struct vm_unmapped_area_info as a parameter. This struct, among other fields, contains a mask used to align the returned address, which will be used as the base address to map the file. The following is a snippet of the affected code:

unsigned long
arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
                               const unsigned long len, const unsigned long pgoff,
                               const unsigned long flags)
{
        struct vm_area_struct *vma;
        struct mm_struct *mm = current->mm;
        unsigned long addr = addr0;
        ...
        info.flags = VM_UNMAPPED_AREA_TOPDOWN;
        info.length = len;
        info.low_limit = PAGE_SIZE;
        info.high_limit = mm->mmap_base;
        info.align_mask = filp ? get_align_mask() : 0;
        info.align_offset = pgoff << PAGE_SHIFT;
        addr = vm_unmapped_area(&info);
        ...
}

The info.align_mask value is computed by calling get_align_mask() only if filp is not NULL. In other words, if the process is mapping a file, then info.align_mask depends on the return value of get_align_mask(). This function is:

/*
 * Align a virtual address to avoid aliasing in the I$ on AMD F15h.
 */
static unsigned long get_align_mask(void)
{
        /* handle 32- and 64-bit case with a single conditional */
        if (va_align.flags < 0 || !(va_align.flags & (2 - mmap_is_ia32())))
                return 0;

        if (!(current->flags & PF_RANDOMIZE))
                return 0;

        return va_align.mask;
}

The code above returns an unsigned long mask which was initialized when the kernel booted and which depends on the processor family. When the processor is an AMD F15h, this value is 0x7000. Therefore, ASLR depends on the specific processor. Reading the commit log, it seems the kernel developers were aware of this issue (the reduction of ASLR entropy):

    This patch provides performance tuning for the "Bulldozer" CPU. With
    its shared instruction cache there is a chance of generating an
    excessive number of cache cross-invalidates when running specific
    workloads on the cores of a compute module. This excessive amount of
    cross-invalidations can be observed if cache lines backed by shared
    physical memory alias in bits [14:12] of their virtual addresses, as
    those bits are used for the index generation.

Maybe three fewer bits of entropy were not considered significant on 64-bit systems. Unfortunately, on 32-bit systems, where the ASLR implementation is limited to 8 random bits, the effect is disastrous.

Affected Systems

The first affected Linux kernel version was 3.0.0, released on August 5, 2011.
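The effect of the mask can be illustrated outside the kernel. The following Python sketch is a simplified model (assuming align_offset == 0, i.e. pgoff == 0) of how the alignment performed by vm_unmapped_area() clears bits [14:12] of every randomized candidate base:

```python
import random

VA_ALIGN_MASK = 0x7000  # bits [14:12], set at boot on AMD F15h

def align_base(candidate, mask=VA_ALIGN_MASK):
    """Simplified model of the alignment done by vm_unmapped_area():
    the candidate base is rounded down so the masked bits become zero
    (assuming align_offset == 0)."""
    return candidate & ~mask

# Ten hypothetical page-aligned, randomized mmap bases.
for _ in range(10):
    candidate = random.getrandbits(28) << 12   # 28 bits of entropy
    base = align_base(candidate)
    assert base & VA_ALIGN_MASK == 0           # bits 12-14 always zero
```

Whatever the candidate, the three masked bits are forced to zero, which is exactly why the grep tests above never match an address whose fourth hex digit from the end is anything but 0 or 8.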
The following table summarizes the affected systems as well as the objects that are poorly randomized.

Object                          32-bit apps.        64-bit apps.
Standard Linux kernel
  Libraries and files           2^8 -> 2^5          2^28 -> 2^25
  VVAR/VDSO                     2^8 -> 2^5          2^18 -> 2^15
Grsecurity/PaX Linux kernel
  Libraries and files           2^8 -> 2^5          2^29 -> 2^26
  VVAR/VDSO                     2^8                 2^29

Table: Effective entropy, non-affected -> affected systems.

Exploit

It is not necessary to do anything special to exploit this issue: every application launched on the system with ASLR enabled is affected by it. Any attempt to guess where the VVAR/VDSO, a library or any mmapped user file is located, whether by brute force or by trial-and-error attacks, requires eight times fewer attempts.

Workaround

Fortunately, the kernel provides a command line boot option to disable this behaviour. By setting align_va_addr=off, the alignment is disabled and the objects are loaded without losing entropy. If you have a 64-bit system and you consider 2^25 a fairly large number (which depends on your threats and attack vectors), then you may set align_va_addr=64, which forces the page alignment only when running 64-bit applications and leaves 32-bit applications unaffected.

Fix

It is possible to have both performance and security at once. Rather than clearing the conflicting bits [12..14], those bits can be set to a random value for all shared pages. At boot time, a random value in the range 0 to 7 is computed and stored. Then, bits [12..14] of every shared page are set to that value. Since all shared pages have the same value in bits [12..14], there are no cache aliasing problems (which are supposed to be the cause of the performance loss). On the other hand, since the value is not known to a potential remote attacker, ASLR preserves its effectiveness. This type of ASLR model is known as per-boot randomization, which is fairly close to the way ASLR is implemented in Windows(R) and Mac OS(R).
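The idea behind the fix can be sketched as follows. This is a simplified user-space model of the per-boot scheme, not the actual kernel patch:

```python
import random

VA_ALIGN_MASK = 0x7000

# Chosen once at boot: a secret value for bits [14:12], shared by
# every shared page mapped afterwards.
boot_bits = random.randrange(8) << 12

def fixed_align(candidate):
    """Per-boot randomization: instead of clearing bits [14:12],
    force them to the same secret per-boot value for all mappings."""
    return (candidate & ~VA_ALIGN_MASK) | boot_bits

a = fixed_align(0x7F0000001000)
b = fixed_align(0x7F0000FFE000)
# Both mappings agree in bits [14:12] -> no I-cache aliasing between
# them, yet an attacker cannot predict which of the 8 values is used.
assert (a ^ b) & VA_ALIGN_MASK == 0
```

All shared pages still alias to the same instruction-cache index (satisfying the hardware constraint), but the index itself changes on every boot, so the three bits remain unpredictable to a remote attacker.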
With this fix, the affected addresses have a mix of ASLR types: "per-process" ASLR for all bits except [12..14], and "per-boot" ASLR for bits [12..14].

Patch: [ 0001-mm-x86-AMD-Bulldozer-ASLR-fix.patch ]

Patching Linux:

$ cd linux.git
$ wget http://hmarco.org/bugs/patches/0001-mm-x86-AMD-Bulldozer-ASLR-fix.patch
$ git apply 0001-mm-x86-AMD-Bulldozer-ASLR-fix.patch

Discussion

Although it could be argued that this loss of entropy is justified in order to avoid certain performance penalties, we should consider the following:

- Michael Larabel pointed out that there "... was not any measurable difference in the Linux tests". It seems that these performance penalties appear only under very specific (rare) workloads, so maybe align_va_addr should be disabled by default (currently it is enabled by default on the affected processors).
- We believe that the supposed performance gain of the align_va_addr hack does not justify the security trade-off of the current implementation.
- Introducing hacks in the code that handles memory maps may have side effects that are difficult to identify and debug (the align_va_addr hack only manifests when running applications on the specific CPU). For example, Grsecurity/PaX is vulnerable even when configured in Security (more than performance) mode.
- Affected systems, especially 32-bit ones, are severely jeopardized.
- Thanks to the proposed fix, it is possible to have both performance and security.

We recommend applying the proposed fix to the kernel sources, or disabling the virtual address alignment (align_va_addr=off), on any Internet server equipped with an AMD Bulldozer CPU.
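To check whether a given machine is affected, the manual grep tests shown earlier can be automated. The sketch below works on Linux only and simply observes bits [14:12] of the libc text mapping across fresh processes; the helper name is ours, not part of any tool:

```python
import subprocess

def libc_low_bits(samples=20):
    """Collect bits [14:12] of the libc text mapping across fresh
    processes by reading each child's /proc/self/maps (Linux only)."""
    seen = set()
    for _ in range(samples):
        maps = subprocess.run(["cat", "/proc/self/maps"],
                              capture_output=True, text=True).stdout
        for line in maps.splitlines():
            if "r-xp" in line and "libc" in line:
                base = int(line.split("-", 1)[0], 16)
                seen.add((base >> 12) & 0x7)
                break
    return seen

bits = libc_low_bits()
# On an affected system this set collapses to {0}; on a healthy one
# it should cover several of the eight possible values.
print("observed bits [14:12]:", sorted(bits))
```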

References:

https://lkml.org/lkml/2015/3/27/252
http://hmarco.org/bugs/AMD-Bulldozer-linux-ASLR-weakness-reducing-mmaped-files-by-eight.html
http://hmarco.org/bugs/patches/0001-mm-x86-AMD-Bulldozer-ASLR-fix.patch

