 kNoX - Linux kernel security patch
------------------------------------


==========
 Overview
==========

This patch is a collection of security-related features for the Linux
kernel, all configurable via the new 'Security options' configuration
section. In addition to the new features, some versions of the patch
contain various security fixes. The number of such fixes changes from
version to version, as some become obsolete (for example, because the
same problem gets fixed in a new kernel release), while other security
issues are discovered.


 Non-executable user memory pages
----------------------------------

Most of today's exploits use various techniques to modify the EIP
register so that it points to arbitrary code, which is often placed
either on the stack or in some other writable memory area such as the
heap or BSS. If the stack or other writable areas are executable, the
code is then executed, compromising the privileges of the process.

This patch emulates non-executable page protection on the x86
architecture, which lacks hardware support for it. This feature greatly
increases system security without impact on system performance.

Additionally, this patch forces writable memory mappings to be
non-executable.

Another way to exploit common vulnerabilities is to point the return
address to a function in libc, usually system(). This patch also changes
the default address that shared libraries are mmap()'ed at to make it
always contain a zero byte. In many exploits that have to do with ASCIIZ
strings, this makes it impossible to specify any more data (parameters
to the function, or more copies of the return address when filling with
a pattern).

However, note that this patch is by no means a complete solution; it
just adds an extra layer of security. Many vulnerabilities will remain
exploitable in a more complicated way, e.g. by executing code of loaded
libraries or of the program itself with some arbitrary arguments.
These exploit techniques cannot be stopped by any kernel patch; they
have to be prevented by a separate security solution at user level.

Also, note that most vulnerabilities can be used for denial of service
attacks (usually in non-respawning daemons and network clients). A patch
like this cannot do anything against that.

It is important that you fix vulnerabilities as soon as they become
known, even if you're using the patch. The same applies to other
features of the patch (discussed below) and their corresponding
vulnerabilities.


 Autodetect and emulate GCC trampolines
----------------------------------------

This option is available only if the 'Non-executable user memory pages'
option above is enabled.

GCC generates trampolines on the stack to correctly pass control to
nested functions when calling from outside. Normally, this requires the
stack to be executable. When this option is enabled, the kernel will
trap faults resulting from trampoline calls and emulate the trampolines.
However, in some cases this autodetection can be fooled in a buffer
overflow exploit, so, if you've got no programs that use GCC
trampolines, it is more secure not to enable this feature. It is really
required only if you are using glibc 2.0, in which case your system
won't even boot without it.


 Restricted links in /tmp
--------------------------

I've also added a link-in-+t restriction, originally for Linux 2.0 only,
by Andrew Tridgell. I've updated it to prevent a hard link from being
used in an attack instead, by not allowing regular users to create hard
links to files they don't own, unless they could read and write the file
(due to group permissions). This is usually the desired behavior anyway,
since otherwise users couldn't remove such links they've just created in
a +t directory (unfortunately, this is still possible for group-writable
files) and because of disk quotas.

Unfortunately, this may break existing applications.
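The hard link rule described above can be sketched as a small decision
function. This is only an illustrative user-space model, not the code of
the patch: the FileInfo class, both function names, and the simplified
owner/group/other permission check are assumptions made for the example.

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical stand-in for the inode fields the rule looks at."""
    uid: int    # owner of the file
    gid: int    # owning group
    mode: int   # permission bits, e.g. 0o660


def can_read_and_write(uid, gids, f):
    """Simplified Unix permission check for read plus write access."""
    if uid == f.uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif f.gid in gids:
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (f.mode & needed) == needed


def may_create_hard_link(uid, gids, f):
    """Model of the rule: root and owners may link, others need read+write."""
    if uid == 0 or uid == f.uid:
        return True
    return can_read_and_write(uid, gids, f)


# A root-owned 0600 file, and a root-owned file writable by group 100.
private = FileInfo(uid=0, gid=0, mode=0o600)
shared = FileInfo(uid=0, gid=100, mode=0o660)
print(may_create_hard_link(1000, {1000}, private))      # False
print(may_create_hard_link(1000, {1000, 100}, shared))  # True
```

The second case is the group-writable exception mentioned above: the
user does not own the file, but may still link to it.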


 Restricted FIFOs in /tmp
--------------------------

In addition to restricting links, you might also want to restrict writes
into untrusted FIFOs (named pipes), to make data spoofing attacks
harder. Enabling this option disallows writing into FIFOs not owned by
the user in +t directories, unless the owner is the same as that of the
directory or the FIFO is opened without the O_CREAT flag.


 Restricted /proc
------------------

This was originally a patch by route that only changed the permissions
on some directories in /proc, so you had to be root to access them. Then
there were similar patches by others. I found them all quite unusable
for my purposes, on a system where I wanted several admins to be able to
see all the processes, etc., without having to su root (or use sudo)
each time. So I had to create my own patch that I include here.

This option restricts the permissions on /proc so that non-root users
can see their own processes only, and nothing about active network
connections, unless they're in a special group. This group's id is
specified via the gid= mount option, and is 0 by default. (Note: if
you're using identd, you will need to edit the inetd.conf line to run
identd as this special group.) Also, this disables dmesg(8) for the
users. You might want to use this on an ISP shell server where privacy
is an issue. Note that these extra restrictions can be trivially
bypassed with physical access (without having to reboot).

When using this part of the patch, most programs (ps, top, who) work as
desired -- they only show the processes of this user (unless root or in
the special group, or running with the relevant capabilities on 2.2),
and don't complain they can't access others. However, there's a known
problem with w(1) in recent versions of procps, so you should apply the
included patch to procps if this applies to you.
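One way to check what a given account can see under a restricted /proc
is to count the per-process directories it is able to list. The sketch
below assumes that the restriction shows up as an error when listing
another user's /proc/<pid> directory; the function name listable_pids is
made up for this example.

```python
import os


def listable_pids(proc_root="/proc"):
    """Return the PIDs whose <proc_root>/<pid> directory the caller can list."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue            # skip non-process entries such as cpuinfo
        try:
            os.listdir(os.path.join(proc_root, name))
        except OSError:
            continue            # restricted, not a directory, or already gone
        pids.append(int(name))
    return sorted(pids)
```

Running listable_pids() once as an ordinary user and once as a member of
the special group should give visibly different results if the option is
active.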


 Special handling of fd 0, 1, and 2
------------------------------------

File descriptors 0, 1, and 2 have a special meaning for the C library
and lots of programs. Thus, they're often referenced by number. Still,
it is normally possible to execute a program with one or more of these
fd's closed, and any open(2) calls it might do will happily provide
these fd numbers. The program (or the libraries it is linked with) will
continue using the fd's for their usual purposes, in reality accessing
files the program has just opened. If such a program is installed SUID
and/or SGID, then we might have a security problem.

Enable this option to ensure that fd's 0, 1, and 2 are always open on
startup of a SUID/SGID binary. If any of the fd's is closed, "/dev/null"
will be opened for it (the device itself; you don't need to have /dev in
the filesystem for that to work, such as in a chroot).

This part of the patch is by Pavel Kankovsky; I've only ported it to
Linux 2.2 (any errors are mine, of course).


 Enforce RLIMIT_NPROC on execve(2)
-----------------------------------

Linux lets you set a limit on how many processes a user can have, via a
setrlimit(2) call with RLIMIT_NPROC. Unfortunately, this limit is only
looked at when a new process is created on fork(2). If a process changes
its UID, it might exceed the limit for its new UID. This is not a
security issue by itself, as changing the UID is a privileged operation.
However, there are privileged programs that want to switch to a user's
context, including setting up some resource limits. The only fork(2)
required (if at all) is done before switching the UID, and thus doesn't
result in a check against RLIMIT_NPROC.

Enable this option to enforce RLIMIT_NPROC on execve(2) calls. (The
Linux 2.0 version of this patch only checks the limit for processes that
have their "dumpable" flag reset, such as due to a UID change, to reduce
the performance impact.)
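The sequence in question can be sketched as follows. This is a
hypothetical login-style helper written for illustration only; the names
capped_soft_limit and switch_user_and_exec are not part of the patch,
and the helper has to be started as root to do anything useful.

```python
import os
import resource


def capped_soft_limit(requested, hard):
    """Soft limit to ask for: the requested value, never above the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def switch_user_and_exec(uid, gid, max_procs, argv):
    """Set RLIMIT_NPROC, drop privileges, then exec; no fork(2) in between."""
    _, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    soft = capped_soft_limit(max_procs, hard)
    resource.setrlimit(resource.RLIMIT_NPROC, (soft, hard))
    os.setgroups([gid])
    os.setgid(gid)
    os.setuid(uid)             # the limit is not looked at by this call
    # No fork(2) happens after the UID change, so a fork-only check never
    # runs. With this option enabled, the execve(2) below is where the
    # limit for the new UID is enforced.
    os.execv(argv[0], argv)
```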
Note that there's at least one good reason I am not enforcing the limit
right after setuid(2) calls: some programs don't expect setuid(2) to
fail when running as root.


 Destroy shared memory segments not in use
-------------------------------------------

Linux lets you set resource limits, including on how much memory a
process can consume, via setrlimit(2). Unfortunately, shared memory
segments are allowed to exist without association with any process, and
thus might not be counted against any resource limits.

This option automatically destroys shared memory segments when their
attach count becomes zero after a detach or a process termination. It
will also destroy segments that were created, but never attached to, on
exit from the process. (In case you're curious, the only use left for
IPC_RMID is to immediately destroy an unattached segment.)

Of course, this breaks the way things are defined, so some applications
might stop working. In particular, expect most commercial databases to
break. Apache and PostgreSQL are known to work, though. :-)

Note that this feature will do you no good unless you also configure
your resource limits (in particular, RLIMIT_AS and RLIMIT_NPROC). Most
systems don't need this.


 Configuration via sysctl interface
------------------------------------

The sysctl interface allows global configuration of the security
features of this patch that relate to the non-executable page protection
mechanism. This is useful in some rare cases (such as installing Star
Office, which requires kNoX to be disabled temporarily during
installation). This option is for advanced administrators only.


================
 How to install
================

Make sure you have the original kernel sources (as can be obtained from
ftp.kernel.org) installed in /usr/src/linux.

Apply the patch:

        cd /usr/src/linux
        patch -p1 < PATCH-FILE

where PATCH-FILE is the full path and name of the linux-*-ow*.diff file.

In kernel configuration, go to the new 'Security options' section.
Read help for the suboptions, and configure them.

If desired, edit /etc/fstab to specify the group id for accessing /proc.
Also, make sure you have no extra procfs mount commands in the startup
scripts, as these might override your fstab settings; this is the case
for some distributions, including Red Hat. (Note that you won't be able
to specify the GID by remounting /proc on a running system. This is
because filesystem-specific options are not supported at that stage.)

Build the kernel and reboot.

You may also want to add the following line to your /etc/syslog.conf to
log [security] alerts separately:

        kern.alert                      /var/log/alert

Additionally, you may do something like this (assuming the log file will
be empty most of the time):

        > /var/log/alert
        chown root.staff /var/log/alert
        chmod 640 /var/log/alert
        echo "less -XEU /var/log/alert" >> ~non-root/.bash_profile

Ensure that the non-executable memory page protection is working, using
demo.c for that purpose. Compile it with 'gcc demo.c -o demo', then run
it without any options to display a short help message. The demo program
takes a single option, which specifies the number of the test to
perform. You may use it to verify that the patch has been applied and
configured properly. The program's output tells whether or not the test
was successful. You may also look in the system logs for the kernel
error messages generated when a page protection fault occurs.

If you enabled the link-in-+t restriction, you can also try to create a
symlink in /tmp (as a non-root user) pointing to a file that user has no
read access to, then switch to some other user that has the read access
(for example, root) and try to read the file via the link (such as with
'cat /tmp/link'). This should fail, and a message should get logged.
Now, you can try to create a hard link as a non-root user to a file that
user doesn't own. This should also fail.
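The hard link part of this check can also be scripted. The following is
a minimal sketch; the helper name try_hard_link is made up for this
example, and it assumes that a refused link is reported as EPERM or
EACCES.

```python
import errno
import os


def try_hard_link(target, link_path):
    """Return True if the link was created, False if the kernel refused it."""
    try:
        os.link(target, link_path)
    except OSError as exc:
        if exc.errno in (errno.EPERM, errno.EACCES):
            return False
        raise                   # missing file, cross-device link, and so on
    return True
```

Run as a non-root user, something like
try_hard_link("/etc/passwd", "/tmp/passwd-link") would be expected to
return False with the restriction enabled, provided both paths are on
the same filesystem. The paths here are only examples.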

--
Openwall patch by:
Solar Designer
http://www.openwall.com/linux/

Page non-executable protection (kNoX) idea and design by:
Wojciech Purczynski
http://cliph.linux.pl/kNoX/

Greetings go to Paul Starzetz for his RSX module.
Special greetings go to: funkysh, y3t1, z33d