Syscall filtering is a security mechanism that allows applications to define which syscalls they should be allowed to execute.
- Name: Cole Robinson
- Email: firstname.lastname@example.org
- Name: Paul Moore
- Email: email@example.com
- Targeted release: Fedora 18
- Last updated: August 16, 2012
- Percentage of completion: 100%
The syscall filtering concept, and the motivation behind it, are fairly simple. The Linux kernel supports a very large number of system calls (syscalls), over 300 for the 64-bit x86_64 ABI alone, while applications typically need only a small subset of them to function normally. Syscall filters let us disable certain syscalls on a per-application basis, limiting the kernel's attack surface and reducing the likelihood that a malicious application could exploit a kernel vulnerability.
The Linux kernel's enhanced (mode 2) seccomp functionality allows applications to specify a filter that is applied to their own syscalls; the filter can match on the syscall alone or on the syscall in conjunction with a specific set of arguments. The kernel's seccomp filter API is the Berkeley Packet Filter (BPF) language, the same language used by Linux socket filters, adapted for use with syscalls. The libseccomp library adds an abstraction layer on top of the kernel's seccomp API, giving application developers a more user-friendly API based on function calls rather than BPF assembly.
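To make the kernel-level API concrete, here is a minimal sketch of a hand-written seccomp BPF filter in C, using only the kernel headers and no libseccomp. It is an illustration, not code from the kernel or from QEMU: the function names `install_raw_filter` and `raw_filter_blocks_getppid`, the macro `SKETCH_AUDIT_ARCH`, and the choice of `getppid` as the blocked syscall are all assumptions made for the example. Because a loaded filter cannot be removed, the sketch installs it in a forked child.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Hypothetical helper macro: the audit architecture this sketch is built for. */
#if defined(__x86_64__)
#define SKETCH_AUDIT_ARCH AUDIT_ARCH_X86_64
#elif defined(__aarch64__)
#define SKETCH_AUDIT_ARCH AUDIT_ARCH_AARCH64
#else
#error "sketch only lists x86_64 and aarch64; add your architecture"
#endif

/* Install a filter that makes getppid() fail with EPERM and allows
 * every other syscall. Returns 0 on success, -1 on failure. */
static int install_raw_filter(void)
{
    struct sock_filter insns[] = {
        /* Load the architecture; kill the process if it is unexpected. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SKETCH_AUDIT_ARCH, 1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        /* Load the syscall number and compare it with getppid. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_getppid, 0, 1),
        /* Match: fail the call with EPERM. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        /* No match: allow the call. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
        .filter = insns,
    };

    /* Unprivileged processes must set no_new_privs before loading a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Fork a child, load the filter there, and report what happened:
 * 1 = getppid was blocked with EPERM, 0 = it was not blocked,
 * -1 = the filter could not be set up or the child died abnormally. */
int raw_filter_blocks_getppid(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_raw_filter() != 0)
            _exit(2);
        errno = 0;
        long rc = syscall(SYS_getppid);
        _exit((rc == -1 && errno == EPERM) ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```

Even this small filter needs manual jump offsets and an explicit architecture check, which is the kind of detail libseccomp is meant to hide behind function calls.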
Benefit to Fedora
Applications that implement seccomp-based syscall filtering become harder to use as a vector for exploiting kernel vulnerabilities.
- Get seccomp into upstream kernel: DONE, present in 3.5-rc1
- Package libseccomp for Fedora: DONE, present in Fedora Rawhide BZ 830992
- Get the QEMU/libseccomp patch accepted upstream: DONE, present in 1.2-rc0
- Update Fedora QEMU package to build against libseccomp: NOT DONE
How To Test
- The traditional kernel regression tests should be performed to ensure that the kernel's seccomp functionality does not affect expected behavior when it is not enabled by the application at runtime. Requires Linux >= 3.5 built with CONFIG_SECCOMP_FILTER enabled.
- The libseccomp sources contain a series of automated tests which can be used to test the library's internal seccomp filter generation. Note that these automated tests run against a seccomp BPF simulator, not the kernel.
- A simple negative test could be developed to validate that libseccomp and the kernel perform as expected when a syscall is blocked.
- The traditional QEMU regression tests should be performed to ensure that QEMU's normal functionality is not impacted by the libseccomp patches. Requires libseccomp >= 1.0.0 and QEMU 1.2. QEMU should be built with the "--enable-libseccomp" flag and run with the "-sandbox on" command line option.
Ideally this feature shouldn't be noticeable to the user; the syscall filtering should allow normal execution of the application. The intention is that only people trying to exploit security holes notice that the syscall they are trying to use is blocked :)
- Kernel updated to 3.5
- libseccomp included in Fedora
- QEMU upstream includes support for libseccomp
Applications other than QEMU wishing to use libseccomp only require the kernel and libseccomp support items listed above.
Since this is brand new functionality, if it doesn't make it in time for F18, nothing has changed. We just drop this feature page.
- https://lwn.net/Articles/494252/ (article about syscall filtering)
- http://libseccomp.sf.net/ (helper library)
- https://lists.gnu.org/archive/html/qemu-devel/2012-05/msg00623.html (initial QEMU libseccomp patch posting)
- The libseccomp library is now available, which provides applications with an easy way to reduce the potential damage of exploits, leveraging kernel syscall filters. Virtual machines benefit from this as QEMU/KVM now uses libseccomp.