Enable fs-verity in RPM

Summary

Enable the use of fs-verity to validate installed RPM files.

Owners

Name: Davide Cavalca, Boris Burkov, Filipe Brandenburger, Michel Alexandre Salim, Matthew Almond
Email: dcavalca@fb.com, borisb@fb.com, filbranden@fb.com, michel@fb.com, malmond@fb.com

Current status

Targeted release: Fedora Linux 37
Last updated: 2022-02-14
devel thread
FESCo issue: #2711
Tracker bug: <will be assigned by Wrangler>
Release notes tracker: <will be assigned by Wrangler>

Detailed description

fs-verity is a Linux kernel feature that does transparent on-demand integrity/authenticity verification of the contents of read-only files, using a hidden Merkle tree (hash tree) associated with the file. The mechanism is similar to dm-verity, but implemented at the file level rather than at the block device level.

When fs-verity is enabled for a file, the kernel reads every block and generates a hash tree for it, which is stored within the filesystem. On subsequent reads, the kernel computes the block hash and compares it with the one stored in the tree, protecting against alterations and corruption. Because this happens at the filesystem data block read layer, it encompasses all file operations (open, mmap, exec, etc.).
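To make the hash tree idea concrete, here is a minimal Python sketch of building a Merkle tree over a file's contents and checking a single block against it. It assumes SHA-256 and 4096-byte blocks, and the function names (`hash_block`, `build_levels`, `verify_block`) are made up for illustration. It is a simplified model of the technique, not the kernel's implementation, and it is not guaranteed to reproduce the digest that fs-verity computes for a real file.

```python
import hashlib

BLOCK_SIZE = 4096                       # assumed data/hash block size
DIGEST_SIZE = 32                        # SHA-256 digest length in bytes
HASHES_PER_BLOCK = BLOCK_SIZE // DIGEST_SIZE


def hash_block(block: bytes) -> bytes:
    """Hash one block, zero-padding a short block up to BLOCK_SIZE."""
    return hashlib.sha256(block.ljust(BLOCK_SIZE, b"\0")).digest()


def build_levels(data: bytes) -> list:
    """Return the tree as a list of levels: leaf hashes first, root last."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)] or [b""]
    level = [hash_block(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        # Pack up to HASHES_PER_BLOCK child hashes into one hash block.
        level = [
            hash_block(b"".join(level[i:i + HASHES_PER_BLOCK]))
            for i in range(0, len(level), HASHES_PER_BLOCK)
        ]
        levels.append(level)
    return levels


def verify_block(block: bytes, index: int, levels: list, trusted_root: bytes) -> bool:
    """Check one data block against the stored tree and a trusted root hash."""
    if hash_block(block) != levels[0][index]:
        return False
    # Walk up: each group of stored hashes must hash to its parent's entry.
    for depth in range(len(levels) - 1):
        group = index // HASHES_PER_BLOCK
        start = group * HASHES_PER_BLOCK
        stored = b"".join(levels[depth][start:start + HASHES_PER_BLOCK])
        if hash_block(stored) != levels[depth + 1][group]:
            return False
        index = group
    return levels[-1][0] == trusted_root
```

Only the block being read and the hashes on its path to the root are touched during verification, which is why the per-read cost stays small even for large files.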

In the context of rpm, there are two parts to this:

at build time, we compute the Merkle tree for the files within a package, then sign it and ship it as part of the rpm metadata;
at run time, if the fsverity rpm plugin is enabled, rpm will install the fsverity signature key and enable fsverity on files that are installed.

This proposal is primarily concerned with the first part, which will make it possible for users to leverage fs-verity for RPM if they so desire. Specifically, installing and enabling the fs-verity rpm plugin by default is explicitly considered out of scope here.

Caveats

Merkle tree cost

The Merkle tree used by fs-verity needs to be generated (once at build time, once when the package is installed) and stored on disk. The generation process involves reading all blocks and computing their hashes, which has a non-trivial cost; however, in empirical testing it does not appear to meaningfully slow down package installs. Once generated, the Merkle tree will use up some disk space for its storage (about 1/127th of the original file size). Note that the Merkle tree is not shipped with the RPM itself (only its signature is) and is only generated and stored at install time if the fs-verity rpm plugin is enabled. Hence, there is no cost (in either generation time or disk space usage) if the plugin is disabled.
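The 1/127 figure can be checked with a short calculation. The sketch below assumes SHA-256 (32-byte digests) and 4096-byte blocks, so each hash block holds 128 hashes and each level of the tree is about 1/128 the size of the one below it; summing the levels gives roughly 1/127 of the file size. The function name `merkle_tree_size` is made up for illustration.

```python
BLOCK_SIZE = 4096      # assumed block size in bytes
DIGEST_SIZE = 32       # SHA-256 digest length in bytes


def merkle_tree_size(file_size: int) -> int:
    """Approximate bytes of hash blocks needed for a file of file_size bytes."""
    hashes_per_block = BLOCK_SIZE // DIGEST_SIZE      # 128 under these assumptions
    blocks = -(-file_size // BLOCK_SIZE)              # ceiling division
    total_hash_blocks = 0
    while blocks > 1:
        # Each level needs one hash block per 128 blocks of the level below.
        blocks = -(-blocks // hashes_per_block)
        total_hash_blocks += blocks
    return total_hash_blocks * BLOCK_SIZE


if __name__ == "__main__":
    size = 1 << 30  # 1 GiB
    tree = merkle_tree_size(size)
    print(f"tree: {tree} bytes, ratio: {tree / size:.6f}, 1/127: {1 / 127:.6f}")
```

Small files deviate from the ratio because the tree is stored in whole blocks, and a file that fits in a single block needs no hash blocks at all in this model.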

Signature overhead cost

To leverage fs-verity, every rpm needs to include the hash signature as part of its metadata, which will increase its size. The signature size is roughly proportional to the number of files in the package. From empirical testing, in the vast majority of cases we expect to see minimal to no size increase thanks to RPM header packing.

Relationship with IMA

IMA is another technology meant to provide detection of file alterations. IMA and fsverity operate very differently, and are somewhat complementary.

fs-verity works by using a Merkle tree to generate a checksum for every data block in the file, and a read will fail if a single data block fails its checksum. The signature of the file is validated against a public key loaded into the kernel keyring. Because fs-verity operates on block reads, its runtime cost is small (as it only needs to verify the block that is being accessed), and it can protect from alterations at any point in time.

IMA works by measuring a file as a whole and comparing its signature whenever it’s read or executed. It has a higher runtime cost than fs-verity (as it needs to verify the whole file at once) and it cannot detect subsequent alterations. IMA provides a much richer and more complex policy system, allowing one to define system-wide policies around trusted files that tie into LSMs such as SELinux.

IMA and fsverity could potentially be integrated (meaning, an fsverity backend for IMA could be implemented to leverage its policy controls), but this is not currently planned or being worked on.

Relationship with native checksums

By default, btrfs already checksums each file extent, which could potentially be leveraged to implement an HMAC solution. This currently exists as a patch series, but it hasn’t been merged yet. Similarly to IMA, we see this approach as complementary to fs-verity. The blog post goes into more detail on the tradeoffs involved.

Feedback

Do fs-verity and IMA use the same per-file signature metadata in the RPMs?

Both fs-verity and IMA use file signatures, but they each have their own dedicated flags and signing flows in RPM. The signatures themselves are not interchangeable -- fs-verity's signature is based off the Merkle tree (which itself is block-based), while IMA measures the file as a whole.

Can you explain how the signature is performed?

The top-level hash is calculated for each file, then that hash is signed with the supplied RSA key pair, and the signed hash is appended to the array of signed hashes in the rpm metadata. The "signature" is actually one rpm metadata item that's an array of the signatures of all files.

fs-verity the kernel feature operates on a per-file basis, and since the ultimate goal is to deliver fs-verity enabled files on the installer's system, we need each file's signature in the rpm. At install, we call the fs-verity enable ioctl for each file, passing in its signature to make use of the kernel authentication functionality.
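As an illustration of the install-time step, the sketch below calls the fs-verity enable ioctl from Python. It is not how the rpm plugin is implemented (the plugin is part of rpm itself); it only shows the shape of the call. The structure layout and constants are written from the kernel's `linux/fsverity.h` UAPI header and should be checked against your kernel headers. The ioctl number encoding assumes an architecture such as x86-64 or aarch64. The `signature` argument is assumed to be the detached signature blob in the format the kernel expects, and `enable_verity` is a made-up helper name.

```python
import ctypes
import fcntl
import os

FS_VERITY_HASH_ALG_SHA256 = 1  # value from linux/fsverity.h


class FsverityEnableArg(ctypes.Structure):
    """Mirror of struct fsverity_enable_arg (assumed layout, 128 bytes)."""
    _fields_ = [
        ("version", ctypes.c_uint32),
        ("hash_algorithm", ctypes.c_uint32),
        ("block_size", ctypes.c_uint32),
        ("salt_size", ctypes.c_uint32),
        ("salt_ptr", ctypes.c_uint64),
        ("sig_size", ctypes.c_uint32),
        ("reserved1", ctypes.c_uint32),
        ("sig_ptr", ctypes.c_uint64),
        ("reserved2", ctypes.c_uint64 * 11),
    ]


def _iow(type_char: str, nr: int, size: int) -> int:
    """Encode a write-direction ioctl number (x86-64/aarch64 style encoding)."""
    return (1 << 30) | (size << 16) | (ord(type_char) << 8) | nr


FS_IOC_ENABLE_VERITY = _iow("f", 133, ctypes.sizeof(FsverityEnableArg))


def enable_verity(path: str, signature: bytes = b"", block_size: int = 4096) -> None:
    """Enable fs-verity on path, optionally passing a signature to the kernel."""
    sig_buf = ctypes.create_string_buffer(signature, len(signature)) if signature else None
    arg = FsverityEnableArg(
        version=1,
        hash_algorithm=FS_VERITY_HASH_ALG_SHA256,
        block_size=block_size,
        sig_size=len(signature),
        sig_ptr=ctypes.addressof(sig_buf) if sig_buf is not None else 0,
    )
    # The file must be opened read-only for this ioctl.
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FS_IOC_ENABLE_VERITY, arg)  # raises OSError on failure
    finally:
        os.close(fd)
```

On a filesystem without fs-verity support, or when the signature does not match a certificate in the kernel keyring, the call is expected to raise `OSError`.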

Where does the signing actually happen?

As part of the package signing flow (e.g. via rpmsign during package build), the Merkle tree is generated and a signature is computed from it, which is then added to the rpm metadata.

Is this signature key the Fedora rpm package signing key?

fs-verity needs a dedicated RSA key/cert pair for file signing at package signature time. At package install time, the cert needs to be loaded in the appropriate kernel keyring.

Does the checksumming apply to every data block?

fs-verity only operates on files where it has been enabled via its ioctl (which, if you install the RPM plugin, is taken care of by RPM on your behalf). For those, fs-verity will checksum every data block whenever it's accessed and validate it still matches.
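To check whether a particular file has fs-verity enabled, one option is the measure ioctl, which returns the file's digest when verity is enabled. The following is a sketch under the same caveats as any hand-written ioctl binding: the structure and ioctl number are written from `linux/fsverity.h`, the encoding assumes x86-64 or aarch64, and `verity_digest` is a made-up helper name.

```python
import ctypes
import errno
import fcntl
import os

MAX_DIGEST_SIZE = 64  # large enough for SHA-512


class FsverityDigest(ctypes.Structure):
    """Mirror of struct fsverity_digest with room for the digest bytes."""
    _fields_ = [
        ("digest_algorithm", ctypes.c_uint16),
        ("digest_size", ctypes.c_uint16),
        ("digest", ctypes.c_uint8 * MAX_DIGEST_SIZE),
    ]


# _IOWR('f', 134, struct fsverity_digest); the encoded size covers only
# the 4-byte header, not the trailing digest bytes.
FS_IOC_MEASURE_VERITY = (3 << 30) | (4 << 16) | (ord("f") << 8) | 134


def verity_digest(path: str):
    """Return the fs-verity digest of path, or None if verity is not enabled."""
    arg = FsverityDigest(digest_size=MAX_DIGEST_SIZE)
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FS_IOC_MEASURE_VERITY, arg)
    except OSError as exc:
        # Not a verity file, or the filesystem does not support fs-verity.
        if exc.errno in (errno.ENODATA, errno.ENOTTY, errno.EOPNOTSUPP):
            return None
        raise
    finally:
        os.close(fd)
    return bytes(arg.digest[:arg.digest_size])
```

For example, `verity_digest("/bin/ls")` would return `None` on a system where the rpm plugin is not in use.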

Is this related to the dm_verity kernel module?

It's somewhat inspired by dm-verity, but it's a separate implementation; the only shared logic is the hash computation code in the kernel.

What about unsupported filesystems? Is there XFS support?

fs-verity requires support in the underlying filesystem. If you're using a filesystem that doesn't support it and attempt to enable fs-verity on a file, the ioctl will fail. Note that this is only a concern at runtime, not at build time. XFS doesn't support fs-verity at the moment, but it could be implemented if one wanted to.

Who is going to implement the koji/robosignatory integration?

The Change owners.

Are there some test runs with numbers to show before/after data for both the RPM size and installed FS usage?

The Change owners are currently collecting this data and will be releasing it in the coming weeks.

What would most Fedora users use this for or benefit from it?

Broadly speaking, fs-verity makes it possible to ensure that files that were installed via an RPM have not been modified. It is useful in environments where an attacker might be able to modify system files (say, replace /bin/ls with a compromised version) and you want to protect against that. For example, consider an appliance-like system placed in an untrusted location where you may not be able to control who has physical access (this could be a server, but it could also be a kiosk in an internet point or a school). In this scenario, fs-verity can be one of the building blocks to ensure and maintain system trust.

This Change is mostly about putting in place the necessary plumbing for this to be at all possible.

Can you elaborate on the threat model? How is RPM able to update files?

Once fs-verity is enabled for a given file (which, in the RPM case, happens at package installation time), it cannot be disabled, and the file becomes immutable. One can still rename() or unlink() it (and this is indeed how rpm is able to replace files when upgrading packages), but the actual contents cannot be altered.

Where is this useful? For example, fs-verity can help in the scenario where an attacker has out-of-band access to the storage device (say, they pull a hard drive from a colo'd server or a sdcard from an embedded device, or they boot into a liveusb, or they access a VM image directly from a host).

Let's say that happens, and the attacker changes a few blocks of /bin/ls on the device to make it run nefarious code. When you boot your system again, it would fail at exec() time because the Merkle tree wouldn't match.

Let's say that instead the attacker mounts (or gains access to) your filesystem, unlinks /bin/ls and replaces it wholesale with a new copy (hence creating a new inode). The attacker doesn't have your signing key, so they can't re-sign the file and enable fs-verity on it (they could re-sign it with their own key, but unless they can then find a way to load its cert into the kernel keyring it won't do much good). To protect against this, you now have a few options:

you could use a LSM to enforce that exec() can only happen on files with valid fs-verity signatures; this would protect any binary
you could use a launcher booted from secure storage (say, a dm-verity volume, which could even be the initrd), and have this launcher perform the verification; this of course only protects against binaries executed from the launcher, but depending on your threat model it might be enough

Like most security solutions, this isn’t a silver bullet and it’s not something that in and of itself would necessarily prevent all possible attacks. However, fs-verity can be a useful building block in a defense-in-depth approach against specific attacks, depending on your threat model.

What is the expected user experience in the event of an RPM fs-verity mismatch/error?

There are multiple scenarios here:

a file or signature in the rpm is corrupted, the signature doesn't have a matching cert installed, etc...

In this case, if the plugin is present, the verity enable ioctl will explicitly fail when you attempt to install the rpm, and presumably so will the rpm install.

after installation, a file from an fs-verity enabled rpm gets one or more blocks corrupted

The first read of a corrupted block from disk (the good uncorrupted page might survive in page cache for a while) will result in EIO for read-like system calls and SIGBUS if the file is mapped (executables, mmap).

If the verity metadata (signature, root hash) is corrupted after installation but before the file is opened, then opening/exec-ing the file can fail. Also, if pages from a binary read in during the exec itself are corrupted, the system call itself could fail (rather than the process getting SIGBUS like for a random page during execution).

Errors at installation time should be fully diagnosable, and even if the output today doesn't make it totally obvious what happened, it would be easy to fix in rpm.

The errors post-install are a bit trickier. Imagine you install your rpm, and kick off some long running daemon from it. A month later, a block gets corrupted in a way fs checksums don't catch (e.g. ext4, btrfs nodatasum, evil maid), and suddenly that daemon receives a SIGBUS and crashes. You would be able to see clearly that it was a verity issue in dmesg, but I don't think the binary could reasonably know what happened or write a meaningful log. In that sense, I think it's actually pretty similar to the experience if you have corruption in your disk and start getting btrfs checksum errors on a file--you'd have to look in dmesg to know why your file is broken.

The middle ground is when opening/exec-ing the file fails. In that case, you might get an error code specific enough that you could figure out it's verity, and the full error would be in dmesg as well.
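For programs that read verity-protected files through read-like system calls, the failure surfaces as an ordinary I/O error. A minimal sketch of handling it follows; `read_checked` is a made-up helper name. Note that EIO is not specific to fs-verity, so the message can only point the administrator at dmesg rather than state the cause.

```python
import errno


def read_checked(path: str):
    """Read a file fully; return (data, None) or (None, hint) on an I/O error."""
    try:
        with open(path, "rb") as f:
            return f.read(), None
    except OSError as exc:
        if exc.errno == errno.EIO:
            # Could be an fs-verity mismatch or any other I/O failure.
            return None, f"I/O error reading {path}; check dmesg for fs-verity messages"
        raise
```

This does not help with the SIGBUS case described above, since a fault on a memory-mapped page is delivered as a signal rather than as an error return.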

How does this relate to RPM files validation?

With RPM, the validation only happens at install time, and when one runs rpm -V manually. With fs-verity, the validation happens on-demand whenever a block of a file that originated from an RPM is accessed. This means, for example, that if an attacker replaces /bin/ls on disk with a compromised one, the next time it's read from disk (e.g. because you ran it) you will see a validation failure and the syscall will be blocked, preventing the compromised code from being executed.

Won't this waste filesystem space? How do I reclaim it if I stop using this?

Unless you install rpm-plugin-fsverity (which is not and will not be installed by default), there is no disk space increase for verity-signed RPM packages. If you do install rpm-plugin-fsverity, some disk space will be used for the Merkle tree. To reclaim space, you could reinstall the rpm; the filesystem will reclaim the verity metadata along with the rest of the old file.

Can the user modify a file shipped by a package (e.g. to edit a script while debugging)?

Once fs-verity is enabled on a file, it becomes immutable and its contents can never be modified. The user is always able to unlink() and replace it with another file, which can then be modified at will. If fs-verity verification is desired on the replacement as well, the user can sign it with their own key and load it in the fs-verity kernel keyring so that validation will pass. At this point, the replacement also becomes immutable.

Benefit to Fedora

The main benefit is the ability to do block-level verification of RPM-installed files. In turn, this can be used to implement use-case-specific validation and verification policies depending on the environment requirements.

Scope

Proposal owners
- btrfs kernel enablement work (landed in 5.15); see this blog post for more details
- koji integration: koji will need to add the fs-verity metadata to packages when signing them
Other developers:
- deploy the koji integration changes to production
Release engineering: https://pagure.io/releng/issue/10418
Policies and guidelines: N/A
Trademark approval: N/A

Upgrade/compatibility impact

None

How to test

Install the fs-verity RPM plugin to validate package contents:

$ sudo dnf install rpm-plugin-fsverity

Note that this will only be useful if the packages being installed contain the appropriate fs-verity metadata (which, for Fedora upstream packages, requires Koji integration that is part of this Change). However, you should still be able to test this if you locally sign a package with rpmsign --addverity.

User experience

This Change is fully transparent and there is no user impact by default. If the user chooses to enable the fs-verity RPM plugin, they can then leverage the additional verification features provided by fs-verity.

Dependencies

fs-verity support is available in RPM as of 4.17, which is available as of Fedora 35 and is already enabled in rpm-4.17.0-0.rc1.0.fc36
CONFIG_FS_VERITY in the kernel config; this is already enabled
fs-verity requires filesystem support; currently support for ext4 and f2fs is already available; support for btrfs landed in 5.15
there is no filesystem dependency on the builders, only at runtime (and only if the rpm fsverity plugin is installed and one wishes to use it)

Contingency plan

Revert the changes to koji.

Documentation

https://www.kernel.org/doc/html/latest/filesystems/fsverity.html
https://developers.facebook.com/blog/post/2021/10/19/fs-verity-support-in-btrfs/
The proposal owners plan to document the fsverity plugin and integration in RPM (https://github.com/rpm-software-management/rpm/issues/1849)

Release Notes

The RPM package manager now supports validation of file contents using fs-verity.
