Python: Optional Bytecode Cache
The Python standard library bytecode cache files (e.g. /usr/lib64/python3.9/.../__pycache__/*.pyc) will be moved out of the package that currently ships them into three new optional subpackages (split by optimization level). The non-optimized bytecode cache will be recommended and installed by default but removable. The bytecode cache for optimization levels 1 and 2 will not be recommended (and hence will not be installed by default) but will be installable. The default SELinux policy will be adapted not to audit AVC denials when Python attempts to create the bytecode cache at runtime. This will save 8.89 MiB of disk space on default installations or 17.12 MiB on minimal installations (by opting out of the recommended subpackage with the non-optimized bytecode cache). When all three new packages are installed, the size will increase slightly over the status quo (by 4.5 MiB).
- Name: Miro Hrončok
- Name: Lumír Balhar
- Name: Tomáš Orsava
- Name: Lukáš Vrabec (selinux-policy maintainer)
- Email: email@example.com
- Targeted release: Fedora 34
- Last updated: 2020-09-08
- FESCo issue: <will be assigned by the Wrangler>
- Tracker bug: <will be assigned by the Wrangler>
- Release notes tracker: <will be assigned by the Wrangler>
What is the Python bytecode cache
When Python code is interpreted, it is compiled to Python bytecode. When a pure Python module is imported for the first time, the compiled bytecode is serialized and cached to a .pyc file located in the __pycache__ directory next to the .py source. Subsequent imports use the cache directly, until it is invalidated (for example when the .py source is edited and its mtime stamp is bumped); at that point, the cache is updated. This behavior is explained in detail in PEP 3147. The invalidation is described in PEP 552.
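The default timestamp-based invalidation can be observed by looking at the 16-byte header of a .pyc file. The following is a minimal sketch, assuming Python 3.7 or newer (the PEP 552 header layout); the demo.py module and the read_pyc_header helper are made up for illustration.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def read_pyc_header(pyc_path):
    """Return (magic, flags, field1, field2) from the 16-byte .pyc header."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    magic = header[:4]
    # Three little-endian 32-bit integers follow the magic number.
    flags, field1, field2 = struct.unpack("<III", header[4:16])
    return magic, flags, field1, field2


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo.py")  # hypothetical module
    with open(src, "w") as f:
        f.write("VALUE = 1\n")
    # Request timestamp-based invalidation explicitly, independent of the environment.
    pyc = py_compile.compile(
        src, invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP
    )
    magic, flags, mtime, size = read_pyc_header(pyc)
    src_stat = os.stat(src)

print("magic matches interpreter:", magic == importlib.util.MAGIC_NUMBER)
print("flags (0 means timestamp-based):", flags)
print("recorded source mtime:", mtime)
print("recorded source size:", size)
```

When the recorded mtime or size no longer matches the source file, the import system recompiles the module and rewrites the cache if the location is writable.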
Python can operate at three different optimization levels: 0, 1 and 2. By default, the optimization level is 0. When Python is invoked with the -O command line option, the optimization level is set to 1; with -OO it is set to 2. Bytecode cache for different optimization levels is saved with different filenames as described in PEP 488.
As an example, a Python module located at /path/to/basename.py will have bytecode cache files for CPython 3.9 stored as:
- /path/to/__pycache__/basename.cpython-39.pyc for the non-optimized bytecode
- /path/to/__pycache__/basename.cpython-39.opt-1.pyc for optimization level 1
- /path/to/__pycache__/basename.cpython-39.opt-2.pyc for optimization level 2
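The same file names can be derived with importlib.util.cache_from_source from the standard library. This is a small sketch; the interpreter tag in the output depends on the Python version running it (cpython-39 on CPython 3.9), and it assumes no custom cache prefix is configured.

```python
import importlib.util
import sys

source = "/path/to/basename.py"
tag = sys.implementation.cache_tag  # e.g. "cpython-39" on CPython 3.9

# An empty string selects the non-optimized cache file name (PEP 488).
paths = {
    level: importlib.util.cache_from_source(source, optimization=opt)
    for level, opt in ((0, ""), (1, 1), (2, 2))
}

for level, path in paths.items():
    print(level, path)
```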
Python bytecode cache in RPM packages (status quo)
Pure Python modules shipped in RPM packages (namely the ones shipped through the standard library package) are located at paths not writable by regular users, under /usr/lib(64)/python3.9/, hence the bytecode cache is also located there. To work around this problem, the bytecode cache is pre-compiled when RPM packages are built, and the package ships and owns the sources as well as the bytecode cache:
$ rpm -ql python3-libs
...
/usr/lib64/python3.9/__pycache__/ast.cpython-39.opt-1.pyc
/usr/lib64/python3.9/__pycache__/ast.cpython-39.opt-2.pyc
/usr/lib64/python3.9/__pycache__/ast.cpython-39.pyc
...
/usr/lib64/python3.9/ast.py
...
As a result, the package is quite big, essentially shipping all pure Python modules 4 times.
Depending on the module content, its bytecode cache files might be identical across optimization levels. In such cases, the files are hardlinked to reduce the bloat:
$ ls -1i /usr/lib64/python3.9/collections/__pycache__/abc.*pyc
8634 /usr/lib64/python3.9/collections/__pycache__/abc.cpython-39.opt-1.pyc
8634 /usr/lib64/python3.9/collections/__pycache__/abc.cpython-39.opt-2.pyc
8634 /usr/lib64/python3.9/collections/__pycache__/abc.cpython-39.pyc
$ ls -1i /usr/lib64/python3.9/__pycache__/ast.*pyc
8438 /usr/lib64/python3.9/__pycache__/ast.cpython-39.opt-1.pyc
8440 /usr/lib64/python3.9/__pycache__/ast.cpython-39.opt-2.pyc
8441 /usr/lib64/python3.9/__pycache__/ast.cpython-39.pyc
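The inode comparison done by ls -1i above can also be scripted. The sketch below groups files by their (device, inode) pair; it runs on throwaway files in a temporary directory, and the file names and contents are placeholders rather than real bytecode.

```python
import os
import tempfile


def group_hardlinks(paths):
    """Group paths that share the same (device, inode) pair."""
    groups = {}
    for path in paths:
        st = os.stat(path)
        groups.setdefault((st.st_dev, st.st_ino), []).append(path)
    return sorted(groups.values())


with tempfile.TemporaryDirectory() as tmp:
    names = ["m.cpython-39.pyc", "m.cpython-39.opt-1.pyc", "m.cpython-39.opt-2.pyc"]
    base, opt1, opt2 = (os.path.join(tmp, name) for name in names)
    with open(base, "wb") as f:
        f.write(b"identical bytecode")
    os.link(base, opt1)  # identical content: both names share one inode
    with open(opt2, "wb") as f:
        f.write(b"different bytecode")
    groups = [
        [os.path.basename(p) for p in group]
        for group in group_hardlinks([base, opt1, opt2])
    ]

for group in groups:
    print(group)
```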
What if the bytecode cache were not packaged
When the bytecode cache is not packaged, several things happen:
- When non-root users run Python, the imported modules are never cached. As a result, the startup time of Python apps might be slightly longer than necessary until root runs them.
- When root runs Python, the imported modules are cached. As a result, untracked .pyc files start to pop up in /usr/lib(64)/python3.9/. When the system is updated to a newer Python version, the untracked files remain on the filesystem until manually cleaned up.
- When root runs Python in an SELinux restricted context, Python attempts to cache the imported modules but SELinux does not allow that. The result is the same as in the first case, with a lot of noise from SELinux.
Packaging the bytecode cache into optional subpackages
To save a considerable amount of disk space without disrupting the user experience, we propose to ship the pre-compiled bytecode cache, previously included in the standard library package, in three new optional subpackages:
- Pre-compiled non-optimized bytecode cache files (*.cpython-39.pyc) will be packaged in their own subpackage.
- Pre-compiled level 1 optimized bytecode cache files (*.cpython-39.opt-1.pyc) will be packaged in their own subpackage.
- Pre-compiled level 2 optimized bytecode cache files (*.cpython-39.opt-2.pyc) will be packaged in their own subpackage.
Given that almost all Fedora Python packages invoke Python in the non-optimized mode¹, the subpackage with the non-optimized bytecode cache will be recommended and hence installed by default together with Python; the user experience will remain the same for the vast majority of users and use cases.
Note that by splitting the three optimization levels to different RPM packages, files can no longer be hardlinked between each other. This results in a slight size increase when all three optimization levels are installed. The change owners consider the need for all three subpackages to be present simultaneously on one size-sensitive system unlikely and hence consider this a fair trade.
¹ No real data was collected to support this claim. This hypothesis is made by the Python maintainers based on their own experience.
SELinux policy changes
To suppress the otherwise omnipresent AVC denial messages about Python failing to write the bytecode cache, the Python maintainers have teamed up with Fedora's selinux-policy maintainers. The implementation details are available at:
When Python runs under the root user in SELinux restricted context, SELinux will still prevent it from writing the bytecode cache, but it will not clutter the audit log.
Sizes were calculated in mock on x86_64. Only Python itself and the relevant bytecode cache packages were installed (i.e. no other Python packages). The sizes were measured by du -c /usr/lib64/python3.9 (converted to MiB by dividing by 1024 and rounding to 2 decimal places).
|Configuration|Size|Difference in MiB|Difference in %|
|Status quo (before this change)|31.84 MiB| | |
|Default (non-optimized cache only)|22.96 MiB|-8.89 MiB|-27.91 %|
|No cache|14.72 MiB|-17.12 MiB|-53.77 %|
|Non-optimized cache and optimization level 1|29.71 MiB|-2.13 MiB|-6.70 %|
|All optimization levels (same files as status quo)|36.35 MiB|+4.50 MiB|+14.14 %|
The presence or absence of the bytecode cache only impacts the speed of imports. It is most common that the imports happen while an application starts. Once the application is running, there is no speed difference.
A totally inappropriate and unscientific experiment:
$ du -a /usr/lib64/python3.9/ | grep py$ | sort -n -r | head -n 1
224 /usr/lib64/python3.9/_pydecimal.py
With the bytecode cache present:
$ time python3 -c 'import importlib as i, _pydecimal as p; [i.reload(p) for _ in range(10000)]'
real 0m13.986s
user 0m13.554s
sys 0m0.365s
$ time python3 -O -c 'import importlib as i, _pydecimal as p; [i.reload(p) for _ in range(10000)]'
real 0m13.594s
user 0m13.186s
sys 0m0.337s
$ time python3 -OO -c 'import importlib as i, _pydecimal as p; [i.reload(p) for _ in range(10000)]'
real 0m13.225s
user 0m12.855s
sys 0m0.290s
Without the bytecode cache:
$ time python3 -c 'import importlib as i, _pydecimal as p; [i.reload(p) for _ in range(10000)]'
real 4m20.554s
user 4m14.600s
sys 0m4.850s
$ time python3 -O -c 'import importlib as i, _pydecimal as p; [i.reload(p) for _ in range(10000)]'
real 4m14.291s
user 4m9.333s
sys 0m3.721s
$ time python3 -OO -c 'import importlib as i, _pydecimal as p; [i.reload(p) for _ in range(10000)]'
real 4m14.816s
user 4m11.035s
sys 0m2.400s
This suggests that an application that does 10000 module imports (with rather large 224 KiB modules) would be slowed down on start by about 4 minutes. Obviously, such measurements depend on many aspects and doing 10000 imports is rather far-fetched: there are only ~550 pure Python modules in the package, hence even if all of them are imported, the slowdown should not exceed a couple of seconds. However, it is indisputable that importing modules without the cache is significantly slower.
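The source of the difference can be illustrated in isolation: without a cache the source has to be compiled, while with a cache the code object is only deserialized. The following is a rough sketch on a synthetic module (not on _pydecimal), using marshal, which is the serialization that follows the header in a .pyc file. The absolute numbers will vary between machines and say nothing about the measurements above.

```python
import marshal
import time

# Synthetic stand-in for a large module: 2000 tiny functions.
source = "\n".join(f"def f{i}(x):\n    return x + {i}\n" for i in range(2000))


def timed(fn):
    """Run fn once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# What happens on import without a cache: compile the source text.
code, compile_time = timed(lambda: compile(source, "<synthetic>", "exec"))

# What happens on import with a cache: load the serialized code object.
blob = marshal.dumps(code)
loaded, load_time = timed(lambda: marshal.loads(blob))

print(f"compile from source: {compile_time:.4f} s")
print(f"load from bytecode:  {load_time:.4f} s")
```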
Deployments negatively impacted by this are advised to either install the appropriate bytecode subpackage or pre-compile the relevant modules ahead of time (e.g. when building a container image).
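Pre-compiling ahead of time can be done with the compileall module from the standard library. The sketch below compiles a throwaway package in a temporary directory; demo_pkg is a made-up name, and a real deployment would point compile_dir at the directories it actually imports from.

```python
import compileall
import importlib.util
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pkg = pathlib.Path(tmp) / "demo_pkg"  # hypothetical package
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "mod.py").write_text("VALUE = 42\n")

    # optimize=0 produces the non-optimized cache; 1 and 2 produce opt-1 and opt-2.
    ok = compileall.compile_dir(pkg, optimize=0, quiet=1)

    cached = pathlib.Path(
        importlib.util.cache_from_source(str(pkg / "mod.py"), optimization="")
    )
    exists = cached.exists()

print("all files compiled:", bool(ok))
print("cache file created:", exists)
```

The same operation is available from the command line as python3 -m compileall, which may be more convenient in a container build step.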
In this section, we briefly describe ideas that were presented by others or considered by the change owners, but rejected.
Stop shipping mandatory .py sources, ship only .pyc
This is described as Solution 7 in the abovementioned document.
It is possible to ship the bytecode cache installed e.g. as /usr/lib64/python3.9/ast.pyc instead of the appropriate Python source (e.g. /usr/lib64/python3.9/ast.py in our example). Such a solution could save an additional 3.1 MiB if no sources and no other optimization levels were shipped. In addition, it would not suffer from any import slowdowns regardless of the optimization level Python was invoked with. However, our analysis shows significant drawbacks when shipping .pyc files only:
- Without the source code, some Python tracebacks are less informative.
- Without the source code, many IDEs and other Python developer tools might be confused (e.g. code completion or the IPython ?? syntax to show a function's code).
- Sysadmins and ops are known to edit Python source files (including the standard library) on production 🎩.
- The shipped .pyc files would need to be compiled with a certain optimization level (presumably 0, without loss of generality). Python invoked with -OO would still execute such a .pyc file regardless. In special circumstances, this can lead to slight behavior differences that would be very hard to debug.
This can be worked around to some extent by offering the possibility to install the sources (or even recommending them by default). However:
- When the module.py source is installed, the module.pyc file next to it is always totally ignored.
- The package with module.py would then still need to ship bytecode caches for the various optimization levels in the __pycache__ directory, possibly duplicating module.pyc as __pycache__/module.cpython-39.pyc (there is no way to hardlink those on RPM level, since they are in different directories).
- Just by installing the optional sources on production in order to be able to debug problems, Python invoked at a different optimization level than the one used to pre-compile the shipped module.pyc would suddenly start executing different bytecode, essentially a heisenbug waiting to happen.
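The behavior differences between optimization levels mentioned in the lists above come from what the compiler strips: level 1 removes assert statements and level 2 additionally removes docstrings. A minimal sketch using the optimize argument of the built-in compile function (the check function is made up for the illustration):

```python
source = '''
def check(x):
    """Return x after validating it."""
    assert x > 0, "x must be positive"
    return x
'''


def build(optimize):
    """Compile the source at the given optimization level and return check()."""
    namespace = {}
    exec(compile(source, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["check"]


level0, level1, level2 = build(0), build(1), build(2)

try:
    level0(-1)
    assert_fired = False
except AssertionError:
    assert_fired = True

print("level 0 runs the assert:", assert_fired)
print("level 1 skips the assert:", level1(-1) == -1)
print("level 1 keeps the docstring:", level1.__doc__ is not None)
print("level 2 drops the docstring:", level2.__doc__ is None)
```

Code that relies on assert for validation, or reads __doc__ at runtime, therefore behaves differently depending on which bytecode ends up being executed.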
The change owners consider shipping individual, well-tested, large modules with machine-generated sources as .pyc only to save space a reasonable compromise (and it was already done with pydoc_data.topics and several encodings submodules). However, doing it with the entire standard library would provide a very non-standard user experience and might blow up in many places.
Make Python not attempt to write bytecode cache into system directories
Originally, when planning this change, the idea was to prevent Python from writing the bytecode cache to /usr/lib(64)/python3.9/... on imports by a special marker file (e.g. /usr/lib(64)/python3.9/nocache or similar). Such a marker would mean "this directory (possibly recursively) contains bytecode maintained by non-Python tooling". Python would use the cache if present, but it would not even attempt to write the cache if it is missing or outdated. Such a change of behavior would need to be introduced in Python upstream.
When drafting this change for upstream, we failed to provide sufficient reasoning to introduce such new behavior. The only problems with writing the cache identified by the change owners were:
- files not owned by any RPM package (solved by
- SELinux AVC denials noise (solved by adapting the SELinux policy)
Hence, this idea was rejected.
When all three bytecode packages are installed, the disk usage is slightly bigger than before this change because files in different packages cannot be hardlinked on RPM level. In theory, it might be possible to compensate for this by various means. For example:
- Hardlink the files in
- On build time, detect what files are identical, hardlink such files and include all their instances in all relevant packages (e.g. all three bytecode packages would contain all three hardlinked versions of
__pycache__/abc.cpython-39*.pyc, but only one version of
The change owners consider such solutions not worth it, because they would make the user experience and/or the packaging unnecessarily complicated. As said previously, the change owners consider the need for all three subpackages to be present simultaneously on one size-sensitive system unlikely.
Not realized ideas
In this section, we briefly describe ideas that were presented by others or considered by the change owners, but were not realized (e.g. for capacity reasons). Such ideas may be realized later.
Store bytecode cache in
The change owners feel that runtime created cache should not be stored in /usr/lib(64) but rather ~/.cache. Since Python 3.8 it is possible to set the PYTHONPYCACHEPREFIX environment variable to modify the location of the bytecode cache.
Changes could be proposed to Python upstream to make the cache default to cache-specific filesystem paths (with path priorities etc.), but this idea is not being pursued at this time.
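The effect of the cache prefix can be previewed without writing any files. The sketch below sets sys.pycache_prefix (the in-process counterpart of PYTHONPYCACHEPREFIX, available since Python 3.8) and asks importlib where the cache for a module would live; /tmp/demo-pycache is a hypothetical location chosen for the example.

```python
import importlib.util
import sys

source = "/usr/lib64/python3.9/ast.py"
prefix = "/tmp/demo-pycache"  # hypothetical cache root

saved = sys.pycache_prefix
try:
    # Default layout: a __pycache__ directory next to the source.
    sys.pycache_prefix = None
    default_path = importlib.util.cache_from_source(source)

    # Redirected layout: the source directory tree is mirrored under the prefix.
    sys.pycache_prefix = prefix
    redirected_path = importlib.util.cache_from_source(source)
finally:
    sys.pycache_prefix = saved

print(default_path)
print(redirected_path)
```

With a prefix set, Python reads and writes the cache only under that directory tree and does not use the __pycache__ directories next to the sources.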
Apply this change to all Python RPM packages
The motivation and implementation of this change proposal can in theory be extended to all Python RPM packages in Fedora. However, since this is a tad tedious from the packaging perspective, automation that would free the packagers from the technical details would need to be created. If this is desired, the idea can be pursued later (possibly once this change is battle tested).
Many Python minimization proposals were previously described in great detail in Python minimization in Fedora and discussed on the devel mailing list. This proposal is also based on feedback received there.
Benefit to Fedora
- In the default scenario, 8.89 MiB disk space saved.
- In the minimal scenario, 17.12 MiB disk space saved.
- Bandwidth saved on dnf upgrades as well (both on clients and mirrors).
- Proposal owners:
- Update the Python package, see https://src.fedoraproject.org/rpms/python3.9/pull-request/29
- Update selinux-policy, see https://github.com/fedora-selinux/selinux-policy/pull/404
- Other developers: N/A (not a System Wide Change)
- Release engineering: N/A (not a System Wide Change)
- Policies and guidelines: N/A (not a System Wide Change)
- Trademark approval: N/A (not needed for this Change)
- Alignment with Objectives: Fedora Minimization Objective
Fedora installations upgrading to this change (either rawhide users or on distro upgrades to Fedora 34) will be impacted slightly. Users will end up with the non-optimized bytecode cache subpackage installed, but all the previously included bytecode cache files for optimization levels 1 and 2 will remain present unless manually removed or until Python is upgraded in Fedora to 3.10+. Such files will not be updated and hence will most likely become invalid and useless over time.
Hence the change owners have prepared a one-off scriptlet that cleans the files on the first upgrade to the new configuration to prevent this from happening. The scriptlet can be removed in Fedora 36, or sooner if Fedora upgrades to Python 3.10+.
How To Test
Users of minimal systems where the maintainers removed the non-optimized bytecode cache subpackage may notice a slight slowdown of Python or Python application startup and may need to install it if they are concerned. Alternatively, they may bytecompile only the relevant parts of the Python standard library ahead of time (e.g. when building derived container images). Users who prefer the bytecode not to be there may want to build their container images with the PYTHONDONTWRITEBYTECODE environment variable set.
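Whether the environment variable took effect can be checked through sys.dont_write_bytecode. A small sketch that starts a child interpreter with the variable set and reads the flag back:

```python
import os
import subprocess
import sys

# Copy the current environment and add the variable for the child process only.
env = dict(os.environ, PYTHONDONTWRITEBYTECODE="1")

result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.dont_write_bytecode)"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
flag = result.stdout.strip()
print("child interpreter dont_write_bytecode:", flag)
```

The -B command line option has the same effect for a single invocation.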
N/A (not a System Wide Change)
- Contingency mechanism: Abort, abort!
- Contingency deadline: before the beta freeze
- Blocks release? Not a chance
- Blocks product? It plocks a broduct