From Fedora Project Wiki

Revision as of 13:22, 2 February 2018

This page is incomplete!


No more automagic Python bytecompilation

Summary

The current way of automatic Python byte-compiling of files outside Python-specific directories is too magical and error-prone. It is built on heuristics that are increasingly wrong. We will provide a way to opt-out of it and adjust the guidelines to prefer explicit bytecompilation of such files. Later, the old behavior will be opt-in only or will cease to exist.

Note that bytecompilation in Python-specific directories (e.g. /usr/lib/python3.6/) is not affected.

Owner

  • Name: Miro Hrončok, Petr Viktorin
  • Email: mhroncok@redhat.com, pviktori@redhat.com

Current status

  • Targeted release: Fedora 29
  • Last updated: 2018-02-02
  • Tracker bug: <will be assigned by the Wrangler>

Detailed Description

Background

When a Python module is imported, the source file (*.py) is automatically compiled to bytecode. The bytecode is automatically cached in “pyc files” next to the source (e.g. modulename.pyc and modulename.pyo in Python 2.7; __pycache__/modulename.cpython-36*.pyc in Python 3.6).

For RPM packages installed system-wide, creating cache files would generally require root privileges, so they are included in RPMs rather than generated on import.
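
To make the cache location concrete, here is a minimal sketch using the standard library's py_compile and importlib.util modules. The helper name compile_and_locate and the module name modulename.py are illustrative only; the sketch shows where Python 3 places the cache file, not how rpmbuild produces it.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path


def compile_and_locate(source: Path) -> Path:
    """Byte-compile `source` and return the path of the cache file written."""
    # Without an explicit cfile, py_compile writes to the same location
    # the import system would use for its cache.
    written = py_compile.compile(str(source), doraise=True)
    return Path(written)


with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp) / "modulename.py"  # hypothetical module
    module.write_text("ANSWER = 42\n")
    cache = compile_and_locate(module)
    # cache_from_source() reports where the import system expects the file.
    expected = Path(importlib.util.cache_from_source(str(module)))
    print(cache.name)
    print(cache == expected)
```

The printed file name carries the interpreter tag (for example cpython-36), which is why one source file can have cache files for several Python 3 versions side by side.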

Status quo

For packagers' convenience, rpmbuild's brp-python-bytecompile script generates .pyc/.pyo cache files automatically:

  • In interpreter-specific directories, such as /usr/lib/python3.6/, these are compiled using the appropriate interpreter.
  • Some directories, such as /bin, /sbin and /usr/share/doc, are excluded.
  • Outside these directories, modules are compiled using %{__python}, which is /usr/bin/python, which is currently Python 2. This is only done if the %{__python} binary is available.

The first two points are reasonable and straightforward, and they confine the automation to Python-specific directories. The last point, however, relies on several assumptions that do not always hold:

  • All files named *.py are Python modules that need to be bytecompiled. (Quite an accurate heuristic, but has very bad behavior on false positives: it can affect packages that don't have anything to do with Python.)
  • Any package that has .py files BuildRequires /usr/bin/python. (If it does not, the package will build differently depending on whether /usr/bin/python happens to be available during rpmbuild.)
  • When a Python module is not in /usr/lib(64)?/pythonX.Y/ it is intended for the %{__python} interpreter – currently Python 2. (But the module could also be, for example, for Python 3, or PyPy, or several of those.)
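
The three rules and the fall-through to %{__python} can be sketched as a small decision function. This is a paraphrase of the behavior described on this page, not code taken from brp-python-bytecompile; the function name pick_interpreter, the patterns, and the prefix matching are simplifying assumptions.

```python
import re
from typing import Optional

# Assumption: these patterns paraphrase the rules described in the text;
# the real script may match directories differently.
INTERPRETER_DIR = re.compile(r"^/usr/lib(64)?/python(\d+\.\d+)/")
EXCLUDED_PREFIXES = ("/bin/", "/sbin/", "/usr/share/doc/")


def pick_interpreter(path: str, default_python: Optional[str]) -> Optional[str]:
    """Return the interpreter that would compile `path`, or None to skip it.

    `default_python` stands in for %{__python}; pass None to model a
    buildroot where that binary is not installed.
    """
    if not path.endswith(".py"):
        return None
    match = INTERPRETER_DIR.match(path)
    if match:
        # Interpreter-specific directory: use the matching interpreter.
        return "python" + match.group(2)
    if path.startswith(EXCLUDED_PREFIXES):
        return None
    # Everything else falls through to %{__python}, if it is available.
    return default_python


print(pick_interpreter("/usr/lib/python3.6/site-packages/foo.py", "/usr/bin/python"))
print(pick_interpreter("/usr/share/reproducer/country-name.py", "/usr/bin/python"))
print(pick_interpreter("/usr/share/reproducer/country-name.py", None))
```

The last two calls differ only in whether the default interpreter exists, which is the build-environment dependence the second assumption warns about.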

Bytecompilation outside Python-specific directories may be changed by redefining %__python to:

  • python3: This is documented in the guidelines as a way to enable automatic bytecompilation for python3. Around 70 packages use it.
  • python2: This *should* be done to ensure consistency when %{__python} is switched. Only about 2 packages do it, because the magic "just works" for the time being.

Automatic bytecompilation outside Python-specific directories cannot be disabled without disabling all bytecompilation. Also, it cannot be done for more than one interpreter.

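See Packaging:Python Appendix (https://fedoraproject.org/w/index.php?title=Packaging:Python_Appendix&oldid=419140#Manual_byte_compilation) for more information (this links to a specific revision so the link makes sense once this change is implemented and the guidelines are changed).

The current behavior is surprising. Mistakes are made (https://pagure.io/copr/copr/c/8fa8fe2f1583088a0162d89a13e0dee70a0db801?branch=master). Things are done or not done based on the presence of /usr/bin/python.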

See a simple example of a package that builds fine without /usr/bin/python in the buildroot but fails when it's there:

Name:           reproducer
Version:        0.1
Release:        1%{?dist}
Summary:        Reproducer for a bytecompile script issue
License:        MIT
BuildArch:      noarch

%description
This package will build fine if /usr/bin/python is *not* in the buildroot.

%prep
echo "Poland" > country-name.pl
echo "Paraguay" > country-name.py
echo "Saint Helena" > country-name.sh
echo "Serbia" > country-name.rs

%build

%install
mkdir -p %{buildroot}%{_datadir}/%{name}
cp country-name.* %{buildroot}%{_datadir}/%{name}

%files
%dir %{_datadir}/%{name}/
%{_datadir}/%{name}/country-name.??
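
The page does not state the exact error, so the following is only one plausible reading of the failure: bytecompiling country-name.py leaves a cache file that the two-character glob in %files does not cover. The sketch emulates the Python 2 layout (cache file next to the source) with Python 3's py_compile; the helper name unlisted_after_compile is hypothetical.

```python
import fnmatch
import py_compile
import tempfile
from pathlib import Path

FILES_GLOB = "country-name.??"  # the pattern from the %files section


def unlisted_after_compile(directory: Path) -> list:
    """Compile every *.py in `directory` and return the names of files
    that the %files glob does not cover afterwards."""
    for source in list(directory.glob("*.py")):
        # Assumption: emulate Python 2's layout by writing <name>.pyc
        # beside the source instead of into __pycache__/.
        py_compile.compile(str(source), cfile=str(source) + "c", doraise=True)
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if not fnmatch.fnmatch(entry.name, FILES_GLOB)
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for suffix, country in [("pl", "Poland"), ("py", "Paraguay"),
                            ("sh", "Saint Helena"), ("rs", "Serbia")]:
        (root / f"country-name.{suffix}").write_text(country + "\n")
    print(unlisted_after_compile(root))  # ['country-name.pyc']
```

Note that the text "Paraguay" happens to be a valid Python expression, so compilation itself succeeds; the problem in this reading is the extra file, not a syntax error.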


How we are changing it

For the time being, we keep the old behavior working, so the hundreds of packages that implicitly rely on it do not break all at once. However, we will not (automatically) apply it to Python 3, so it will be phased out as packages switch to Python 3.

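An opt-out mechanism for automatic compilation of files outside /usr/lib(64)?/pythonX.Y/ will be provided (such as %?disable_automagic_pybytecompile). In terms of code, this will disable the final part of the brp-python-bytecompile script (https://github.com/rpm-software-management/rpm/blob/ab3fab29de51c7e68c9911d3b7809109da92fa6d/scripts/brp-python-bytecompile#L82).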

Guidelines will be adjusted to say the following:

  • If you have *.py files outside /usr/lib(64)?/pythonX.Y/, you MUST disable their automatic compilation and compile them explicitly with the %py_byte_compile macro.

Example for a package that has both Python versions:

# Turn off the brp-python-bytecompile automagic
%?disable_automagic_pybytecompile

# Buildrequire both python2 and python3
BuildRequires: python2-devel python3-devel

%install
# Installs a python2 private module into %{buildroot}%{_datadir}/mypackage/foo
# and installs a python3 private module into %{buildroot}%{_datadir}/mypackage/bar
make install DESTDIR=%{buildroot}

# Manually invoke the python byte compile macro for each path that needs byte
# compilation.
%py_byte_compile %{__python2} %{buildroot}%{_datadir}/mypackage/foo
%py_byte_compile %{__python3} %{buildroot}%{_datadir}/mypackage/bar

Note that unlike the current example in the guidelines linked above, this does not disable the compilation of files in /usr/lib(64)?/pythonX.Y/.
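
Conceptually, explicit compilation means running a chosen interpreter's own compiler over a chosen path. The sketch below does that with the standard compileall module. It is an illustration of the idea only: the helper name byte_compile is hypothetical, and the actual definition of %py_byte_compile may differ.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def byte_compile(interpreter: str, path: Path) -> None:
    """Byte-compile everything under `path` with the given interpreter."""
    # `-m compileall -q` recurses into `path` without listing each file;
    # check=True turns a compilation failure into an exception.
    subprocess.run([interpreter, "-m", "compileall", "-q", str(path)], check=True)


with tempfile.TemporaryDirectory() as tmp:
    private_dir = Path(tmp) / "mypackage" / "bar"
    private_dir.mkdir(parents=True)
    (private_dir / "helper.py").write_text("VALUE = 1\n")
    # sys.executable stands in for %{__python3} in this sketch.
    byte_compile(sys.executable, private_dir)
    print(sorted(p.name for p in private_dir.rglob("*.pyc")))
```

Because the interpreter is an explicit argument, the same directory tree can be compiled for several interpreters, which the automatic mechanism cannot do.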

Example for Python 3 only:

# Turn off the brp-python-bytecompile automagic
%?disable_automagic_pybytecompile

BuildRequires: python3-devel

%install
# Installs a python3 private module into %{buildroot}%{_datadir}/mypackage/bar
make install DESTDIR=%{buildroot}

# Manually invoke the python byte compile macro for each path that needs byte
# compilation.
%py_byte_compile %{__python3} %{buildroot}%{_datadir}/mypackage/bar

The Python 2 only example is analogous.

The notion of redefining %{__python} will be removed from the guidelines.

Currently, %py_byte_compile lives in python3-devel. We'll move it to some generic package that all Python devel packages require (such as python-rpm-macros).

Similarly, we'll also provide %?enable_automagic_pybytecompile for packagers to explicitly say they rely on the current behavior. Later (i.e. not in Fedora 29, but approximately when %{__python} stops being python2), we'll make the old behavior opt-in (or disable it entirely if no package uses %?enable_automagic_pybytecompile).

All brp-python-bytecompile changes will be shared and discussed in upstream RPM.

We will provide pull requests for the ~50 packages that redefine __python to %{__python3} (the currently recommended way to enable automatic byte-compilation on Python 3).

Benefit to Fedora

Specfiles will be more explicit about Python byte compilation. This will ease the change once we decide /usr/bin/python is no longer python2. The new guidelines will be less error-prone.

Note that we'd prefer to switch to the new behavior right now, but we keep it opt-in so as not to break the ~500 packages that rely on the old behavior.


Scope

  • Proposal owners: Make it work technically, propose the new guidelines, file pull requests for python3 modules that follow the current guidelines.
  • Other developers: Maintainers of python3 packages that redefine %__python should merge provided pull requests. Others may opt in to the new behavior or explicitly stick with the old one (not a System Wide Change, they don't have to do anything).
  • Policies and guidelines: will be changed as described in the Detailed Description
  • Trademark approval: not needed

Upgrade/compatibility impact

None expected.

How To Test

More specific instructions based on the examples in description will be here once ready. In the meantime, feel free to test the examples in the description as you see fit.
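
Until those instructions exist, one way to check a built package by hand is to list the *.py files that have no cached bytecode. The following is a rough sketch under stated assumptions: the helper name missing_bytecode is hypothetical, and it only looks for the cache file of the interpreter running the script, so it should be run with the same Python the files were compiled for.

```python
import importlib.util
import os
import py_compile
import tempfile
from pathlib import Path


def missing_bytecode(root: Path) -> list:
    """Return *.py files under `root` with no cache file for the running Python."""
    missing = []
    for source in sorted(root.rglob("*.py")):
        # cache_from_source() gives the __pycache__ path for this interpreter.
        cache = importlib.util.cache_from_source(str(source))
        if not os.path.exists(cache):
            missing.append(source)
    return missing


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "compiled.py").write_text("A = 1\n")
    (root / "forgotten.py").write_text("B = 2\n")
    py_compile.compile(str(root / "compiled.py"), doraise=True)
    print([p.name for p in missing_bytecode(root)])  # ['forgotten.py']
```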

User Experience

The users of this change are packagers. The new behavior should make byte-compilation more obvious, explicit, and discoverable. Users of Fedora should not feel this (except if this change uncovers a packaging bug).

Contingency Plan

  • Contingency mechanism: we'll finish the change later (not a System Wide Change)
  • Contingency deadline: none (not a System Wide Change)
  • Blocks release? no (not a System Wide Change)
  • Blocks product? no

Documentation

The guidelines will be the documentation.

Release Notes

This change does not need release notes; it is not user-facing.