Automatic RPM dependencies on Python Extras
Summary
The Python RPM dependency generator (that generates python3.Xdist(foo) requirements) will be adapted to also generate requirements on Python extras (e.g. python3.Xdist(foo[bar])) whenever upstream metadata indicate such a dependency. An easy opt-out mechanism will exist. A supported way of adding metapackages that provide such Python extras (e.g. python3.Xdist(foo[bar])) will be introduced. Change owners will add the missing metapackages that would otherwise cause broken dependencies (in non-modular packages).
Owner
- Name: Tomáš Orsava
- Name: Miro Hrončok
- Email: <python-maint@redhat.com>
Current status
Detailed Description
The problem
Python extras are a way for a Python package (called "distribution" or "distribution package" upstream) to declare that extra dependencies are required for additional functionality.
For example, the Python package requests has several standard dependencies (e.g. urllib3). It also declares an extra named requests[security], which lists additional dependencies (e.g. pyOpenSSL) needed for that additional functionality. The Python package code handles a missing optional dependency gracefully: it won't crash, but it might instruct the user to install requests[security] through a warning or an actionable error message.
Python packages included in Fedora as RPMs automatically create a special Provides in the format python3.Xdist(foo) (and python3dist(foo)), where foo is the upstream Python package distribution name and X is the Python minor version. That way you can require any Python package without knowing under which name it was packaged in Fedora. These Provides are also used automatically by the Python dependency generator, which reads upstream Python metadata and creates dependencies on them.
However, Python extras are not yet handled by the Provides tags, which leads to imperfections and problems in declared dependencies.
Status quo
Currently in Fedora (before this change), no package provides python3.Xdist(foo[bar]) for the foo[bar] Python extra. As a direct result, no package can require it. In such cases, the automatic RPM Python dist dependency generator only generates an incomplete requirement on the base package (python3.Xdist(foo)).
Transitive extra dependencies often had to be hardcoded manually. That is, when foo requires bar[baz], package bar does not require the additional dependencies for the bar[baz] extra, so foo needs to hardcode those dependencies itself. For example: [1]. This leads to possibly missing, broken and/or outdated superfluous dependencies.
Extras metapackages
In this change proposal, we propose to solve the problem using metapackages. The following metapackage represents the setuptools_scm[toml] extra for the python3-setuptools_scm RPM package (python-setuptools_scm source package):
%package -n python3-setuptools_scm+toml
Summary: Metapackage for python3-setuptools_scm: toml extra
Requires: python3-setuptools_scm = %{?epoch:%{epoch}:}%{version}-%{release}

%description -n python3-setuptools_scm+toml
This is a metapackage bringing in toml extra requires for python3-setuptools_scm.
It contains no code, just makes sure the dependencies are installed.

%files -n python3-setuptools_scm+toml
%ghost %{python3_sitelib}/*.egg-info
Notice several things:
- The package has a hard dependency on python3-setuptools_scm = %{?epoch:%{epoch}:}%{version}-%{release}. While this could in theory be generated by the dependency generator, the change owners have decided not to do that, to allow some leeway for experimentation. However, the dependency will be created by the macro helper below. Technically, %{?_isa} should also be used for arched packages, but in practice we believe it can be omitted.
- The package contains no files except the %ghost metadata. This is needed for the dependency generator to have access to the upstream metadata of this package.
The updated RPM Python dist dependency generator parses the extras name from the subpackage name by splitting it on the + sign.
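The splitting step can be sketched in a few lines of Python. This is only an illustration of the idea, not the generator's actual code: the function name split_extras_subpackage is made up for this example, and splitting on the last + sign is an assumption.

```python
def split_extras_subpackage(rpm_name):
    """Split 'python3-foo+bar' into ('python3-foo', 'bar').

    Hypothetical helper: returns (rpm_name, None) when the name carries
    no usable '+extra' suffix.
    """
    # Assumption: the extra is whatever follows the last '+' in the name.
    base, sep, extra = rpm_name.rpartition("+")
    if not sep or not base or not extra:
        return rpm_name, None
    return base, extra


print(split_extras_subpackage("python3-setuptools_scm+toml"))
print(split_extras_subpackage("python3-setuptools_scm"))
```

A name without a + sign is treated as a regular (non-extras) package, which matches the observation that existing python3- package names do not normally contain that character.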
This naming scheme is not new; it is copied from Rust packaging. Five Python packages in Fedora already use the same scheme for similar metapackages representing Python extras. Normalized Python distribution package names (or extras names) don't naturally contain the + sign. (Neither do existing Fedora packages prefixed with python3-, except the five components already mentioned.)
The metapackage can have additional features if desired. For example:
- It can obsolete/provide other names (e.g. obsoleted extras packages)
- It can have manual strong or weak dependencies on other (possibly non-Python) packages
- It can contain files excluded from the "base" package (if such files only make sense with the extra and the base package does not fail without them)
The "base" package (in this case python3-setuptools_scm) can optionally Require/Recommend/Suggest a Python extras metapackage if the packager deems it useful.
The change for the RPM Python dist dependency generator is prepared in:
- https://github.com/torsava/rpm/pull/2 (PR for upstream RPM will follow after this change is discussed in Fedora)
- https://src.fedoraproject.org/rpms/python-rpm-generators/pull-request/19 (to be adapted based on feedback and merged in Fedora once the change is approved)
Macro helper
For the most common case, the change owners have prepared a macro helper in https://src.fedoraproject.org/rpms/python-rpm-macros/pull-request/59
To generate the example above, it should be used like this:
%{?python_extras_subpkg:%python_extras_subpkg -n python3-setuptools_scm -i %{python3_sitelib}/*.egg-info toml}
- The %{?python_extras_subpkg:...} way of using this macro ensures the spec file remains valid for older Fedora/EL releases, where this code will do nothing.
- The -n option specifies the name of the "base" package.
- The -i option specifies the %files %ghost path (glob) to the metadata directory (the .dist-info or .egg-info directory).
- One or more positional arguments specify the extras names; multiple metapackages are generated when multiple names are provided.
Other possible arguments:
- The -f option (conflicts with -i and -F) can specify the relative path to the filelist for this metapackage (which should contain the %files %ghost path (glob) to the metadata directory). This API is prepared for integration with pyproject-rpm-macros.
- The -F flag (conflicts with -i and -f) can be used to skip the %files section entirely (if the packager wants to construct it manually).
Note that this macro generates all the subpackage definition sections (%package including the Summary and Requires on the base package, %description and %files), and hence it cannot be extended with custom Provides/Obsoletes/Requires/etc.
This macro is designed to fit the most common uses. It doesn't currently cover all use cases. Packagers can, however, construct the subpackage manually if they need custom features not covered by %python_extras_subpkg. In the future, the API of the macro can be extended if there is demand.
See the linked pull request for example outputs.
Due to technical limitations, the macro helper never generates a requirement on the arched BASE_PACKAGE%{?_isa} = %{?epoch:%{epoch}:}%{version}-%{release}. It only adds Requires: BASE_PACKAGE = %{?epoch:%{epoch}:}%{version}-%{release}, because a macro cannot reliably detect whether the subpackage is arched or not. The change owners believe the resolver will do the right thing by default. If there are problems with this approach, an additional flag (such as -a) can be introduced to indicate an arched base package.
Why is there no automatic extras discovery?
RPM is not yet capable of creating dynamic subpackages based on the content in %{buildroot} or on the unpacked sources (%{_builddir}).
Hence, we require the packager to manually list which Python extras (if any) should be packaged as metapackages. Not all extras are useful for us anyway, as there are often extras representing the build/dev/doc/test dependencies of the project.
In the future (once/if RPM supports this), the generators can be extended with auto-discovery of Python extras (with filtering).
Automatic provides generator
To continue with our example, the python3-setuptools_scm+toml subpackage will Provide python3.Xdist(setuptools_scm[toml]) (and also python3dist(setuptools_scm[toml])).
An attempt to package a nonexistent extra (e.g. python3-setuptools_scm+nopenopenope) will result in a build failure with a human-readable error message.
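As a rough illustration of the behaviour described above, the following Python sketch builds the two Provides strings for an extra and refuses extras that upstream does not declare. The function name, its parameters and the error text are hypothetical, and name normalization is deliberately left out.

```python
def extras_provides(dist_name, extra, known_extras, python_version="3.9"):
    """Return the Provides for an extras metapackage (illustrative only).

    known_extras stands in for the extras declared in upstream metadata.
    """
    if extra not in known_extras:
        # The real generator fails the build; here we just raise an error.
        raise ValueError(
            f"{dist_name} does not declare the extra {extra!r}; "
            f"known extras: {sorted(known_extras)}"
        )
    major = python_version.split(".")[0]
    return [
        f"python{python_version}dist({dist_name}[{extra}])",
        f"python{major}dist({dist_name}[{extra}])",
    ]


print(extras_provides("setuptools_scm", "toml", {"toml"}))
```

Calling it with an undeclared extra such as nopenopenope raises ValueError instead of silently producing a Provides that nothing upstream backs.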
Automatic requires generator
If a Python package requires setuptools_scm[toml], the Fedora RPM package will require python3.Xdist(setuptools_scm[toml]) and also python3.Xdist(setuptools_scm). In theory, the second requirement is redundant, but in practice it makes it easier (and less error-prone) to query package dependencies in Fedora (e.g. using dnf repoquery).
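A minimal Python sketch of this rule follows. It is not the generator's implementation: the function name and the simplified regular expression are assumptions, and version constraints and environment markers are ignored entirely.

```python
import re

# Simplified pattern: a distribution name optionally followed by [extras].
_REQUIREMENT = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[([^\]]*)\])?")


def extras_requires(requirement, python_version="3.9"):
    """Map an upstream requirement string to RPM-style requirements."""
    match = _REQUIREMENT.match(requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    name, extras = match.group(1), match.group(2)
    # The base requirement is always emitted, even though it is redundant
    # when an extra is also required.
    deps = [f"python{python_version}dist({name})"]
    for extra in (extras or "").split(","):
        extra = extra.strip()
        if extra:
            deps.append(f"python{python_version}dist({name}[{extra}])")
    return deps


print(extras_requires("setuptools_scm[toml]"))
```

Emitting the base requirement first keeps a plain query for python3.9dist(setuptools_scm) useful even for packages that only asked for the extra.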
The packaged extras will also Require the additional dependencies listed in their Python metadata. In the case of python3-setuptools_scm+toml, it will require python3.Xdist(toml) (because on the Python level, setuptools_scm[toml] requires toml).
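To show where such a dependency comes from, here is a small Python sketch that picks the requirements belonging to one extra out of upstream metadata. The helper name and the sample metadata (including the placeholder version) are invented for this example, and the marker handling is intentionally naive: it only recognizes a plain extra == "name" comparison.

```python
import re
from email import message_from_string

# Naive marker matcher: only understands  extra == "name"  (or single quotes).
_EXTRA_MARKER = re.compile(r"""extra\s*==\s*["']([^"']+)["']""")


def requires_for_extra(metadata_text, extra):
    """Return the Requires-Dist entries that apply only to the given extra."""
    message = message_from_string(metadata_text)
    selected = []
    for value in message.get_all("Requires-Dist") or []:
        requirement, _, marker = value.partition(";")
        match = _EXTRA_MARKER.search(marker)
        if match and match.group(1) == extra:
            selected.append(requirement.strip())
    return selected


# Made-up metadata for illustration; not copied from any real release.
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: setuptools_scm
Version: 0.0.0
Provides-Extra: toml
Requires-Dist: setuptools
Requires-Dist: toml; extra == 'toml'
"""

print(requires_for_extra(SAMPLE_METADATA, "toml"))
```

Requirements without an extra marker stay with the base package, so only toml ends up on the metapackage in this sample.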
Packagers can opt out of automatically generated dependencies on Python extras by defining the %_python_no_extras_requires macro to any value (usually 1) in the spec file. This should be only a temporary measure until the missing extra is packaged. If the upstream dependency information is not accurate, please work with upstream to fix it.
Coordinated effort to avoid breakage
The change owners have collected data about non-modular packages in Copr. Note that ~270 packages failed to build for unrelated reasons and hence we miss data for them. However, ~3300 packages built successfully.
The following extras metapackages will be added to avoid broken dependencies:
autobahn[twisted]
cachecontrol[filecache]
cairocffi[xcb]
cli-helpers[styles]
docker[ssh]
fonttools[ufo]
fonttools[unicode]
ipython[notebook]
lunr[languages]
oauthlib[signedtoken]
pyjwt[crypto]
raven[flask]
requests[security]
requests[socks]
tabulate[widechars]
twisted[tls]
vistir[spinner]
The following components will be modified:
python-autobahn
python-CacheControl
python-cairocffi
python-cli-helpers
python-docker
fonttools
ipython
python-lunr
python-oauthlib
python-jwt
python-raven
python-requests
python-tabulate
python-twisted
python-vistir
- https://src.fedoraproject.org/rpms/python-autobahn/pull-request/4
- https://src.fedoraproject.org/rpms/python-CacheControl/pull-request/7
- https://src.fedoraproject.org/rpms/python-cairocffi/pull-request/2
- https://src.fedoraproject.org/rpms/python-cli-helpers/pull-request/1
- https://src.fedoraproject.org/rpms/python-docker/pull-request/27
- https://src.fedoraproject.org/rpms/fonttools/pull-request/5
- https://src.fedoraproject.org/rpms/ipython/pull-request/14
- https://src.fedoraproject.org/rpms/python-lunr/pull-request/2
- https://src.fedoraproject.org/rpms/python-oauthlib/pull-request/2
- https://src.fedoraproject.org/rpms/python-jwt/pull-request/4
- https://src.fedoraproject.org/rpms/python-raven/pull-request/2
- https://src.fedoraproject.org/rpms/python-requests/pull-request/9
- https://src.fedoraproject.org/rpms/python-tabulate/pull-request/1
- https://src.fedoraproject.org/rpms/python-twisted/pull-request/14
- https://src.fedoraproject.org/rpms/python-vistir/pull-request/1
When we added the metapackages for these extras in our testing Copr, no new broken requires on Python extras were generated. In other words, these new extras subpackages don't require adding any more extras subpackages. No extras are required by the remaining Python 2 packages in Fedora.
Once the change in the dependency generator is deployed in rawhide, the change owners will monitor all newly added requires on missing extras and will add new metapackages as needed.
Five source packages in Fedora already have Python extras meta-subpackages with the proposed naming pattern, but they don't list any %files. They will be non-intrusively adapted via pull requests by adding the %ghost file entry to the metapackage(s). Maintainers can then decide whether to opt for a simpler rawhide-only spec file with %python_extras_subpkg or to maintain the current compatibility. This concerns the following 18 subpackages:
python3-dask+{array,bag,dataframe,delayed}
python3-django-storages+{azure,boto,boto3,dropbox,libcloud,sftp}
python3-dns-lexicon+{easyname,gratisdns,henet,hetzner,plesk,route53}
python3-drf-yasg+validation
python3-prometheus_client+twisted
- https://src.fedoraproject.org/rpms/python-dask/pull-request/1
- https://src.fedoraproject.org/rpms/python-django-storages/pull-request/1
- https://src.fedoraproject.org/rpms/python-dns-lexicon/pull-request/1
- https://src.fedoraproject.org/rpms/python-drf-yasg/pull-request/1
- https://src.fedoraproject.org/rpms/python-prometheus_client/pull-request/3
Modular packages
The change owners are only capable of monitoring and adapting non-modular packages. Due to long-standing issues, we are unable to inspect, query (or do a targeted rebuild of) modular content:
- https://pagure.io/modularity/issue/160
- https://pagure.io/modularity/issue/163
- https://pagure.io/modularity/issue/165
If there are people available to help with this problem, the change owners will gladly accept their help. We are not excluding modular content because we want to, but because we don't know how to work with it at scale.
How to add Python extras subpackage to my package?
In this section, we describe, step by step, how to add a Python extras subpackage to your package. Imagine you maintain python-requests and a maintainer of a dependent package contacts you: "I would like you to add a subpackage for requests[security], because my package requires it."
- Locate the %files section for the python3-requests package in python-requests.spec.
- Find the entry for the .egg-info or .dist-info metadata directory. If the entry is generalized with globs like %{python3_sitelib}/*, please make the %files section more explicit while at it. Copy the line with the metadata directory. In this guide we assume it is %{python3_sitelib}/*.egg-info.
- Locate the %description of the python3-requests package.
- After the description, add the following on a separate line:
%{?python_extras_subpkg:%python_extras_subpkg -n python3-requests -i %{python3_sitelib}/*.egg-info security}
- Build the package (e.g. in local mock).
- Verify the python3-requests+security package is built and provides python3dist(requests[security]).
- Check that the new extras package doesn't have dependencies on packages missing from Fedora (extras or "basic") and proceed with adding those if needed.
- Ship the change in Fedora 33+. It should do nothing in Fedora 31/32 or current EPELs.
Packaging guidelines
The change owners will describe this concept in the Python packaging guidelines and will propose the following rules for the Fedora Packaging Committee to approve:
- Packagers MAY add Python extras metapackages as needed.
- The Python extras metapackages MUST require the base package (exact NEVR).
- Packagers MAY add strong or weak dependencies on the extras metapackages from the base package as they see fit.
- Packagers SHOULD NOT add Python extras metapackages with dependencies only useful for maintaining the package (usually extras called dev/test/doc/build/...).
- Optional: Packagers MAY package tests separately into the [test] or [testing] extras subpackage.
- If a Fedora package requires a Python extra of a different package, the extras metapackage MUST be added to that package to avoid broken dependencies.
- Packagers MAY temporarily disable the automatic requires on extras subpackages (by defining %_python_no_extras_requires) until the missing metapackage is introduced, but they SHOULD notify the maintainer of the package they depend on about the situation.
- If upstream drops an extra, even though it is discouraged by upstream documentation (see final paragraph), the metapackage SHOULD be Obsoleted from the base package or, if there is continuity, from another extras metapackage.
- If the upstream Python package name contains +, it MUST be replaced with - in package names (in accordance with the upstream Python package names normalization).
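The last rule can be illustrated with a short Python sketch. It assumes a "safe name" style of normalization in which every run of characters other than ASCII letters, digits and dots collapses into a single hyphen; that rule and the function name are assumptions made for this example, not something this proposal specifies.

```python
import re


def safe_package_component(upstream_name):
    """Replace characters such as '+' with '-' (assumed normalization rule)."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", upstream_name)


# 'foo+bar' is a hypothetical upstream name used only for illustration.
print(safe_package_component("foo+bar"))
```

Keeping + out of the base name means the + sign in an RPM name can only ever mark the start of an extras suffix.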
Feedback
This has been briefly discussed in general terms upstream. People tend to agree that some solution is needed. The concrete proposal contained in this Fedora Change is based on the discussion, but has received no feedback yet.
After a week, there was not a single response on the Fedora devel mailing list.
Benefit to Fedora
- Packages will have more accurate automatic dependencies, and the hard-to-maintain and error-prone manual transitive (and other) dependencies can be dropped.
- There will be fewer missing and redundant dependencies.
- Python packagers will have fewer manual dependencies to worry about and fewer problems to work around.
- The handling of Python extras will be standardized.
- Overall, the Python ecosystem in Fedora will be closer to upstream.
Scope
- Proposal owners:
- Polish and merge the code changes for python-rpm-generators and python-rpm-macros linked above.
- Add the 17 missing extras metapackages listed in this change to avoid broken dependencies (using pull requests or provenpackager powers if need be).
- Adapt the 5 existing Python extras subpackages listed in this change to work with the dependency generator (using pull requests, or provenpackager powers if need be).
- Monitor new dependencies on Python extras subpackages, add extras subpackages where needed (using pull requests, or provenpackager powers if need be).
- Propose the updated Python packaging guidelines to FPC for approval.
- Provide help and guidance for packagers.
- Optional: Prepare pyproject-rpm-macros integration of this change.
- Other developers:
- No immediate action necessary.
- They can opt in for more metapackages with extras.
- They can review and merge pull requests.
- They should follow the updated Python packaging guidelines if the changes are approved by FPC.
- Release engineering: No releng impact anticipated. The new dependencies will be primarily generated by the mass rebuild, but if the mass rebuild is missed, the package maintainers or change owners can rebuild the packages that will gain the new automatic Requires on Python extras.
- Policies and guidelines: Yes, see detailed description.
- Trademark approval: Not needed for this Change.
Upgrade/compatibility impact
No impact anticipated.
How To Test
Check that there are packages that Require python3.9dist(basename[extrasname]). You can use the following repoquery:
dnf repoquery --repo=rawhide --whatrequires 'python3.9dist(*\[*\])'
Check that there are Python extras metapackages with the correct Provides, for example by installing the packages returned by the above query, or manually via queries like:
dnf repoquery --repo=rawhide --whatprovides 'python3.9dist(requests\[security\])'
To query all existing Python extras metapackages, you can use:
dnf repoquery --repo=rawhide --provides -a | grep -E 'python(3\.9|2\.7)dist\(\S+\[\S+\]\)'
And lastly, to query all required Python extras metapackages:
dnf repoquery --repo=rawhide --requires -a | grep -E 'python(3\.9|2\.7)dist\(\S+\[\S+\]\)'
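The grep pattern used in the last two commands can also be applied from Python, for example when post-processing saved repoquery output. This is just a convenience sketch: the function name and the sample lines are made up, and only the regular expression is taken from the commands above.

```python
import re

# Same pattern as in the grep -E invocations above.
EXTRAS_DEP = re.compile(r"python(3\.9|2\.7)dist\(\S+\[\S+\]\)")


def extras_deps(lines):
    """Keep only the lines that mention a versioned Python extras dependency."""
    return [line.strip() for line in lines if EXTRAS_DEP.search(line)]


# Hypothetical repoquery output lines, for illustration only.
sample = [
    "python3.9dist(requests[security])",
    "python3.9dist(requests)",
    "python3dist(requests[security])",
]
print(extras_deps(sample))
```

As with the grep version, the unversioned python3dist(...) form is not matched; only the python3.9dist and python2.7dist forms are.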
User Experience
When installing Python RPM packages, the dependencies are more likely to fulfill user expectations, as they will more closely adhere to the behavior of pip (the Python package installer).
Dependencies
Nothing.
Contingency Plan
- Contingency mechanism: (What to do? Who will do it?)
- Soft: The change owners will disable the requirements generator by default and rebuild (or untag if FTBFS) packages with broken dependencies caused by the change.
- Hard: The change owners will revert everything and rebuild (or untag if FTBFS) packages with new requirements/provides caused by the change.
- Contingency deadline: Beta freeze
- Blocks release? No
- Blocks product? No
Documentation
The packaging guidelines will be the documentation if approved. If not, this Fedora Change shall serve as the documentation.