From Fedora Project Wiki

Revision as of 22:28, 15 December 2009 by Toshio (talk | contribs) (Initial work on updating Layout for problemsidentified during fudcon)

Packaging Python modules for Python 3

I hope to add a parallel-installable Python 3 stack to Fedora 13.

See the feature page: https://fedoraproject.org/wiki/Features/Python3F13 and also this thread: https://www.redhat.com/archives/fedora-devel-list/2009-October/msg00054.html

This requires us to come up with a sane way to package Python 3 modules, which in turn requires us to generalize our python packaging rules to support more than one python runtime.

The existing Python packaging guidelines are here: Packaging/Python

Multiple Python Runtimes

There will be multiple python runtimes, one for each supported major release.

Each runtime corresponds to a binary of the form /usr/bin/python$MAJOR.$MINOR

One of these python runtimes is the "system runtime". It can be identified by the destination of the symlink /usr/bin/python. Currently this is /usr/bin/python2.6.

Note: Currently /usr/bin/python is actually a duplicate copy of the ELF file rather than a symlink; we see this as a bug.

The output of "rpm -q --provides" of each runtime rpm MUST contain a line of the form:

 Provides: python(abi) = $MAJOR.$MINOR

For example, a python-3.1 runtime rpm should have this output:

 Provides: python(abi) = 3.1

Similarly, python modules using these runtimes should have a corresponding "Requires" line.

Note: The script /usr/lib/rpm/pythondeps.sh automatically emits "Requires" lines for files below /usr/lib[^/]*/python${PYVER} for the main python 2 stack, but it will need reworking for python 3. I've rewritten the script, but the rewrite isn't yet in our F13 rpm-build rpm. This is being tracked as [bug 532118].

Note: We supply the "Provides" manually in the specfile for each runtime. In theory /usr/lib/rpm/pythondeps.sh would also generate "Provides" lines for the runtime automatically, but in practice rpmbuild only invokes it for files in the rpm payload that the file utility identifies as "python". The runtime is an ELF binary, not a python script, so it is never passed to the script. It's simplest to supply the Provides line manually rather than change these innards of rpmbuild. See [bug 532118].
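
As a rough illustration of the naming scheme above, the following Python sketch derives the runtime binary path and the python(abi) capability string from a major and minor version. The helper names runtime_binary and abi_provides are made up for this example; they are not part of rpm or of pythondeps.sh.

```python
import sys


def runtime_binary(major, minor):
    """Path of the interpreter binary for a runtime (hypothetical helper)."""
    return "/usr/bin/python%d.%d" % (major, minor)


def abi_provides(major, minor):
    """Capability string a runtime rpm would provide (hypothetical helper)."""
    return "python(abi) = %d.%d" % (major, minor)


if __name__ == "__main__":
    # Report the values for whichever interpreter runs this script.
    major, minor = sys.version_info[:2]
    print(runtime_binary(major, minor))
    print("Provides: " + abi_provides(major, minor))
```

A module built for that runtime would carry the same string in a "Requires" line.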

Byte Compiling

When byte compiling a .py file, python embeds a magic number in the byte compiled file that corresponds to the runtime. Files in %{python_sitelib} and %{python_sitearch} must correspond to the runtime for which they were built. For instance, a pure python module compiled for the 3.1 runtime needs to be below %{_usr}/lib/python3.1/site-packages.
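
The magic number can be inspected directly. The sketch below byte compiles a throwaway file and compares the first four bytes of the result with the magic number of the interpreter running it. It relies on importlib.util.MAGIC_NUMBER and py_compile from current Python 3 releases, which are newer than the runtimes discussed on this page; the function name pyc_magic is hypothetical.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_magic(source_path, pyc_path):
    """Byte compile source_path and return the 4-byte magic number written."""
    py_compile.compile(source_path, cfile=pyc_path, doraise=True)
    with open(pyc_path, "rb") as handle:
        return handle.read(4)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "example.py")
        with open(source, "w") as handle:
            handle.write("x = 1\n")
        magic = pyc_magic(source, os.path.join(workdir, "example.pyc"))
        # Compare the file header with the running interpreter's magic number.
        print(magic == importlib.util.MAGIC_NUMBER)
```

A different runtime writes a different magic number, which is why byte compiled files cannot be shared between runtimes.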

Normally, this is done for you by the brp-python-bytecompile script. This script runs after the %install section of the spec file has been processed and byte-compiles any .py files that it finds. (This recompilation puts the proper filesystem paths into the modules; otherwise tracebacks would include the %{buildroot} in them.) The script determines which interpreter to byte compile each module with by following these steps:

  1. What directory is the module installed in? If it's /usr/lib/pythonX.Y, then pythonX.Y is used to byte compile the module. If pythonX.Y is not installed, an error is returned and the rpm build exits with an error, so remember to BuildRequire the proper python package.
  2. Otherwise, the interpreter defined in %{__python} is used to compile the modules. This defaults to the latest python2 version on Fedora. If you need to compile the module for python3, set it to /usr/bin/python3 instead, like this:
    %global __python %{__python3}

    This step is useful when you have an application that installs a private module into its own directory, for instance when the foobar application installs a module used only by its command line application in %{_datadir}/foobar. Since these files are not in a path that contains the python version (like /usr/lib/python3.1), you have to set %{__python} manually to tell brp-python-bytecompile which python interpreter to byte compile for.
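
The two steps above can be sketched as a small Python function. This is only an illustration of the decision logic, not the brp-python-bytecompile script itself; the function name pick_interpreter, the regular expression, and the assumption that the path has already had the build root stripped are all made up for this example.

```python
import re

# Matches /usr/lib/pythonX.Y/ and /usr/lib64/pythonX.Y/ style directories.
_VERSIONED_DIR = re.compile(r"^/usr/lib[^/]*/python(\d+\.\d+)/")


def pick_interpreter(installed_path, default_python="/usr/bin/python"):
    """Return the interpreter that should byte compile installed_path.

    Step 1: a versioned library directory decides the interpreter.
    Step 2: anything else falls back to default_python, which plays the
    role of the __python macro here.
    """
    match = _VERSIONED_DIR.match(installed_path)
    if match:
        return "/usr/bin/python" + match.group(1)
    return default_python


if __name__ == "__main__":
    print(pick_interpreter("/usr/lib/python3.1/site-packages/foo.py"))
    print(pick_interpreter("/usr/share/foobar/helper.py", "/usr/bin/python3"))
```

The second call shows the private module case: nothing in the path names a python version, so the result depends entirely on the default passed in.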

These settings are enough to properly byte compile any package for a single python interpreter, or a package that only builds python modules (in %{python_sitelib} or %{python_sitearch}). However, if the application you're packaging needs to build with both python2 and python3 and installs into a private module directory (perhaps because it provides one utility written in python2 and a second utility written in python3), then you need to byte compile manually. Here's a sample spec file snippet that shows what to do:

# Turn off the brp-python-bytecompile script
%global brp_python_bytecompile %{nil}
# Buildrequire both python2 and python3
BuildRequires: python-devel python3-devel
[...]

%install
# Installs a python2 private module into %{buildroot}%{_datadir}/mypackage/foo
# Installs a python3 private module into %{buildroot}%{_datadir}/mypackage/bar
make install DESTDIR=%{buildroot}

# Manually invoke the python byte compile macro for each path that needs byte
# compilation.
%{py_byte_compile} /usr/bin/python2 %{buildroot}%{_datadir}/mypackage/foo
%{py_byte_compile} /usr/bin/python3 %{buildroot}%{_datadir}/mypackage/bar
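
What a manual byte compile step amounts to can be sketched in Python. This is not the definition of the py_byte_compile macro; it only assumes that the chosen interpreter ships the standard compileall module, whose -d option records the given directory in the compiled files in place of the directory they were compiled in. The function name byte_compile and the example paths are hypothetical.

```python
import subprocess
import sys


def byte_compile(interpreter, directory, installed_path):
    """Byte compile every .py file under directory with interpreter.

    installed_path is the directory the files will live in once the rpm
    is installed; it is recorded in the compiled files so that tracebacks
    do not mention the build root.
    """
    subprocess.run(
        [interpreter, "-m", "compileall", "-q", "-d", installed_path, directory],
        check=True,  # fail loudly if any file does not compile
    )


if __name__ == "__main__":
    # Usage: script.py <directory in the build root> <final installed path>
    byte_compile(sys.executable, sys.argv[1], sys.argv[2])
```

Calling it once per private directory, each time with a different interpreter, mirrors the two macro invocations in the snippet.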

Python modules for non-standard runtimes

Naming

Current python package naming guidelines are here: Packaging/NamingGuidelines#Addon_Packages_.28python_modules.29

  • an rpm with a python- prefix means a python 2 rpm, of the "default" python 2 minor version (for Fedora this will be the most recent stable upstream minor release, for EPEL it will be the minor release of 2 that came with the distro, so 2.4 for EPEL5)
  • an rpm with a python3- prefix means a python 3 rpm, of the "default" python 3 minor version (for Fedora this will be the most recent stable upstream release)

What about packages without a "python-" prefix?

See https://www.redhat.com/archives/fedora-python-devel-list/2009-October/msg00015.html for a list of F-12 packages emitted by

 repoquery -f '/usr/lib/python2.6/site-packages/*'

divided into 4 categories:

  • Packages starting with 'python-'
  • Packages starting with 'Py' or 'py' (but not 'python-')
  • Packages ending with '-python'
  • None of the above

Proposal: If upstream has a naming convention for python2 vs python3, use it. Otherwise, use a python3- prefix, followed by the name of the module that you type to import it in a script, even if this is inconsistent with the python 2 name of the rpm.

Rationale: this highlights the "threeness" of the packages, making it very clear which stack they are for. Python 3 and Python 2 are different stacks, so any inconsistencies aren't a serious problem.

 Fedora python 2 package   Upstream name   Proposed python 3 package name
 python-lxml               lxml            python3-lxml
 pygtk2                                    python3-gtk
 gstreamer-python                          python3-gstreamer
 gnome-python2             gnome-python    python3-gnome
 rpm-python                                python3-rpm
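
The naming proposal can be written down as a small rule. The sketch below is only an illustration; the function name python3_package_name and its parameters are made up for this example.

```python
def python3_package_name(import_name, upstream_python3_name=None):
    """Name for a python 3 rpm under the proposal (hypothetical helper).

    If upstream already has its own name for the python 3 version, that
    name wins. Otherwise the name is "python3-" followed by the name used
    to import the module in a script.
    """
    if upstream_python3_name:
        return upstream_python3_name
    return "python3-" + import_name


if __name__ == "__main__":
    for module in ("lxml", "gtk", "rpm"):
        print(python3_package_name(module))
```

For the modules imported as lxml, gtk and rpm this gives python3-lxml, python3-gtk and python3-rpm, matching the corresponding rows of the table.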

Common SRPM vs split SRPMs

Note (FPC Guidance): If I were writing that section, I would probably say that if the python 2 and 3 bindings exist only in the same tarball, they should be generated from a single SRPM. If the python 3 bindings are part of a separate upstream source package, they should be in their own SRPM.

There are two approaches I'm experimenting with to packaging modules for python 3:

  • create a separate specfile/srpm for the python 3 version
  • extend an existing specfile so that it emits a python3- subpackage as part of the build.

I've experimented with both approaches for python3-setuptools.

Split/separate SRPMs: a src.rpm for python- and another for python3-

Given package python-foo in packaging CVS, there would be a separate python3-foo for the python 3 version. There would be no expectation that the two would need to upgrade in lock-step. (The two SRPMS could have different maintainers within Fedora: the packager of a python 2 module might not yet have any interest in python 3)

Example: python3-setuptools https://bugzilla.redhat.com/show_bug.cgi?id=531648 (simple adaptation of python-setuptools, apparently without needing an invocation of 2to3)

Dave Malcolm has written a tool which generates a python3-foo.spec from a python-foo.spec; see http://dmalcolm.fedorapeople.org/python3-packaging/rpm2to3.py

Advantages:

  • if the python-foo maintainer doesn't care about python 3, they don't need to
  • the two specfiles can evolve separately; if 2 and 3 need to have different versions, they can

Disadvantages:

  • the two specfiles have to be maintained separately
  • when upstream releases e.g. security fixes, they have to be tracked in two places

Single shared SRPM emitting both python- and python3- subpackages

Method

  • Use the -n syntax to emit a python3-foo subpackage from a python-foo build.
  • Towards the end of the %prep phase, copy the code to a parallel subdirectory, and invoke 2to3 --write upon it

Examples:

Advantages:

  • single src.rpm and build; avoid having to update multiple packages when things change.

Disadvantages:

  • The Fedora maintainer needs to care about python 3. By adding python 3 to the mix, we're giving them extra work.
  • 2 and 3 versions are in lockstep. Requires upstream to care about Python 3 as well (or about Python 2, for that matter)
  • Bugzilla components are set up by source RPM, so they would have a single shared bugzilla component. This could be confusing to end-users, as it would be more difficult to figure out e.g. that a bug with python3-foo needs to be filed against python-foo. There's a similar problem with checking out package sources from CVS, though this is less serious as it doesn't affect end-users so much.

When should we have two split SRPMs vs one shared, and vice versa?

The easy case is when upstream releases separate tarballs for the python 2 and python 3 versions of code. In that case, it makes sense to follow upstream and have separate specfiles, separate source rpms, etc.

The more difficult case is when the python module is emitted as part of the build of a larger module.

One case is for an extension module giving python bindings for a library built within the larger rpm. Some examples:

I believe the ideal here is to patch the code so that it will build against both python versions, then take a copy of the sources during the %prep phase, and configure one subdirectory to build against python 2, another to build against python 3.

Packaging

In python3-3.1.1-9 onwards the python3-devel subpackage will contain a /etc/rpm/macros.python3 file which will contain definitions of:

 __python3
 python3_sitelib
 python3_sitearch

which should thus make it unnecessary to define these in every module specfile (see https://bugzilla.redhat.com/show_bug.cgi?id=526126#c43 ).
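
To get a feel for what these macros stand for, the sketch below asks the running interpreter for the equivalent values. It assumes that python3_sitelib corresponds to the interpreter's pure python site-packages directory and python3_sitearch to its platform-specific one; the actual definitions in macros.python3 are not reproduced on this page and may be computed differently. The function name python3_macro_values is hypothetical.

```python
import sys
import sysconfig


def python3_macro_values():
    """Values analogous to the three macros, for the running interpreter."""
    return {
        "__python3": sys.executable,
        # Pure python modules (assumed equivalent of python3_sitelib).
        "python3_sitelib": sysconfig.get_path("purelib"),
        # Compiled extension modules (assumed equivalent of python3_sitearch).
        "python3_sitearch": sysconfig.get_path("platlib"),
    }


if __name__ == "__main__":
    for name, value in python3_macro_values().items():
        print("%s = %s" % (name, value))
```

The printed paths depend on the distribution and on whether the interpreter runs inside a virtual environment, so they will not necessarily match what a Fedora build sees.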

Some possible best-practices for keeping python 2 and python 3 in sync:

  • when packaging a module for python 3, you should approach the python 2 package owners.
  • if the python 2 and python 3 modules have separate maintainers, each maintainer should request watchbugzilla and watchcommit on the other's package
  • complete any python 2 Merge Review before doing a python 3 version
  • add a link to the python 2 Merge Review/Package Review to the python 3 Package Review
  • remember to test the built RPMs and verify that they actually work!