From Fedora Project Wiki

Revision as of 08:04, 25 October 2023 by Mschorm (talk | contribs)

This document describes a methodology for dropping support for 'N' architectures from 'K' packages that exist in Fedora.

Background

I hope this document will become universally applicable to the following scenario:

  • There are 'K' packages, already released in Fedora, built for several architectures
  • These 'K' packages provide 'L' names
  • Other packages depend on any of these 'L' names, provided by the 'K' packages and their 'M' sub-packages, at build time, at run time, or both
  • You want to stop building those 'K' packages on 'N' architectures, but you want to make sure that all packages depending on the 'K' packages are properly taken care of
  • 'K', 'L', 'M', 'N' are integers
  • There will be at least one architecture left after you're done, and it is NOT 'noarch'

My original case

This is my original case, on which I'll demonstrate the whole process.

I maintain the packages 'mariadb' and 'community-mysql', for which I wanted to drop support for the i686 architecture.
This was done as a part of this change proposal: "F40 MariaDB MySQL repackaging"
The reason for dropping the architecture was to save maintainer time, energy and computational resources.
It's based on this accepted change: "Encourage i686 Leaf Removal"

These packages provide their own names, additional names, libraries, pkg-config files, and so on.
A lot of other packages rely on them, both at build time and at run time, as they are huge, well-established databases.

Part of the stack are database connectors to various languages, which I also maintain.
One of them, 'mariadb-connector-c', is special. It provides the client library for MariaDB and its development files.
And since MariaDB and MySQL are compatible on the client side, this library can serve both.

I know that a lot of very important packages rely on those databases, but I never kept track of how big that dependency tree actually is.

RPM symbols

An RPM package can provide a lot of symbols. I'll divide them into two categories: "names" and "other provides".
RPM "names" consist of names provided by the RPM, either implicitly (the package name) or explicitly (names you specify with Provides: in the SPEC file).
RPM "other provides" consist of things like libraries, pkg-config files, and more.

You can check what an RPM package file provides with rpm -qp --provides <PATH_TO_RPM> (the -p option tells rpm to query a package file rather than an installed package).
An example:

# rpm -qp --provides mariadb-embedded-10.5.22-1.fc40.x86_64.rpm
libmariadbd.so.19()(64bit)
mariadb-embedded = 3:10.5.22-1.fc40
mariadb-embedded(x86-64) = 3:10.5.22-1.fc40
mysql-embedded = 3:10.5.22-1.fc40
mysql-embedded(x86-64) = 3:10.5.22-1.fc40

Other packages can depend on any of the symbols provided by your package.
So the first step is to find out which symbols your packages provide.
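
The split into "names" and "other provides" can be approximated automatically. The following is only a minimal sketch, not the script referenced in this document: the function name split_provides and the list of capability prefixes are assumptions made for illustration, and the heuristic may need extending for other packages.

```python
import re

# Assumption: these prefixes mark "other provides"; extend the list as needed.
NAMESPACE_RE = re.compile(r"^(pkgconfig|cmake|perl|python3dist|config|bundled)\(")
# Shared-library sonames such as "libmariadbd.so.19()(64bit)".
SONAME_RE = re.compile(r"\.so(\.\d+)*\(")


def split_provides(lines):
    """Split lines of `rpm --provides` output into (names, other_provides)."""
    names, others = set(), set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Drop the version part, e.g. " = 3:10.5.22-1.fc40".
        symbol = re.split(r"\s+[<>=]+\s+", line, maxsplit=1)[0]
        if SONAME_RE.search(symbol) or NAMESPACE_RE.match(symbol):
            others.add(symbol)
        else:
            names.add(symbol)
    return sorted(names), sorted(others)
```

Fed with the example output above, this would put libmariadbd.so.19()(64bit) into "other provides" and the four mariadb-embedded / mysql-embedded entries into "names".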

We will always start with Fedora Rawhide, to avoid regressions in our work, and then - if needed - move to the older releases.

In Fedora, package repositories have a web UI called Pagure: https://src.fedoraproject.org/rpms/mariadb
On the front page of a given package you can see the production builds, both released and in testing (in BODHI).
Clicking on any such link takes you to the specific build in KOJI: https://koji.fedoraproject.org/koji/buildinfo?buildID=2295452
There you can see the list of artifacts (RPM packages, logs, ...). Clicking on the "info" link next to any package brings you to the details of that RPM package: https://koji.fedoraproject.org/koji/rpminfo?rpmID=35996855
There you can see the symbols the package provides and requires.

However, we want to gather these data in an automated way.

See the get_rpm_symbols.py script I've created: https://gitlab.com/.../get_rpm_symbols.py
It collects all the symbols that the selected RPM packages provide in any of their sub-packages.
Take a look at the get_rpm_symbols.py-RESULTS/STDOUT file, which contains a copy of what the script prints to standard output, and at the get_rpm_symbols.py-RESULTS/results_sorted file, which holds the results we will work with later.

Now we have a clean list of symbols.
For each symbol, we want to check whether any package requires it, separately for build time and for run time.
See get_dependency_tree.sh
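
To illustrate the idea of one query per (repository, symbol) pair, here is a hedged sketch. It is not the content of get_dependency_tree.sh. It assumes that dnf repoquery --whatrequires is a suitable query, and the repository IDs, the output directory and the file naming (<repo>-<symbol>) are illustrative assumptions.

```python
import subprocess
from itertools import product
from pathlib import Path


def repoquery_cmd(repo, symbol):
    """Build a dnf command listing packages in `repo` that require `symbol`."""
    return ["dnf", "repoquery", "--quiet", f"--repo={repo}", "--whatrequires", symbol]


def plan_queries(repos, symbols, outdir):
    """Return (command, output_path) for every repository * symbol combination."""
    plan = []
    for repo, symbol in product(repos, symbols):
        # Assumption: one result file per pair; "/" is not allowed in file names.
        out = Path(outdir) / f"{repo}-{symbol.replace('/', '_')}"
        plan.append((repoquery_cmd(repo, symbol), out))
    return plan


def run_plan(plan):
    """Run every planned query and store its standard output (needs dnf)."""
    for cmd, out in plan:
        out.parent.mkdir(parents=True, exist_ok=True)
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        out.write_text(result.stdout)
```

For build-time dependencies one would pass a source repository ID (for example rawhide-source, if that repository is configured on the machine), and for run-time dependencies a binary repository ID.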

Now we have directories with a file for every combination (repository * RPM symbol). Many of them are likely to be empty.
To isolate just the meaningful output, use the isolate_meaningful_results.sh script.
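
The filtering step can be pictured with the following sketch. It is not the actual isolate_meaningful_results.sh; it assumes that "meaningful" simply means "the file contains at least one non-blank line", and the function names are hypothetical.

```python
from pathlib import Path


def isolate_meaningful(results_dir):
    """Map file name -> sorted unique non-blank lines, skipping empty files."""
    isolated = {}
    for path in sorted(Path(results_dir).iterdir()):
        if not path.is_file():
            continue
        lines = [ln.strip() for ln in path.read_text().splitlines() if ln.strip()]
        if lines:
            isolated[path.name] = sorted(set(lines))
    return isolated


def format_report(isolated):
    """Render the result as 'FILE: <name> isolated' blocks separated by blank lines."""
    blocks = []
    for name, lines in isolated.items():
        blocks.append("\n".join([f"FILE: {name} isolated", *lines]))
    return "\n\n".join(blocks)
```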

Build-time dependency tree

This is the easier part.

The last command (isolate_meaningful_results.sh) also prints the isolated files and their content to STDOUT. The first part looks something like this (content trimmed):

 == ISOLATING BUILD TIME DEPENDENCIES == 

FILE: rawhide-community-mysql isolated
perl-DBD-MySQL-0:5.001-2.fc40.src

FILE: rawhide-community-mysql-devel isolated
mysql-connector-odbc-0:8.0.33-2.fc39.src
perl-DBD-MySQL-0:5.001-2.fc40.src

FILE: rawhide-mariadb-server isolated
kf5-akonadi-server-0:23.08.2-2.fc40.src
perl-DBD-MariaDB-0:1.23-2.fc40.src
perl-Test-mysqld-0:1.0013-11.fc39.src
python-asyncmy-0:0.2.8-4.fc39.src
python-aws-xray-sdk-0:2.12.0-2.fc40.src
python-databases-0:0.8.0-1.fc40.src
rubygem-mysql2-0:0.5.4-6.fc39.src

FILE: rawhide-pkgconfig(mariadb) isolated
Macaulay2-0:1.21-5.fc39.src

FILE: rpmfusion-free-mariadb-devel isolated
kodi-0:19.4-4.fc37.src
zoneminder-0:1.36.31-1.fc37.src

You can see that all of the packages listed are source packages (*.src). That's because for the build-time dependencies, we only looked in the source repository. Another benefit is that the package name displayed here matches the actual package name. That won't always be the case for the run-time dependency tree, as we will see later.
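
Since the same source package can show up under several symbols, it can help to reduce the listing to unique source package names. A minimal sketch, assuming every line has the name-epoch:version-release.arch form seen in the output above (the function names are hypothetical):

```python
import re

# name-epoch:version-release.arch, e.g. "perl-DBD-MySQL-0:5.001-2.fc40.src"
NEVRA_RE = re.compile(
    r"^(?P<name>.+)-(?P<epoch>\d+):(?P<version>[^-]+)-(?P<release>[^-]+)\.(?P<arch>[^.]+)$"
)


def parse_nevra(nevra):
    """Split a NEVRA string into its name, epoch, version, release and arch."""
    match = NEVRA_RE.match(nevra.strip())
    if match is None:
        raise ValueError(f"not a NEVRA with an explicit epoch: {nevra!r}")
    return match.groupdict()


def source_package_names(nevras):
    """Return the sorted, de-duplicated names of all *.src entries."""
    parsed = [parse_nevra(n) for n in nevras]
    return sorted({p["name"] for p in parsed if p["arch"] == "src"})
```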

You can see that some packages depend on <your-pkg>-devel or pkgconfig(your-pkg-library-name). That's the standard way of building on top of other packages.

Some packages, however, require the database server, the client, or some other part. In the case of my packages, this usually means that the other package tries to set up and run a database server at package build time in order to run tests. This is a good reason to run package tests separately from the build process, since polluting the package buildroot with packages that are not needed for the build is unfortunate. However, as package maintainers often see value in such early feedback from the tests, we end up with this scenario.

First of all, you should check whether they are true positives.
The *.src packages in KOJI are the source packages of the latest production build done for the specified Fedora release. That may differ from the latest code in dist-git, as not every commit to dist-git needs a production build. For some it doesn't even make sense. Also, some packages are built only once per release, usually those without any upstream updates or bug reports.

It is likely that the latest content of dist-git will match the content of the *.src RPMs when you begin your work. However, as you progress, you may see more and more false positives, because fixing the build-time dependency tree doesn't require the packages to be rebuilt. They will just pick up the changes the next time they are built for production, whenever that may be. For Fedora Rawhide, every package has to be rebuilt at least once, by release engineering, in order for it to get into the newly forked Fedora release.
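
One way to spot such false positives is to look at the current SPEC file in dist-git and check whether it still asks for any of your symbols. The sketch below is a simplified illustration under stated assumptions: it reads BuildRequires: lines textually, does not expand macros or evaluate %if conditionals, and its function names are hypothetical.

```python
import re

BR_RE = re.compile(r"^\s*BuildRequires\s*:\s*(.+)$", re.IGNORECASE)


def spec_build_requires(spec_text):
    """Collect the symbols named on BuildRequires: lines of a SPEC file."""
    found = set()
    for line in spec_text.splitlines():
        match = BR_RE.match(line)
        if not match:
            continue  # also skips commented-out lines such as "# BuildRequires: x"
        # Drop version constraints such as ">= 10.5" before splitting.
        deps = re.sub(r"\s*(>=|<=|=|<|>)\s*\S+", "", match.group(1))
        found.update(tok for tok in re.split(r"[,\s]+", deps) if tok)
    return found


def still_required(spec_text, symbols):
    """Return those of `symbols` that the SPEC file still build-requires."""
    return sorted(spec_build_requires(spec_text) & set(symbols))
```

If still_required() returns an empty list for the latest SPEC file while the *.src RPM still shows up in the results, the entry is likely a false positive that will disappear with the next production build.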

As a second step, you should check whether the dependencies are justified.
Sometimes the dependencies aren't necessary or aren't correct. That may happen when the maintainer of the other package does not understand the package they depend on well, and simply found a solution that works without it being optimal.

In my case, many packages required the database server development files ('mariadb-devel', 'mysql-devel', ...), while in fact they needed just the database client development files, which exist in an entirely different package (mariadb-connector-c-devel). However, since in my case the DB server *-devel sub-package requires the DB client *-devel sub-package, the correct package was brought into the buildroot too; the solution worked, but it was far from optimal.
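
A small helper can flag candidates for this kind of cleanup. This is only a sketch for my specific case: the mapping covers just the two server *-devel names mentioned above, the function name is hypothetical, and a flagged package still has to be reviewed by hand, because some packages really do need the server development files.

```python
# Assumption: valid only for packages that need just the client library.
CLIENT_DEVEL = "mariadb-connector-c-devel"
SERVER_DEVEL = {"mariadb-devel", "mysql-devel"}


def suggest_client_devel(build_requires):
    """Map each server *-devel build dependency to the client *-devel package."""
    return {dep: CLIENT_DEVEL for dep in sorted(build_requires) if dep in SERVER_DEVEL}
```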

As a third step, you should check whether the source package produces noarch RPM(s).
Since noarch RPMs are not bound to any specific architecture, they don't suffer from their dependencies dropping architecture support, or even becoming noarch.

There is a catch, however.
The noarch packages still have to be built in KOJI on some architecture. As far as I know, the architecture for the build of a noarch package is selected randomly.
So when a noarch package depends on an arch-specific package that has ExcludeArch: set for some architecture(s), the selected KOJI builder might be of an architecture on which the required dependency does not exist, and the build fails (FTBFS).
I don't know how to avoid this.
I have only found out that COPR, unlike KOJI, builds noarch packages on every selected architecture (specifically: in every selected mock config), so such issues are easier to spot there, as they are not random.
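
To get a feeling for how often such a random failure could hit, the following sketch computes the builder architectures on which the dependency is missing. The failure chance rests on an additional assumption of mine, namely that the builder architecture is picked uniformly at random; the architecture list in the usage note is illustrative only.

```python
def risky_builder_arches(builder_arches, dependency_exclude_arch):
    """Architectures on which a noarch build would miss its dependency."""
    return sorted(set(builder_arches) & set(dependency_exclude_arch))


def failure_chance(builder_arches, dependency_exclude_arch):
    """Chance of hitting a risky builder, assuming a uniformly random pick."""
    arches = set(builder_arches)
    if not arches:
        raise ValueError("at least one builder architecture is needed")
    return len(arches & set(dependency_exclude_arch)) / len(arches)
```

For example, with the builder architectures x86_64, aarch64, ppc64le, s390x and i686, and a dependency that excludes i686, risky_builder_arches() returns ['i686'] and failure_chance() returns 0.2.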