Koji for CI/CD
In this document I will try to summarize the discussion we have had around the current state of koji and how it influences work to be done by the CI/CD task force.
In order to meet the requirements for CI/CD, the options for interacting with Koji broadly fall into two categories:
- Have the CI pipeline do the builds in koji directly. Some changes to koji itself (such as solving NVR uniqueness) may be required.
- Have the CI pipeline register the builds in koji via the content generator interface.
How does koji work
/!\ Most of the workflow presented here is in fact enforced by Bodhi. Koji itself only builds and tags builds; it does not enforce anything.
Koji uses tags, and tags can inherit from one another. For example, for Fedora 26 the tag hierarchy is as follows:
f26 https://koji.fedoraproject.org/koji/taginfo?tagID=357
 \_ f26-updates https://koji.fedoraproject.org/koji/taginfo?tagID=358
 |   \_ f26-updates-pending https://koji.fedoraproject.org/koji/taginfo?tagID=362
 |   \_ f26-updates-testing https://koji.fedoraproject.org/koji/taginfo?tagID=360
 |   \_ f26-updates-testing-pending https://koji.fedoraproject.org/koji/taginfo?tagID=361
 |   \_ f26-updates-candidate https://koji.fedoraproject.org/koji/taginfo?tagID=359
 \_ f26-overrides
     \_ f26-build
When a packager does an official build, this build is tagged into f26-updates-candidate. When the update is created in Bodhi, Bodhi checks for the presence of the corresponding *-updates-candidate tag and refuses to create the update if it is missing. If the build is properly tagged, Bodhi also tags it f26-updates-testing-pending. Once the build is pushed to updates-testing, the tags f26-updates-candidate and f26-updates-testing-pending are removed and the tag f26-updates-testing is added. When the build is submitted for stable, Bodhi tags it f26-updates-pending. When the build is then pushed to stable, the tags f26-updates-testing and f26-updates-pending are removed and the tag f26-updates is added.
In other words:
Build => [f26-updates-candidate]
Create update => [f26-updates-candidate, f26-updates-testing-pending]
Pushed to testing => [f26-updates-testing]
Submitted to stable => [f26-updates-testing, f26-updates-pending]
Pushed to stable => [f26-updates]
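The sequence above can be sketched as a small state machine. This is only an illustration of the tag transitions described in this section: the function name `apply_step` and the step names are hypothetical, and in reality the transitions are driven by Bodhi, with koji simply recording the tags.

```python
RELEASE = "f26"


def apply_step(tags, step, release=RELEASE):
    """Return the set of tags a build carries after one workflow step.

    Hypothetical helper: it only mirrors the transitions described in the
    text, it is not Bodhi or koji code.
    """
    candidate = f"{release}-updates-candidate"
    testing_pending = f"{release}-updates-testing-pending"
    testing = f"{release}-updates-testing"
    pending = f"{release}-updates-pending"
    stable = f"{release}-updates"

    tags = set(tags)
    if step == "build":
        return {candidate}
    if step == "create_update":
        # Bodhi refuses to create the update without the candidate tag.
        if candidate not in tags:
            raise ValueError(f"build is not tagged {candidate}")
        return tags | {testing_pending}
    if step == "push_to_testing":
        return (tags - {candidate, testing_pending}) | {testing}
    if step == "submit_to_stable":
        return tags | {pending}
    if step == "push_to_stable":
        return (tags - {testing, pending}) | {stable}
    raise ValueError(f"unknown step: {step}")


if __name__ == "__main__":
    tags = set()
    for step in ("build", "create_update", "push_to_testing",
                 "submit_to_stable", "push_to_stable"):
        tags = apply_step(tags, step)
        print(f"{step:17} => {sorted(tags)}")
```

Modelling the tags as a set makes it explicit that a build can carry several tags at once, for instance while an update is waiting to be pushed.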
Pre-release, instead of being pushed to f26-updates, builds are pushed to f26. The f26 tag is created on the day f26 is branched from rawhide. It thus inherits all the builds present in rawhide at that time.
In order for koji to ensure that there is no confusion as to which build should go where, it enforces a uniqueness constraint on name-epoch-version-release-arch. This means there can only ever be one build of a given name-epoch-version-release-arch (NEVRA).
The CI workflow
In a CI workflow, applied here to RPM packages, building the package is the first test performed, but a successful build is not by itself enough to justify accepting a change.
In other words, the fact that a new RPM builds is not by itself a deciding factor on whether to accept a change (such as a new version).
When a change is proposed, we want to build the RPM resulting from this change, test it in different ways and, if the tests pass, accept the change.
However, if the tests fail, we want the packager who proposed the change to look into why they failed and act on it. This action may be to adjust the tests, to open a bug report upstream explaining that the change broke an expected or relied-upon behavior, or simply to add a patch fixing the bug. Once this is done, the packager updates the proposed change and the CI loop kicks in again: the package is built, the tests are run, the results are returned.
This means that we need to be able to rebuild a package more than once with the same NEVRA.
Building for testing
The scratch-build solution
The simplest way for the CI workflow to build something would be to use scratch builds, which are thrown away quickly and thus do not overload the current infrastructure too much. These builds would be what is tested.
Pros:
- Scratch builds are a known mechanism
- No NEVRA restriction
Cons:
- Scratch builds in general do not have all the accountability/information required by rel-eng
- This would not work for changes coordinated over multiple repositories
  - For example: a new python-requests requiring the new version of python-urllib3
  - Since scratch builds are throw-away builds, we cannot include the result of one build in the buildroot of another one.
The side-tag solution
The idea is to make creating side-tags much more lightweight.
Currently, creating a side-tag is just like creating an entirely new repo, as is done when the f26 repo is created. Koji needs to read all the existing RPMs and create the corresponding repo.
If, instead of doing this computationally expensive task, we made it more lightweight by allowing koji to create a side-tag that depends on other tags (think of a side-tag which inherits from the existing f26-updates-testing tag), then the repo it would have to create would be empty at the start and would only grow with the RPMs that are being tested.
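As a rough illustration of why an inheriting side-tag can start out empty, here is a toy model of tag inheritance. The `Tag` class, the PR number, and the NVR strings are all made up for the example; this is not how koji stores tags, only a sketch of the lookup idea.

```python
class Tag:
    """Toy model of a tag: builds tagged directly here plus an optional parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.builds = {}  # package name -> NVR tagged directly into this tag

    def own_size(self):
        """Number of builds that live in this tag itself (not inherited)."""
        return len(self.builds)

    def latest(self, package):
        """Look in this tag first, then walk up the inheritance chain."""
        tag = self
        while tag is not None:
            if package in tag.builds:
                return tag.builds[package]
            tag = tag.parent
        return None


# Illustrative content only: these NVRs are invented for the example.
testing = Tag("f26-updates-testing")
testing.builds["python-urllib3"] = "python-urllib3-1.20-1.fc26"
testing.builds["python-requests"] = "python-requests-2.13.0-1.fc26"

# The side-tag starts empty and only holds what is being tested.
side = Tag("f26_python-requests_pr12_1", parent=testing)
side.builds["python-requests"] = "python-requests-2.14.0-1.fc26"

if __name__ == "__main__":
    print(side.own_size())                  # builds stored in the side-tag itself
    print(side.latest("python-requests"))   # resolved from the side-tag
    print(side.latest("python-urllib3"))    # resolved from the parent tag
```

The side-tag only stores the build under test; everything else is resolved through the parent, which is what would keep its repo small.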
Within these side-tags, the dist-tag of the package could be overridden, making it possible to circumvent the restriction on NEVRA uniqueness.
So for each proposed change/pull-request, we would create a side-tag inheriting the repo of the branch the proposed change is targeting. We could call these side-tags <Fedora_version>_<pkg>_pr<id>_<seq> to make them easily understandable.
Each proposed change would be built in its own side-tag. Proposed changes depending on changes from another proposed change would then either:
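The naming scheme suggested above is easy to generate and to parse back. The following is a minimal sketch, assuming `<id>` and `<seq>` are plain integers and that the Fedora version part contains no underscore; the helper names are hypothetical.

```python
import re

# <Fedora_version>_<pkg>_pr<id>_<seq>; the package name may itself contain
# underscores, so only the release part is restricted.
SIDE_TAG_RE = re.compile(
    r"^(?P<release>[^_]+)_(?P<pkg>.+)_pr(?P<pr>\d+)_(?P<seq>\d+)$"
)


def side_tag_name(release, pkg, pr_id, seq):
    """Build a side-tag name following the proposed convention."""
    return f"{release}_{pkg}_pr{pr_id}_{seq}"


def parse_side_tag(name):
    """Split a side-tag name back into (release, pkg, pr_id, seq)."""
    match = SIDE_TAG_RE.match(name)
    if match is None:
        raise ValueError(f"not a CI side-tag name: {name!r}")
    return (match["release"], match["pkg"],
            int(match["pr"]), int(match["seq"]))


if __name__ == "__main__":
    name = side_tag_name("f26", "python-requests", 12, 1)
    print(name)
    print(parse_side_tag(name))
```

Incrementing `<seq>` for each new iteration of a pull-request gives every rebuild its own side-tag.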
- have their side-tag inherit from the side-tag of that other proposed change
- be built directly in the side-tag of that other proposed change
Different iterations of a proposed change would use different side-tags with different dist-tags.
Pros:
- Allows chain builds
- The dist-tag could be overridden in the side-tag, making it possible to circumvent the NEVRA restriction
Cons:
- Koji's side-tags are currently not lightweight, but could be made more lightweight.
- Resulting RPMs will have an odd dist-tag, and therefore could (arguably) not be promoted.
The content-generator solution
Content generators are a way for koji to import artifacts built elsewhere. This is currently used for containers, which are built in OSBS and imported into koji if the build passes.
https://koji.fedoraproject.org/koji/buildinfo?buildID=872131 is an example of such a build.
Basically, by invoking koji container-build, the packager created a buildContainer task (https://koji.fedoraproject.org/koji/taskinfo?taskID=18554928) that itself kicked off a createContainer task per arch (https://koji.fedoraproject.org/koji/taskinfo?taskID=18554930). If the build passes, it is imported into koji via the content generator interface.
We could therefore imagine a 3rd party application that:
- Kicks off a build in mock upon request
- Includes the required builds in the buildroot (thus allowing chain builds)
- Upon notification, imports the build into koji
Pros:
- Allows chain builds
- No need for rebuilds
- Included in koji
Cons:
- Requires writing the koji tooling (though we can likely build upon the OSBS work)
- Requires writing this 3rd party application that will build the artifacts on demand
- If the user submits a build in koji before the CI registers its own, the CI build won't be allowed to be registered, which brings us back to the current situation. A solution would be to prevent direct interaction between users and koji, but this has other implications.
- Probably the most time-consuming solution
The postponed import solution
The idea of this approach is to make importing a build into the koji DB a separate task. Currently, when a build is done, the import is kicked off by the hub via an XML-RPC call. This causes issues when deploying a proxy in front of koji, as the XML-RPC call does not return anything until the import is finished, and for large imports such as texlive this can take a very long time (around 2 hours). Moving this process into its own task would mean two things:
- Having a proxy in front of koji would become much easier, as there would no longer be a potentially very long XML-RPC call that returns no info
- With some sort of flag, we could “postpone” that task until the build is validated
Pros:
- Potential fix for 2 issues
- Allows rebuilding the same NEVRA multiple times (since the uniqueness is claimed by the import into the DB)
- Allows re-using the same path/code/mechanism to build for testing and for real
Cons:
- Requires some work on koji
- Chain build support would not be easy: since the build would not be in the database, koji would basically not know about it and thus would not be able to include it in the buildroot of other builds
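To make the idea concrete, here is a toy model of a hub where finishing a build and importing it are two separate steps, so the NEVRA is only claimed at import time. The `Hub` class and its methods are hypothetical and only model the behavior described in this section; they do not correspond to koji's actual API.

```python
class Hub:
    """Toy hub where the import is a separate, postponable step."""

    def __init__(self):
        self.imported = {}  # NEVRA -> build id; uniqueness is enforced here
        self.pending = {}   # build id -> NEVRA; built but not yet imported
        self._next_id = 1

    def finish_build(self, nevra):
        """Record a finished build without claiming its NEVRA."""
        build_id = self._next_id
        self._next_id += 1
        self.pending[build_id] = nevra
        return build_id

    def discard(self, build_id):
        """Drop a pending build, for example after failed tests."""
        self.pending.pop(build_id)

    def import_build(self, build_id):
        """Import a validated build; this is where the NEVRA is claimed."""
        nevra = self.pending[build_id]
        if nevra in self.imported:
            raise ValueError(f"{nevra} is already imported")
        del self.pending[build_id]
        self.imported[nevra] = build_id


if __name__ == "__main__":
    hub = Hub()
    nevra = "foo-0:1.0-1.fc26.x86_64"      # illustrative NEVRA
    first = hub.finish_build(nevra)        # first CI iteration
    second = hub.finish_build(nevra)       # rebuild with the same NEVRA
    hub.discard(first)                     # tests failed on the first one
    hub.import_build(second)               # tests passed on the second one
    print(hub.imported)
```

The sketch also shows the chain-build limitation listed above: pending builds live outside `imported`, so anything that only consults the imported builds cannot see them.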
The build namespace solution
This feature has been under consideration for koji for some time, and some work has already been started on it. We would need to build upon that work, but it gives us a base. The idea is to add a namespace to the build table, making the uniqueness constraint namespace-NEVRA.
To allow rebuilding the same package multiple times while still enforcing a namespace, we would rely on the way PostgreSQL handles NULL: “The null value represents an unknown value, and it is not known whether two unknown values are equal. This behavior conforms to the SQL standard.” (https://www.postgresql.org/docs/9.0/static/functions-comparison.html)
So by creating a NULL namespace and putting the CI builds into this namespace, we can rebuild a package as many times as necessary while keeping them all in the database and thus allowing the pipeline to import any one of them (but of course, since “importing” would mean giving them an actual namespace, only one of them can be imported).
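The NULL behavior described above can be tried out with a few lines of code. The sketch below uses SQLite (through Python's standard `sqlite3` module) as a stand-in for PostgreSQL, since SQLite also treats NULLs as distinct in UNIQUE constraints. The table layout, the `fedora` namespace name, and the `promote` helper are assumptions made for the example, not koji's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified build table: uniqueness is namespace + NEVRA.
conn.execute("""
    CREATE TABLE build (
        id INTEGER PRIMARY KEY,
        namespace TEXT,          -- NULL for CI builds
        nevra TEXT NOT NULL,
        UNIQUE (namespace, nevra)
    )
""")

NEVRA = "foo-0:1.0-1.fc26.x86_64"  # illustrative value

# Three CI builds of the same NEVRA: allowed, because NULL namespaces
# are never considered equal to each other.
for _ in range(3):
    conn.execute("INSERT INTO build (namespace, nevra) VALUES (NULL, ?)",
                 (NEVRA,))
conn.commit()


def promote(conn, build_id, namespace):
    """Give a CI build a real namespace; return False if the NEVRA is taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("UPDATE build SET namespace = ? WHERE id = ?",
                         (namespace, build_id))
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = promote(conn, 1, "fedora")    # first promotion claims the NEVRA
second_ok = promote(conn, 2, "fedora")   # second one hits the constraint
ci_builds_left = conn.execute(
    "SELECT COUNT(*) FROM build WHERE namespace IS NULL").fetchone()[0]

if __name__ == "__main__":
    print(first_ok, second_ok, ci_builds_left)
```

Only one of the CI builds can be given the real namespace, while the others stay in the database with a NULL namespace.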
Pros:
- We can build a single NEVRA as many times as desired during the CI pipeline
- Work has started and “just” needs to be finished
- Allows promoting a CI/test build into a real build
- Since all the builds would be present in the database, chain builds can be achieved using side-tags
Cons:
- Importing the CI build into an actual namespace will be tricky, since we would have to ensure that all the builds that were present in its buildroot either:
  - Are imported at the same time
  - Are already imported, raising an error if they are not
- The point raised here as a “con” is a valid one for all solutions
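The buildroot check described in the cons above could look roughly like the following. This is a sketch under assumptions: builds are identified by plain strings, `buildroots` maps each CI build to the CI builds that were present in its buildroot, and the function names are hypothetical.

```python
def missing_dependencies(batch, buildroots, imported):
    """Map each build in `batch` to the buildroot builds that block it.

    A buildroot build is acceptable if it is already imported or if it is
    being imported in the same batch.
    """
    available = set(imported) | set(batch)
    missing = {}
    for build in batch:
        absent = sorted(set(buildroots.get(build, ())) - available)
        if absent:
            missing[build] = absent
    return missing


def promote(batch, buildroots, imported):
    """Return the new set of imported builds, or raise if the check fails."""
    missing = missing_dependencies(batch, buildroots, imported)
    if missing:
        raise ValueError(f"unimported buildroot content: {missing}")
    return set(imported) | set(batch)


if __name__ == "__main__":
    # Illustrative data: the new python-requests was built against the
    # new python-urllib3.
    buildroots = {"python-requests-ci": {"python-urllib3-ci"}}
    imported = set()

    print(missing_dependencies(["python-requests-ci"], buildroots, imported))
    imported = promote(["python-urllib3-ci", "python-requests-ci"],
                       buildroots, imported)
    print(sorted(imported))
```

Promoting the whole batch in one call corresponds to the "imported at the same time" case, while promoting builds one by one corresponds to the "already imported" case.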
Considering the different solutions presented here and the time available to implement one, I believe the best solution is the build namespace approach described last, combined with improvements to the creation and management of side-tags.
It gives us the flexibility desired and required for CI builds, and it records the information needed for auditing and for satisfying legal requirements. Coupled with improvements to side-tags, it also gives us the possibility of CI chain builds, thus allowing dependent builds to be done together.
Long term, I believe moving the current RPM building code out of koji-core and into a content generator would help make koji more flexible to other changes, but I think this is out of the scope of the current proposal.