Modularity/Architecture/Module versioning and branching

Because one version is never enough, but having more is complicated!

Author: Stephen Tweedie

Summary

The primary core principle of Modularity is that our content should be released not as large, monolithic distribution releases, but in units of smaller modules designed to be assembled in different combinations.

Closely related is a second core principle: we should be able to release new major versions of these modules on their own schedules to serve product requirements; they should not all be tied to the master cadence of today’s major release cycles. We already do this to some extent with Extras and SCLs within Fedora.

So, at a deep level, Modularity requires us to branch and version modules independently of each other.

Yet we need to control this complexity: the engineering involved has to be sustainable, and the combinations we offer to the user need to be manageable. This is especially true as different tools represent branches in different ways (eg. bugzilla represents branches as the “version” field for a product, and also has version-specific flags such as the 7.1.z=? flags; brew has brew tags which acts as branches, etc.)

This leads to some complex constraints, which we will explore in this document.

Ultimately, we can identify many distinct variants on branching: sometimes different parts of the release pipeline end up with multiple different views of the underlying branches. Just consider Fedora 24.y branches: these either look like a continuous update stream to the end user, or some of the branches end up having distinct lifespans.

So while a single consistent end-to-end branching model for any module is possible in simple cases, it is unlikely to satisfy all the product demands for complex release structures. This suggests an approach to branching involving:

A centralised representation of the current branching at any point in time;
Retain separate branch definitions in our various tools such as bugzilla, brew, errata-tool etc, as we have today;
A flexible scripting approach to automating creation of new branches and key transitions on existing branches (dev to beta to release etc), to keep branches on the different tools synchronised.

This splits mechanism—the central branching repository and the branch definitions in the different tools—from policy—the specific branches and transitions actioned by the scripts.

It is important to remember that the mechanism needs to be flexible enough to represent any potential desired branching structure; but this does not mean that all modules have to have complex branching we can still (and we should) adopt branching policy that is as simple as possible for any given module.

Basic branching terminology

First, though, note that the very word “version” is overloaded here. It can refer either to a completely separate branch of some module, or to a single specific instance or compose of a module. To keep terminology straight, we need to distinguish between:

A version branch: or more simply just a branch: a completely new, carefully planned version stream for a module. This might correspond to a major Fedora release, or a new Fedora SCL release. Creating a new branch should come with a new release target for the branch (or branches if we’re releasing multiple modules simultaneously.)

A version branch may correspond to a new major version of the module, but there may be exceptions: for example, when we add specific features as a side-branch of an existing major version. The new branch may differ by SLA (eg. EUS branches off an existing RHEL version branch) or by feature (eg. the RHEV-H / RHOSP version of the RHEL virt stack, or an “f-stream” branch giving early access to a specific new feature planned to be released in a subsequent update to RHEL.

Users must opt into a new version branch. By definition here, creating a new version branch must have no impact on users who have already enabled an existing branch of a module. If a user provisions a new environment and asks for the most recent version branch, then they may get the new version branch automatically; but no existing environments will transparently receive content for the new version branch.

A point-in-time version or instance representing a single compose of a module on a single version branch, built by and identifiable by the compose ID of the task used to compose the module within the build system.

Such a point-in-time update may be just a scratch build, or may be internal-only and not released to the user. But once it is released, it forms a new update for that version branch of the module. Multiple released point-in-time updates therefore form an update stream over time for that branch.

Properties of branches and update streams

ABI Compatibility: Updates within a single update stream are likely to maintain ABI backwards compatibility in most cases. Users should be able to consume updates from an update stream without being concerned about breaking applications that depend on that module. A major change introducing an incompatible ABI would normally be expected to require a new version branch.

But this is ultimately a policy decision: there is nothing technical to stop ABI breakage within a version stream. For example, the current RHEL-7 extras stream may break compatibility from time to time, and has done so in the past in certain container platform packages. Our tools should be able to detect incompatible ABI changes as far as possible, but should not prevent them if we have an exceptional case where such a change is desired.

(ABI compatibility here includes anything that may have a compatibility on user or application compatibility, including for example semantics of configuration files, library ABIs, command line option handling and error codes, and so on.)

Constraining the scope of ABI dependency: As preserving ABI on updates is a burden which imposes constraints on our maintenance of a module within a single version branch, we would like the ability to limit the parts of a module to which ABI stability applies. We currently define which packages within a module form the external ABI of the module: this is defined by the maintainer of a given module’s metadata. Conversely, packages not declared as external are implicit internal implementation details of the module.

Defining the external ABI as a set of packages will allow us to:

Rebase internal packages without constraint from ABI guarantees, removing overhead from the module maintenance burden over time;
Verify that layered modules or applications depend only on packages defined as external ABI, by checking rpm dependency chains

Lifecycle: Given that we define no formal policy on ABI lifecycle—rather leaving this up to policy—it follows that there is no strong requirement that version numbers of packages within a single update stream have to follow any particular pattern. We can easily rebase a package within an update stream, even adding new features, as long as any claimed backwards compatibility is preserved.

We do need to be concerned about whether 3rd-party application certification is expected to be preserved when such an application depends on a module’s version branch containing rebased packages. This is an important question, and we need to add tooling and policy around it; but for now this is primarily a policy question, and beyond the scope of this document. Different modules may have different appetite for risk and rebases, and hence have different policy around certification.

Parallel Availability: The update streams for different module version branches must be able to coexist in our pipeline and released content, without interfering with each other. If a given base system install has both httpd-2.2 and httpd-2.4 available in different version branches, then it is important that these remain independent.

The update streams must not interfere with each other. If httpd-2.2 is installed, then updating it via yum or dnf should update it to the most recent version in the httpd-2.2 update stream, and must not automatically update it to 2.4. Any dependencies brought in by either must also prevent such interference.

And yet if a certain package does support parallel installation of different version branches at the same time (eg. RHSCLs), then the separate installed versions at any time must each be updatable by their own specific update stream.

Coherency of Branching

There are many advantages to be had in a single, coherent view of the branching structure for a given module.

Maintainers and customers alike have to deal with branches in multiple places:

Internally we plan, develop and build content in bugzilla, dist-git and koji/brew, then release it through the errata tool. All of these tools share a common understanding of the various product branches (rhel-x.y etc.) and work flows naturally between them.
Customers consume products knowing which branch they are on; they have the option to choose between branches (RHEL 7.y with EUS options to stay on an extension of an old branch), and they report bugs and review errata corresponding to those branches

It also seems highly desirable to automate creation of branches, especially when we consider a future with many more modules than we have today, each with their own branches.

But the idea of a clean, consistent view of branching that is unified end-to-end falls down in several places. Some of the difficulties include:

Utility side-branches: Some of our tools have slight variants on the main branch naming to support specific workflow requirements.

For example, in CI we can have staging branches alongside the production release branches, and in brew we have scratch builds; these follow the main product branching but are intended for developer use cases, rather than automatically being candidates for product release.

We have candidate tags, beta tags and release tags in brew, indicating packages on various different stages of the lifecycle from development to release. Beta branches in general represent a special case here.

Multiple views of branching: There are several places where two different parts of the release pipeline can treat branching different from each other. Two important examples here include minor version branching and per-product views of a component:

Minor-version branching: Our internal build pipeline considers (eg) RHEL-7.0, 7.1 and 7.2 to be quite distinct branches. This is deliberate: it allows us to work on development of the next minor release while still releasing updates to the existing minor release. It also enables us to maintain long-life support for older minor releases, and allows EUS customers to subscribe to those long-life branches.

But this introduces an inconsistency between the development branching model and the user consumption model for normal RHEL updates. The developer model includes distinct discrete minor update branches; the consumption model is a single ongoing stream of updates, consisting of smaller z-stream fixes punctuated by less frequent, larger updates.

Note, we do not know if our desired branching will look like this in a future modular RHEL. We may choose to simplify things here. But we need to be at least prepared for such a branching model; indeed, some of the models under consideration for future RHEL kernel branching may be even more complex that what we have today, with separate branches for different severity of erratas for a single release.

Note that not all modules will need or want this complexity. Flexibility is important here. We need to automate relatively simple branching models for modules similar to SCLs that live as self-contained bundles of content, as well as more complex branching structures to enable future PM requirements.

Per-product views of branching: We have significant issues when two different products share the same branching of the same components. A classic example is the RHEL base release repackaged and shipped as RHEV-H, part of the RHEV virtualisation product line. Much of the base RHEL release is included with RHEV and is built as part of the RHEV-H hypervisor build.

The problem occurs when a customer tries to report a bug against such a shared package. If we have a bug in, say, kernel for RHEV-H, the customer bug is typically filed in bugzilla against the RHEV-H product. But the kernel is actually maintained inside RHEL, not RHEV.

So we need to manually clone the bug from RHEV to RHEL, then fix the bug in RHEL, push it as a RHEL update, and finally duplicate the RHEL errata to RHEV.

The underlying problem here is that a shared component like this has two different views: a developer view which is based around maintaining the single, shared internal build branch; and an external view which is based around the multiple product-specific update streams in which the component appears.

It seems difficult and complex to come up with a single common branching model that reflects these two views. Different products legitimately need to have different branching numbering and policies. We may simply have to live with the developer and customer views of branching being different.

But we still need to scale our processes when these two branching views differ. A goal of modularity is to make it easier to share modules flexibly between different products or applications, so this scenario is only likely to become more common. Relying on manual effort to keep tooling consistent across different product branches is unlikely to scale; reconciling these two views may require either changes to our tools, or automation to synchronise issues across shared branches.

Branch fluidity: Branching can vary over time depending on subscription/entitlement and lifecycle. A good example here is the RHEL y-stream release train. Internally, rhel-7.0, rhel-7.1 and rhel-7.2 are all distinct branches in dist-git, brew and bugzilla. And yet a customer’s update stream does not reflect this: when 7.2 is released, a yum update of a 7.1 RHEL installation will automatically download and install all new 7.2 content. So the “current branch” for a standard RHEL subscription changes over time as y-stream updates occur.

Note that this only changes for the end-user. A developer still sees 7.0, 7.1 and 7.2 as distinct branches; it’s the update stream for subscribers to that content that has to change when a new minor update occurs.

Furthermore, the 7.1 branching itself changes over time. We use rhel-7.1 during 7.1 development. But once 7.1 is released, things change: the devel branch in now 7.2; the 7.1 is still branch is now used for commits queued for z-stream errata, but the brew tag moves to rhel-7.1-z and we start using 7.1-Z flags in bugzilla.

And even after 7.2 is released, 7.1-z updates may still be available to some customers but not all, depending on whether we have an EUS branch for 7.1 and whether the customer is entitled to EUS and has chosen to enable it for a given system.

So even just for a single update stream—RHEL-7 latest—our definitions of development, beta, scratch and errata branches, and the representations of those in various different tools, are complex and dynamic. We need to consider the concept of a branching transition, when an event occurs that requires coordinated changes in the status, names, or flags for a single ongoing version branch of a module. The alpha/beta/RC release or initial public general release of a module, being superseded by a newer version, entering update stream after release, or end-of-life could all be possible transition events for a module.

Naming policy: Naming for existing branches is also flexible and inconsistent today. The rhel-7.1 branch is a natural successor to rhel-7.0: you update from one to the next seamlessly. Rhscl-2.2 added new packages but did not obsolete anything in rhscl-2.1. Fedora does not have x.y minor version numbers at all, and uses a completely different name (rawhide) for ongoing development.

Such inconsistency is not a problem needing fixed: rather, it reflects that different products and different modules can have fundamentally different product needs. The naming needs to reflect those needs, not constrain them. Naming policy is something that likely needs driven more by PM requirements than by technical modularity implementation. Flexibility here is paramount.

Consistency of release: Finally, we need to consider the granularity of branches. The purpose of modularity is to allow us to release modules independently from a single master release cadence. But do we really want all modules to be released without any synchronisation or common branching at all?

History suggests we do not. We currently have consolidated batch errata across all of RHEL-7; RHEL itself, plus Extras (including all our container support), follow this same schedule. We try to synchronise feature development between container infrastructure and core RHEL.

In the future we likely have many completely-decoupled modules for additional content outside the base RHEL runtime platform. But we may still eventually decide that we want to have synchronised releases of new content across different modules; this is very much how software collections work today, where new SCL releases are still launched together as a common RH-SCL version launch.

So while modules can have independent branches, we still need the ability to drive a common branching structure across a set of modules when that is needed for product release requirements. First-class support for such a consolidated release is absolutely necessary; to devolve the distribution into an unmanaged, completely-uncoordinated set of independent modules is likely unsustainable for both engineering and customer alike.

Managing this branching complexity

Given that the exact and branching policy for a module is currently inconsistent between products, needs to remain flexible, and changes over time, how do we manage this? The question is especially significant given that we are looking at significant changes to the way we divide and release the distribution in the future; our future branching model is currently completely unknown.

This suggests that we should not try to formalise a branching and naming policy at all. But we must eventually have automation for the creation of branches and for branch transitions, especially given that release consistency may require us to coordinate new branches across many modules simultaneously. And our tools still need consistent views across this complex branching structure.

Separation of policy from representation

This suggests that we need:

A canonical definition of our modules and their branches at any point in time, including the way those branch names are represented in different tools:
Consistent use of that canonical branching structure within our tools, but with
Flexible, scripted events to drive changes in the branching.

We can do this by separating the central representation of branching (eg. in PDC) from the mechanism used to define and update that branching.

Changes in branching also need to be orchestrated: we should not define a new branch and allow a developer to start building on that branch, before the branch has been created in bugzilla, dist-git, brew etc. There are many tools that could be used here: ansible is just one such tool. The point here is to identify that as a separate concern. Automation here is important if we want to be able to support coordinated release branching across a set of modules.

For now, we are dealing with a relatively simple branching structure, building simple modules out of the latest Fedora. We don’t need complex branching policy right now. But separating out representation from policy allows us to start with a simple branching structure initially, and still lets us define, and script, more complex, product-specific branching requirements later, while having those consistently represented in a central database that our tools can refer to and agree on.

Forking a new branch

We have mentioned that creating a new branch for a module involves branching multiple different tools: we need a branch for the module in dist-git, new branches for its components, and corresponding branches in bugzilla; we may need new tags in brew.

This implies that the branching for a module is (usually) the same thing as the branching for all the component packages of that module.

But sometimes we will not want to branch all packages; we may want a variant branch of a module which overrides just some of the packages, and which otherwise inherits the content (including new content) from its base branch.

There are many examples which would suit such a inheriting branch. The RHEL f-stream model which allows early access to new features prior to a minor update, is an example. Another might be the RHOSP version of the virtualisation stack, which contains a version of kvm-qemu with newer features but which otherwise follows base RHEL. The model also works for scratch or staging branches, where we can build and test updates to an existing branch as needed to suit internal developer needs.

This suggests that we want to include tooling support for such an inheriting branch. Technically, this might involve creating new branches for only a subset of the packages of a module; and recording the base module from which we pull other packages during a module compose.

Converging branches

Just as important as forking a new branch is converging existing branches. In the RHEL f-stream model, we allow a product early back-ported access to a new feature scheduled for the next RHEL minor release. The f-stream is the early-access branch; the intent is that when the next minor RHEL release occurs, it introduces that back-ported feature into the mainline RHEL stream, and the f-stream is no longer needed: any product depending on that feature moves back off the f-stream branch and onto the mainline.

Extending this to a modular build, we can imagine a layered product needing a new feature within any module in our stack. If that module does not plan the feature to be released in time, we can fork a specific version of the module to serve the needs of the one product needing the new feature; but if and when that feature is released in some mainline version of the module, we want the ability to move the layered product off the forked feature branch and back onto mainline.

There are likely to be many complexities here; the important point is to imagine up-front that forking a new branch is only half the picture, it will be useful to have tooling support for converging branches again afterwards too.

Managing unsynchronised stacks of branches

Modules can depend in turn on other modules. We have defined a “stack” as the entire tree of modules needed to satisfy dependencies for one top-level module or application. But as we combine modules in this way, not all those modules will have the same branches or lifecycle.

We already deal with this today in most of our layered products, which need to be built for a specific RHEL release, but which are not released on the RHEL schedule. Even parts of the wider RHEL product are like this: software collections have their own release schedule, for example.

So when we have multiple, different, unsynchronised branching models for different modules within a stack, how do we know exactly which branches of which modules we need to combine together? We can agree that we need to constrain this complexity, and define specific subsets of modules which we will test and support together. The issue is where, and how, to define this.

This is an issue we still need to solve. There two obvious places to hold this structure: in the release that defines multiple modules and their combined release schedules; or by defining specific branch dependencies in each module’s own module metadata.

Both have pros and cons. Defining specific branch dependencies in a module’s metadata helps by keeping more of the module’s defining structure in one place. However, the downside is that it becomes impossible to use that same metadata in multiple places without changing it: eg. building a single module from the same module source on multiple buildroots is impossible if the module source itself defines its buildroot dependency.

So this is a topic for future consideration.

Constraints on branching

The above seems to say “branching is hard, let’s not assume what it looks like but just store a flexible representation that we can adapt as we need.”

That’s true to some extent… but there are concerns we can anticipate that we need to handle in our branching structure. Having covered the fundamental principle that branching policy needs to remain flexible, let’s look at some of the issues we need to handle as we define that policy.

Splitting a binary package build over multiple modules: This is something that is surprisingly common in RHEL. It happens typically because we do not want to give the same level of support to all sub-packages from a single upstream source.

Examples might be when we want to include a library to support our own applications, but do not want to give it full support for end-users; we might include the library itself in RHEL, but split out the ability to develop against it (the include files, static libraries etc. that typically land in a -devel binary rpm) into, say, Optional. Or we might want to reserve certain functionality for specific products: eg. providing only guest hardware devices in cloud images, without offering hardware enablement with the full complement of kernel hardware drivers.

Can we do this naturally in a modular build chain? Clearly it breaks any assumption that a module can be both compiled and composed in complete independence from any other: if a package build ends up in multiple modules, then the compile phase of building those modules is now linked. We need to determine how important it is to support this.

But it is still quite possible to achieve, if the modules which are to share binaries have matching branches. In that case, module composes can always agree on which brew/koji branches [tags] to consume packages from. So this may be fairly easy for modules which are part of a single consolidated release, as defined above; it would be fair to restrict this possibility to that case.

Building a module in multiple build roots: Does a single module source branch result in a single composed binary branch? Or do we build that same source multiple times against different base distribution build roots?

Clearly, branching becomes enormously more complicated if we need to support builds for multiple different build roots in a single branch. The idea of a single coherent branching structure from git to release is broken if we have multiple output branches from a single input branch.

But the entire point of ABI forwards compatibility is to avoid the need to do this: to run a module on a set of major runtimes, it should in theory be necessarily simply to build it on the oldest runtime in that set. A module built on RHEL-6 should run on RHEL-7 or -8, as long as it is using only dependencies with tier-1 API stability guarantees.

So before working through the complexities of commit-once, compile-multiple-times, it will be important to determine to what extent we can simply depend on ABI compatibility to ensure a module works against multiple runtimes.

Search