From Fedora Project Wiki
= Automatic Cloud Reboot On Updates =


{{Change_Proposal_Banner}}


== Summary ==
When creating a Fedora cloud instance, users can provide cloud-init metadata that instructs the instance to update all packages and then reboot if any updated package needs a reboot to take effect. Fedora cloud instances should write the `/var/run/reboot-required` file when a reboot is needed after a dnf update so that cloud-init can reboot the instance.
 
This issue originally surfaced in [https://bugzilla.redhat.com/show_bug.cgi?id=1275409 RHBZ 1275409].


== Owner ==
* Name: [[User:mhayden| Major Hayden]]
* Email: major@redhat.com


== Current status ==
[[Category:ChangeAcceptedF39]]


* Targeted release: [https://docs.fedoraproject.org/en-US/releases/f39/ Fedora Linux 39]
* Last updated: 2023-09-28
* [https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/ELS4GX5AUN62RQBY3GKRB3YFMMALIFRM/ devel thread]
* FESCo issue: [https://pagure.io/fesco/issue/3010 #3010]
* Tracker bug: [https://bugzilla.redhat.com/show_bug.cgi?id=2233245 #2233245]
* Release notes tracker: [https://pagure.io/fedora-docs/release-notes/issue/1021 #1021]


== Detailed Description ==
 
Fedora cloud instances use cloud-init to do the initial configuration of the instance. This includes setting up networking, assigning a hostname, adding users/groups, and running arbitrary scripts. There are also two options that you can pass to cloud-init that are important for this change:
 
* `package_update`: If set to `true`, all installed packages are immediately updated on first boot
* `package_reboot_if_required`: If set to `true`, and the `package_update` step wrote to `/var/run/reboot-required`, reboot the system immediately after updating packages
 
📚 For more details, see cloud-init's module reference for `[https://cloudinit.readthedocs.io/en/latest/reference/modules.html#package-update-upgrade-install package_update]`.
 
🚨 '''WAIT A MOMENT. ARE WE TALKING ABOUT REBOOTING EVERY CLOUD INSTANCE ON BOOT?''' 🚨 No! This change would require all three of these things to happen before a reboot occurs:
 
* User provides `package_update: true` on instance creation
* '''AND''' user provides `package_reboot_if_required: true` on instance creation
* '''AND''' `tracer` notices that at least one of the packages needs a reboot to go into effect
 
🤔 '''Where does this `/var/run/reboot-required` file come from?''' On Debian and Ubuntu systems, `apt` automatically writes to `/var/run/reboot-required` if a reboot is needed after a package update. From there, `cloud-init` looks for the file ([https://github.com/canonical/cloud-init/blob/6d09df5e4786a2a6c79d6098ab413c93b205221c/cloudinit/config/cc_package_update_upgrade_install.py#L119-L134 relevant cloud-init code]) and if present, reboots the system immediately.
 
✏️ '''How do we write this file on Fedora?''' Fedora systems have a package called `tracer` and a corresponding dnf plugin, `python3-dnf-plugin-tracer`, that analyzes `dnf` updates and provides recommendations on reboots or user logouts to bring updates into effect on the system. A recent [https://github.com/FrostyX/tracer/pull/196 pull request] added support for writing the `/var/run/reboot-required` file when a system reboot is recommended. The `cloud-init` tool can read this file after a package update and reboot if needed.
 
🔎 '''What does `tracer`'s output look like?'''
 
    [root@tracer-testing ~]# tracer
    You should restart:
    * Some applications using:
        sudo systemctl restart NetworkManager
        sudo systemctl restart auditd
        sudo systemctl restart chronyd
        sudo systemctl restart dbus-broker
        sudo systemctl restart qemu-guest-agent
        sudo systemctl restart sshd
        sudo systemctl restart systemd-journald
        sudo systemctl restart systemd-logind
        sudo systemctl restart systemd-oomd
        sudo systemctl restart systemd-resolved
        sudo systemctl restart systemd-udevd
        sudo systemctl restart systemd-userdbd
   
    * These applications manually:
        (sd-pam)
   
    Additionally, there are:
    - 3 processes requiring restart of your session (i.e. Logging out & Logging in again)
    - 1 processes requiring reboot
    [root@tracer-testing ~]# cat /var/run/reboot-required
    Tracer says reboot is required
 
📋 '''What do we need to do?''' Add the `python3-dnf-plugin-tracer` plugin to Fedora cloud images. No additional configuration is necessary. This action pulls in five packages that are about 2.1MB after installation:
 
    =======================================================================================
    Package                               Arch       Version             Repository  Size
    =======================================================================================
    Installing:
    python3-dnf-plugin-tracer             noarch     4.1.0-1.fc38        fedora      14 k
    Installing dependencies:
    python3-dnf-plugins-extras-common     noarch     4.1.0-1.fc38        fedora      69 k
    python3-psutil                        x86_64     5.9.2-2.fc38        fedora     271 k
    python3-tracer                        noarch     0.7.8-5.fc38        fedora     172 k
    tracer-common                         noarch     0.7.8-5.fc38        fedora      22 k
   
    Transaction Summary
    =======================================================================================
    Install  5 Packages
   
    Total download size: 547 k
    Installed size: 2.1 M


== Feedback ==
 
One of the other ideas was to patch `cloud-init` to run `tracer` directly and avoid the `/var/run/reboot-required` file altogether. That would require a lot of work upstream in `cloud-init` to enable the functionality and we would still need the same set of packages installed in Fedora anyway. 🥵


== Benefit to Fedora ==
This change allows Fedora cloud instances to behave in the same way that Debian-based instances already behave. When users request package updates with a reboot now, `cloud-init` performs the update but never reboots the system. This is an unexpected and confusing result for users who come to Fedora from other distributions.

Rebooting automatically could also reduce the attack surface of an instance that just came online since it would immediately reboot to put all package updates into effect on the system. This reduces the time that an unpatched instance is online prior to being fully patched.


== Scope ==
* Proposal owners: This change is fairly isolated and only affects Fedora cloud users who request package updates followed by a reboot in their `cloud-init` metadata.
* Other developers: N/A
* Release engineering: N/A
* Policies and guidelines: N/A
* Trademark approval: N/A
* Alignment with Community Initiatives: N/A


== Upgrade/compatibility impact ==
Since this change only applies to `cloud-init` on the very first boot of the instance, this wouldn't affect a user upgrading from one version of Fedora to the next.


== How To Test ==
# Ensure you have a cloud image that has an update that needs a reboot (kernel, openssl, etc)
# Boot an instance with the following `cloud-init` user data:

    #cloud-config
    package_update: true
    package_upgrade: true
    package_reboot_if_required: true

# Wait for the package updates to finish on the instance and verify that it rebooted after updating


== User Experience ==
First, if a user never uses the `package_upgrade` and `package_reboot_if_required` options in their `cloud-init` user data, they won't be affected by this change. These options are not enabled in `cloud-init` by default.

If a user does enable both of these options, they will see their cloud instance come online, apply updates, and reboot if required. Most cloud providers have very fast reboots, so the delay should not be a problem.


== Dependencies ==
Nothing depends on this change.


== Contingency Plan ==
* Contingency mechanism: Push to Fedora 40 if the work cannot be done in time
* Contingency deadline: N/A
* Blocks release? N/A


== Documentation ==
Guidance for users in a blog post (Fedora Magazine) could be helpful for this change. Many users might not be aware that they had the option to ask for package updates and reboots via `cloud-init` for their Fedora cloud instances.


== Release Notes ==
Fedora cloud instances now automatically reboot when a user requests package updates followed by a reboot on the first boot of the instance. The reboot only occurs if an updated package requires a reboot to go into effect (such as a kernel or critical system library).

Latest revision as of 23:37, 28 September 2023

Automatic Cloud Reboot On Updates

Summary

When creating a Fedora cloud instance, users can provide cloud-init metadata that instructs the instance to update all packages and then reboot if any updated package needs a reboot to take effect. Fedora cloud instances should write the /var/run/reboot-required file when a reboot is needed after a dnf update so that cloud-init can reboot the instance.

This issue originally surfaced in RHBZ 1275409.

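Owner

  • Name: Major Hayden
  • Email: major@redhat.com

Current status

  • Targeted release: Fedora Linux 39
  • Last updated: 2023-09-28
  • devel thread
  • FESCo issue: #3010
  • Tracker bug: #2233245
  • Release notes tracker: #1021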

Detailed Description

Fedora cloud instances use cloud-init to do the initial configuration of the instance. This includes setting up networking, assigning a hostname, adding users/groups, and running arbitrary scripts. There are also two options that you can pass to cloud-init that are important for this change:

  • package_update: If set to true, all installed packages are immediately updated on first boot
  • package_reboot_if_required: If set to true, and the package_update step wrote to /var/run/reboot-required, reboot the system immediately after updating packages

📚 For more details, see cloud-init's module reference for package_update.

🚨 WAIT A MOMENT. ARE WE TALKING ABOUT REBOOTING EVERY CLOUD INSTANCE ON BOOT? 🚨 No! This change would require all three of these things to happen before a reboot occurs:

  • User provides package_update: true on instance creation
  • AND user provides package_reboot_if_required: true on instance creation
  • AND tracer notices that at least one of the packages needs a reboot to go into effect
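
The three conditions above can be sketched as a single predicate. This is only an illustrative paraphrase of the list in Python, not cloud-init's actual implementation; the function name `should_reboot` and its arguments are hypothetical, and the real module may consult additional keys (for example `package_upgrade`, which the test instructions on this page also set).

```python
import os

# Marker path named in this proposal.
REBOOT_MARKER = "/var/run/reboot-required"


def should_reboot(user_data: dict, marker_path: str = REBOOT_MARKER) -> bool:
    """Return True only when all three conditions from the list hold.

    `user_data` is assumed to be the already-parsed #cloud-config mapping.
    """
    wants_update = user_data.get("package_update") is True
    wants_reboot = user_data.get("package_reboot_if_required") is True
    marker_present = os.path.isfile(marker_path)
    return wants_update and wants_reboot and marker_present
```

If either option is missing from the user data, or the marker file was never written, the predicate is false and no reboot happens.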

🤔 Where does this /var/run/reboot-required file come from? On Debian and Ubuntu systems, apt automatically writes to /var/run/reboot-required if a reboot is needed after a package update. From there, cloud-init looks for the file (relevant cloud-init code) and if present, reboots the system immediately.

✏️ How do we write this file on Fedora? Fedora systems have a package called tracer and a corresponding dnf plugin, python3-dnf-plugin-tracer, that analyzes dnf updates and provides recommendations on reboots or user logouts to bring updates into effect on the system. A recent pull request added support for writing the /var/run/reboot-required file when a system reboot is recommended. The cloud-init tool can read this file after a package update and reboot if needed.

🔎 What does tracer's output look like?

   [root@tracer-testing ~]# tracer 
   You should restart:
   * Some applications using:
       sudo systemctl restart NetworkManager
       sudo systemctl restart auditd
       sudo systemctl restart chronyd
       sudo systemctl restart dbus-broker
       sudo systemctl restart qemu-guest-agent
       sudo systemctl restart sshd
       sudo systemctl restart systemd-journald
       sudo systemctl restart systemd-logind
       sudo systemctl restart systemd-oomd
       sudo systemctl restart systemd-resolved
       sudo systemctl restart systemd-udevd
       sudo systemctl restart systemd-userdbd
   
   * These applications manually:
       (sd-pam)
   
   Additionally, there are:
   - 3 processes requiring restart of your session (i.e. Logging out & Logging in again)
   - 1 processes requiring reboot
   [root@tracer-testing ~]# cat /var/run/reboot-required 
   Tracer says reboot is required
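
For illustration, the marker file shown at the end of the output above could be produced by logic along these lines. This is a minimal sketch and not the code from the tracer pull request; `write_reboot_marker` is a hypothetical helper, and only the file's text is taken from the output above.

```python
from pathlib import Path

# Text observed in the marker file in the example output above.
MARKER_TEXT = "Tracer says reboot is required\n"


def write_reboot_marker(reboot_recommended: bool, marker: Path) -> bool:
    """Write the marker only when a reboot is recommended.

    Returns True if the file was written, False otherwise.
    """
    if not reboot_recommended:
        return False
    marker.write_text(MARKER_TEXT)
    return True
```

On a real system the marker would be `Path("/var/run/reboot-required")`, which requires root privileges to write.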

📋 What do we need to do? Add the python3-dnf-plugin-tracer plugin to Fedora cloud images. No additional configuration is necessary. This action pulls in five packages that are about 2.1MB after installation:

   =======================================================================================
   Package                               Arch       Version             Repository  Size
   =======================================================================================
   Installing:
   python3-dnf-plugin-tracer             noarch     4.1.0-1.fc38        fedora      14 k
   Installing dependencies:
   python3-dnf-plugins-extras-common     noarch     4.1.0-1.fc38        fedora      69 k
   python3-psutil                        x86_64     5.9.2-2.fc38        fedora     271 k
   python3-tracer                        noarch     0.7.8-5.fc38        fedora     172 k
   tracer-common                         noarch     0.7.8-5.fc38        fedora      22 k
   
   Transaction Summary
   =======================================================================================
   Install  5 Packages
   
   Total download size: 547 k
   Installed size: 2.1 M
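
As a quick arithmetic check on the table above, the per-package download sizes can be summed. The Python sketch below simply copies the Size column; the variable names are made up for illustration.

```python
# Download sizes in kB, copied from the Size column of the table above.
download_sizes_k = {
    "python3-dnf-plugin-tracer": 14,
    "python3-dnf-plugins-extras-common": 69,
    "python3-psutil": 271,
    "python3-tracer": 172,
    "tracer-common": 22,
}

total_download_k = sum(download_sizes_k.values())
print(f"{len(download_sizes_k)} packages, {total_download_k} k total")
```

The rounded rows add up to 548 k, one more than the 547 k reported in the transaction summary; the gap is presumably a rounding effect, since each row is rounded separately.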

Feedback

One of the other ideas was to patch cloud-init to run tracer directly and avoid the /var/run/reboot-required file altogether. That would require a lot of work upstream in cloud-init to enable the functionality and we would still need the same set of packages installed in Fedora anyway. 🥵

Benefit to Fedora

This change allows Fedora cloud instances to behave in the same way that Debian-based instances already behave. When users request package updates with a reboot now, cloud-init performs the update but never reboots the system. This is an unexpected and confusing result for users who come to Fedora from other distributions.

Rebooting automatically could also reduce the attack surface of an instance that just came online since it would immediately reboot to put all package updates into effect on the system. This reduces the time that an unpatched instance is online prior to being fully patched.

Scope

  • Proposal owners: This change is fairly isolated and only affects Fedora cloud users who request package updates followed by a reboot in their cloud-init metadata.
  • Other developers: N/A
  • Release engineering: N/A
  • Policies and guidelines: N/A
  • Trademark approval: N/A
  • Alignment with Community Initiatives: N/A

Upgrade/compatibility impact

Since this change only applies to cloud-init on the very first boot of the instance, this wouldn't affect a user upgrading from one version of Fedora to the next.

How To Test

  1. Ensure you have a cloud image that has an update that needs a reboot (kernel, openssl, etc)
  2. Boot an instance with the following cloud-init user data:
   #cloud-config
   package_update: true
   package_upgrade: true
   package_reboot_if_required: true
  3. Wait for the package updates to finish on the instance and verify that it rebooted after updating
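
One possible way to verify the reboot in the last step is to count the boots recorded in the journal. The sketch below parses the text printed by `journalctl --list-boots`; it assumes the journal is persistent across reboots on the test image, and `count_boots` is a hypothetical helper, not part of any existing tool.

```python
def count_boots(list_boots_output: str) -> int:
    """Count boot entries in the output of `journalctl --list-boots`.

    Each boot entry starts with an integer index (0, -1, -2, ...).
    Header lines and blank lines are skipped.
    """
    boots = 0
    for line in list_boots_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            int(fields[0])
        except ValueError:
            continue  # not a boot entry, e.g. a column header
        boots += 1
    return boots
```

On a freshly created instance, a count of two or more suggests that the instance rebooted after the first boot. The input text could be captured with `subprocess.run(["journalctl", "--list-boots"], capture_output=True, text=True).stdout`.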

User Experience

First, if a user never uses the package_upgrade and package_reboot_if_required options in their cloud-init user data, they won't be affected by this change. These options are not enabled in cloud-init by default.

If a user does enable both of these options, they will see their cloud instance come online, apply updates, and reboot if required. Most cloud providers have very fast reboots, so the delay should not be a problem.

Dependencies

Nothing depends on this change.

Contingency Plan

  • Contingency mechanism: Push to Fedora 40 if the work cannot be done in time
  • Contingency deadline: N/A
  • Blocks release? N/A

Documentation

Guidance for users in a blog post (Fedora Magazine) could be helpful for this change. Many users might not be aware that they had the option to ask for package updates and reboots via cloud-init for their Fedora cloud instances.

Release Notes

Fedora cloud instances now automatically reboot when a user requests package updates followed by a reboot on the first boot of the instance. The reboot only occurs if an updated package requires a reboot to go into effect (such as a kernel or critical system library).