From Fedora Project Wiki

Virt Device Failover

Summary

Support for transparent failover between an assigned device and an emulated device allows migration and memory overcommit to be enabled dynamically, while keeping the performance benefits of device assignment and without disrupting guest operation.

Owner

Current status

  • Targeted release: Fedora 20
  • Last updated: 2013-03-15
  • Percentage of completion: 50%

Detailed Description

For virtual machines, device assignment is the best option for performance. However, when a device is assigned to a VM, both migration and memory overcommit are currently disabled.

This feature aims to remove the trade-off between performance and features by switching to an emulated device in a way that is almost transparent to users, for configurations where both host and guest run Fedora.

The Fedora guest should detect that the emulated device serves as a failover for the assigned device. When requested by the hypervisor, it will stop and eject the assigned device and switch to the failover device. From this point on, migration and memory overcommit are possible, and the device configuration is preserved. Once the operation (for example, a migration) completes, the reverse switch can take place.

Thus the device is controlled by:

  • before migration: device-specific driver loaded in guest
  • during migration: driver loaded in host, virtio or emulated device driver loaded in guest
  • after migration: device-specific driver loaded in guest
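
The sequence above can be illustrated with a toy model. This is only a sketch: the class and method names (FailoverGuest, request_failover, request_failback) are hypothetical and do not correspond to any existing KVM, libvirt or qemu-ga interface.

```python
from enum import Enum


class Path(Enum):
    ASSIGNED = "assigned device (device-specific driver in guest)"
    EMULATED = "virtio or emulated device (physical device driven by host)"


class FailoverGuest:
    """Toy model of the guest side of the switch; all names are hypothetical."""

    def __init__(self):
        self.active_path = Path.ASSIGNED  # state before migration

    def request_failover(self):
        """Hypervisor asks the guest to stop and eject the assigned device."""
        self.active_path = Path.EMULATED
        return True  # acknowledgement sent back to the hypervisor

    def request_failback(self):
        """Hypervisor hands the assigned device back once the operation is done."""
        self.active_path = Path.ASSIGNED
        return True

    def migration_allowed(self):
        # Migration and overcommit are only possible on the emulated path.
        return self.active_path is Path.EMULATED


def migrate(guest):
    """Walk through before/during/after and return the paths observed."""
    seen = [guest.active_path]              # before migration
    if not guest.request_failover():
        raise RuntimeError("guest did not acknowledge failover")
    if not guest.migration_allowed():
        raise RuntimeError("assigned device is still active")
    seen.append(guest.active_path)          # during migration
    guest.request_failback()
    seen.append(guest.active_path)          # after migration
    return seen
```

Calling migrate(FailoverGuest()) returns the assigned, emulated, assigned sequence listed above; a real implementation would also have to handle a guest that never acknowledges the request.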

At the kernel level, this can be done for networking by creating a team (or a bond) in a failover configuration, and for storage by using multipath, in both cases on top of the assigned and the emulated device.
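
For the networking case, a failover team might look roughly like the configuration built below. This is a sketch under assumptions: it uses the teamd JSON layout with the activebackup runner, the team and interface names are placeholders, and the priority values are arbitrary; nothing here is prescribed by this feature.

```python
import json


def failover_team_config(team_name, assigned_port, emulated_port):
    """Build a teamd-style active-backup configuration as a dict.

    The assigned device gets the higher priority so that it is preferred
    while present; the emulated device is the backup port that remains
    when the assigned device is ejected.
    """
    return {
        "device": team_name,
        "runner": {"name": "activebackup"},
        "link_watch": {"name": "ethtool"},
        "ports": {
            assigned_port: {"prio": 100},   # assumed value, higher is preferred
            emulated_port: {"prio": -10},   # assumed value for the backup port
        },
    }


if __name__ == "__main__":
    # Interface names are placeholders, not taken from this page.
    print(json.dumps(failover_team_config("team0", "ens1", "eth0"), indent=2))
```

A bond in active-backup mode, or a multipath map over the two storage devices, would play the same role; the team variant is shown only because its configuration is compact.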

Benefit to Fedora

Complex virt setups now have fewer operational caveats, which makes things simpler for users.

Scope

Work left to do:

  • KVM needs to be extended to notify the guest that the two devices are set up in a fallback configuration.
    In particular, add support for sending D-Bus commands to qemu-ga.
    The security policy needs to be configured so that what is permitted can be controlled cleanly.
  • For networking, NetworkManager in Fedora will support bonding:
    https://fedoraproject.org/wiki/Features/NetworkManagerBonding
    and teaming:
    https://fedoraproject.org/wiki/Features/NetworkManagerTeaming
    NetworkManager can be controlled over D-Bus.
  • For storage, device-mapper-multipath needs to be set up to autodetect this configuration
  • libvirt has to be extended to specify this configuration
  • libvirt has to be extended to request failover, and to acknowledge it once the guest has acknowledged the failover
  • The above covers Linux guests.
    If possible, the guest agent for Windows should be extended to add this support for Windows guests as well.


How To Test

Two systems with device assignment (IOMMU) support are required to test this feature. To test it, specify an assigned device, start the guest, and migrate it.
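
As an illustration of the "specify an assigned device" step, the snippet below builds a libvirt <hostdev> element for a PCI device. It is a sketch only: the helper name is hypothetical, the PCI address is a placeholder, and it does not show how the failover pairing itself would be expressed, since that libvirt extension is still listed as work to do.

```python
import xml.etree.ElementTree as ET


def pci_hostdev_xml(domain, bus, slot, function):
    """Return a libvirt <hostdev> element for a PCI device as a string."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain:04x}",
        bus=f"0x{bus:02x}",
        slot=f"0x{slot:02x}",
        function=f"0x{function:x}",
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    # Placeholder address; substitute the address of the device to assign.
    print(pci_hostdev_xml(0x0000, 0x06, 0x00, 0x0))
```

The resulting element would go into the devices section of the guest definition before the guest is started.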

XXX: Explicit test steps here for test day


User Experience

Users will see that they can specify an assigned network or storage device and still migrate the guest seamlessly.

Dependencies

For networking, https://fedoraproject.org/wiki/Features/NetworkManagerBonding

Contingency Plan

None necessary; revert to the previous release's behaviour.

Documentation

Links to related upstream documentation:

  • http://www.linux-kvm.org/page/Hotadd_pci_devices
  • https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/DM_Multipath/index.html
  • http://unixfoo.blogspot.com/2007/10/yet-to-add.html

Release Notes

  • KVM guests with assigned host devices can now be migrated across hosts. The assigned device will be replaced during migration with an emulated device in a transparent manner.

Comments and Discussion