
Achievements

To let others evaluate and use the results of this project, we have put together a quick list of the main achievements. Below are brief descriptions including direct links to the GitHub repositories. Almost all of the results are provided as open source. Finally, don’t hesitate to get in direct touch if you or your organization want to use the results for better fault tolerance, etc.

Highly available consolidation of virtualized resources

Regarding consolidation of I/O resources, ORBIT delivers a complete design and implementation of the so-called I/O hypervisor, which enables I/O externalization and consolidation. For memory resources, kernel extensions have been designed and implemented, resulting in the desired feature of post-copy live migration. The latter has been integrated into Libvirt and QEMU, while OpenStack integration is also provided for the I/O hypervisor.

Achievement 1 – Externalisation and Consolidation of I/O Resources
I/O hypervisor supporting multiple virtual devices
Challenge Traditionally, guests communicate with the underlying hypervisor through shared memory, which is highly efficient. Shared memory is no longer available once the hypervisor is split. With Split I/O, the guest communicates with the I/O hypervisor over a fast interconnect with high bandwidth and low latency. Optimizing the I/O path to reduce the latency incurred by the network and the added hop was our main challenge.
Proposition A working prototype based on the Linux kernel, supporting multiple guests, each with multiple virtual devices.
Validation Results
  • Paravirtual Remote I/O has been presented at the ASPLOS 2016 conference
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model The I/O hypervisor kernel code (GPLv2) is available online. The scripts for creating and exposing virtual devices are also available online.

Achievement 2 – Kernel user fault page handling
Mechanism for QEMU to detect missing pages
Challenge When a page of memory has not yet been transferred to the destination host, the guest must pause and wait for that page. Since the migration and control of the guest are performed in userspace, the kernel must provide a mechanism for handling these missing pages and make it accessible to QEMU.
Proposition Modifications to the Linux kernel that provide a mechanism (namely “userfaultfd”) allowing a user process to mark an area of memory for special treatment, and for access to those pages to be notified to that user process.
Validation Results
Release Model Open source (GPLv2) in the main Linux kernel
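
For orientation, the sketch below shows how a userspace process can drive the userfaultfd mechanism, here from Python. It is a minimal illustration only: the syscall and ioctl numbers are assumptions hard-coded for Linux on x86_64 (taken from linux/userfaultfd.h), and a real monitor such as QEMU implements this in C with full error handling.

    # Minimal userfaultfd sketch (Linux x86_64 only; numeric constants are
    # assumptions copied from linux/userfaultfd.h for this architecture).
    import ctypes, fcntl, mmap, os, struct

    NR_USERFAULTFD = 323                     # x86_64 syscall number
    UFFD_API = 0xAA
    UFFDIO_API = 0xC018AA3F                  # _IOWR(0xAA, 0x3F, struct uffdio_api)
    UFFDIO_REGISTER = 0xC020AA00             # _IOWR(0xAA, 0x00, struct uffdio_register)
    UFFDIO_REGISTER_MODE_MISSING = 1 << 0

    libc = ctypes.CDLL(None, use_errno=True)
    uffd = libc.syscall(NR_USERFAULTFD, os.O_CLOEXEC | os.O_NONBLOCK)
    if uffd < 0:
        raise OSError(ctypes.get_errno(), "userfaultfd() failed")

    # Handshake: agree on the API version with the kernel.
    fcntl.ioctl(uffd, UFFDIO_API, struct.pack("QQQ", UFFD_API, 0, 0))

    # Map an anonymous region and register it for MISSING-page notifications.
    length = 16 * mmap.PAGESIZE
    region = mmap.mmap(-1, length)
    addr = ctypes.addressof(ctypes.c_char.from_buffer(region))
    fcntl.ioctl(uffd, UFFDIO_REGISTER,
                struct.pack("QQQQ", addr, length, UFFDIO_REGISTER_MODE_MISSING, 0))

    # From here a monitor thread would read struct uffd_msg events from `uffd`
    # and resolve each fault with UFFDIO_COPY, installing the page it fetched
    # from the migration source, which is what QEMU does during post-copy.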

Achievement 3 – Post-copy Live Migration
QEMU modifications and mechanism for post-copy live migration
Challenge Utilize the kernel user fault page handling mechanism to implement a post-copy live migration mechanism that completes migration even in cases where pre-copy fails.
Proposition Modifications made to the QEMU migration process to utilise the Linux kernel’s userfaultfd mechanism and request pages as the QEMU process is running.
Validation Results
Release Model Open source (GPLv2 or later) merged in the main QEMU tree
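
For illustration, the following sketch drives the resulting upstream QEMU flow by hand over QMP. The socket path and destination URI are placeholders, and the capability name assumes a QEMU release where post-copy is no longer experimental (older releases used the x-postcopy-ram spelling); a real client would also filter out asynchronous QMP events.

    # Hedged sketch: switch a running QEMU migration to post-copy via QMP.
    # Assumes QEMU was started with "-qmp unix:/tmp/qmp.sock,server,nowait".
    import json, socket

    sock = socket.socket(socket.AF_UNIX)
    sock.connect("/tmp/qmp.sock")            # placeholder socket path
    chan = sock.makefile("rw")

    def qmp(command, **arguments):
        """Send one QMP command and return the next line of the reply stream."""
        chan.write(json.dumps({"execute": command, "arguments": arguments}) + "\n")
        chan.flush()
        return json.loads(chan.readline())

    chan.readline()                          # consume the QMP greeting banner
    qmp("qmp_capabilities")                  # leave capabilities-negotiation mode
    qmp("migrate-set-capabilities",
        capabilities=[{"capability": "postcopy-ram", "state": True}])
    qmp("migrate", uri="tcp:dest-host:4444") # begin as a normal pre-copy migration
    # After the first RAM iteration, flip the same migration into post-copy mode:
    qmp("migrate-start-postcopy")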

Achievement 4 – Post-copy Live Migration Feasible through Libvirt
Integration of the QEMU post-copy live migration capabilities into Libvirt
Challenge The QEMU hypervisor (which implements the post-copy capability) is rarely used on its own, and is normally controlled by another layer (i.e. Libvirt), which needs modifications to use the added post-copy live migration features.
Proposition Libvirt live migration has been extended to allow not only pre-copy live migration but also the new post-copy migration. To support this, the live-migration flow has been modified accordingly when post-copy flags are used.
Validation Results ORBIT deliverable D3.3.3
Release Model Open source available at Libvirt official repositories
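
A minimal sketch of the resulting flow with the libvirt Python bindings is shown below; the connection URIs and domain name are placeholders, and it assumes a libvirt release that exposes the post-copy flags (1.3.3 or later).

    # Hedged sketch: post-copy live migration through libvirt's Python bindings.
    import libvirt

    src = libvirt.open("qemu:///system")
    dom = src.lookupByName("guest01")                    # placeholder domain name

    flags = (libvirt.VIR_MIGRATE_LIVE
             | libvirt.VIR_MIGRATE_PEER2PEER
             | libvirt.VIR_MIGRATE_POSTCOPY)             # allow post-copy for this job

    # migrateToURI3() blocks until the migration finishes, so in practice it runs
    # in its own thread; once the first pre-copy pass is done, another thread (or
    # `virsh migrate-postcopy`) calls dom.migrateStartPostCopy(0) to switch over.
    dom.migrateToURI3("qemu+ssh://dest-host/system", {}, flags)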

Achievement 5 – Post-copy Live Migration Feasible through OpenStack
Integration of the Libvirt post-copy migration capabilities into OpenStack
Challenge OpenStack manages an entire cloud (as opposed to the single machine that Libvirt and QEMU deal with). Modifications are required to use the new Libvirt post-copy live migration mechanism at the OpenStack level.
Proposition The new post-copy live migration mechanism is offered as an extension to the OpenStack live-migration API, allowing admins to use either pre- or post-copy live migration based on their needs. This new functionality is offered through the Nova API, the Nova (Python) client and the web interface (Horizon), and is based on an automatic switch to post-copy (if the post-copy flag is used) after the first memory copying iteration.
Validation Results ORBIT deliverable D3.3.3
Release Model
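
For context, the hedged sketch below shows the standard upstream Nova live-migration call that this extension builds on, using python-novaclient with legacy v2 authentication (as in the Juno timeframe); credentials and host names are placeholders, and the post-copy option added by ORBIT is not shown because its exact parameter is project-specific.

    # Hedged sketch: triggering a live migration through the Nova API with
    # python-novaclient (legacy v2 auth).
    from novaclient import client

    nova = client.Client("2", "admin", "secret", "admin",
                         "http://controller:5000/v2.0")   # placeholder credentials
    server = nova.servers.find(name="guest01")            # placeholder instance

    # Upstream live migration; the ORBIT extension adds a post-copy option to
    # this flow and switches to post-copy after the first memory iteration.
    server.live_migrate(host="compute-2", block_migration=False,
                        disk_over_commit=False)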

Achievement 6 – Externalisation and Consolidation of I/O Resources feasible through OpenStack
Integration of the I/O Hypervisor capabilities into OpenStack
Challenge To make use of the I/O Hypervisor functionality at the OpenStack level for block device management, the OpenStack volume attachment flow needs to be modified.
Proposition OpenStack Nova has been extended with a new component, named nova-IORCL, in charge of the I/O Hypervisor integration at the OpenStack level. This component is deployed on the I/O Hypervisor servers, and is in charge of attaching the block devices to the I/O Hypervisor (instead of to the VM’s host) and then linking them to the VMs.
Validation Results ORBIT deliverable D3.3.3
Release Model Open source available online

Application-transparent virtual machine fault tolerance

A software-based fault-tolerance system has been developed on the Linux/KVM virtualization platform, including tools for continuous state synchronization and memory consistency based on a hybrid COLO/checkpointing approach introduced by ORBIT. Moreover, mechanisms have been delivered enabling the integration of external resources (I/O and memory) into the developed fault-tolerance system, so as to track and transfer state changes. This objective also covers APIs to cloud software (i.e. OpenStack) to support the developed fault-tolerance capabilities.

Achievement 7 – Software-based Fault Tolerance
Hybrid fixed and COLO mode checkpointing
Challenge COLO avoids frequent checkpointing in many cases and thus increases guest performance, at the cost of using CPU on two systems. Simple checkpointing, on the other hand, requires frequent checkpoints and has a higher overhead.
Proposition A hybrid mode that switches dynamically between COLO checkpointing and a simple fixed-length (periodic) checkpointing scheme, based on the behaviour of the workload.
Validation Results
  • ORBIT deliverable D4.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source (GPLv2 or above) available on GitHub
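
The switching policy itself is workload-driven; the sketch below is only a simplified illustration of such a policy (the miscompare-rate criterion, thresholds and all names are assumptions, not the ORBIT implementation).

    # Toy hybrid policy: run COLO while outputs mostly match, else fall back
    # to periodic checkpointing. All constants are illustrative assumptions.
    import time

    FIXED_PERIOD = 0.1           # seconds between checkpoints in fixed mode
    MISCOMPARE_THRESHOLD = 20.0  # miscompares/second above which COLO stops paying off

    class HybridFtPolicy:
        def __init__(self):
            self.mode = "colo"
            self.miscompares = 0
            self.window_start = time.monotonic()

        def on_packet_miscompare(self):
            """Called when the primary/secondary network output diverges (COLO mode)."""
            self.miscompares += 1
            self._maybe_switch()

        def _maybe_switch(self):
            # Re-evaluate roughly once per second of observed behaviour; a real
            # policy would also re-evaluate periodically to switch back to COLO.
            elapsed = time.monotonic() - self.window_start
            if elapsed < 1.0:
                return
            rate = self.miscompares / elapsed
            self.mode = "fixed" if rate > MISCOMPARE_THRESHOLD else "colo"
            self.miscompares, self.window_start = 0, time.monotonic()

        def next_checkpoint_due(self, last_checkpoint):
            """Periodic checkpoints in fixed mode; miscompares drive them in COLO mode."""
            return last_checkpoint + FIXED_PERIOD if self.mode == "fixed" else None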

Achievement 8 – High Performance Fault Tolerance
RDMA transport for checkpoint stream
Challenge Rapid checkpointing, or checkpointing of busy VMs whose memory changes rapidly, can be slowed down by the time taken to transfer the checkpoint.
Proposition An RDMA transport (instead of TCP) has been integrated into the COLO checkpointing system to carry the checkpoint data.
Validation Results
  • ORBIT deliverable D4.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source (GPLv2 or above) available on GitHub

Achievement 9 – Fault detection for fault-tolerant VM pairs
Detection of failure in VM pairs
Challenge A robust fault-detection mechanism is required to detect failures in the pair of VMs that are in fault-tolerant mode.
Proposition A distributed fault detection mechanism is used that detects failures of either of the VMs in the fault-tolerant pair. This information is communicated back through the OpenStack layer, which triggers a failover.
Validation Results
  • ORBIT deliverable D4.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source (GPLv2 or above) available on GitHub
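
As a rough illustration of the idea (not the ORBIT code), a heartbeat-style monitor could look like the sketch below; the port, timeouts and the callback are assumptions.

    # Illustrative heartbeat detector: probe the peer of the FT pair over UDP
    # and report a failure upward (e.g. to the OpenStack layer) after several
    # consecutive silent probes.
    import socket, time

    PROBE_INTERVAL = 0.5        # seconds between probes (assumed)
    MAX_MISSED = 4              # consecutive misses before declaring failure

    def monitor(peer, port=9999, on_failure=lambda host: print("failover:", host)):
        """Probe `peer`; call on_failure() after MAX_MISSED unanswered probes."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(PROBE_INTERVAL)
        missed = 0
        while True:
            try:
                sock.sendto(b"ping", (peer, port))
                sock.recvfrom(16)           # any reply counts as a heartbeat
                missed = 0
            except socket.timeout:
                missed += 1
                if missed >= MAX_MISSED:
                    on_failure(peer)        # trigger the failover path
                    return
            time.sleep(PROBE_INTERVAL)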

Achievement 10 – Fault Tolerance feasible through Libvirt
Integration of the fault-tolerance features into a modified Libvirt layer
Challenge The QEMU hypervisor (which implements the checkpointing) is rarely used on its own, and is normally controlled by another layer (i.e. Libvirt), which needs modifications to use the added fault-tolerance features.
Proposition Libvirt modifications have been added to start the fault-tolerant machines, gather statistics and perform the failover.
Validation Results
  • ORBIT deliverable D4.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source (GPLv2 or above) available on GitHub

Achievement 11 – Fault Tolerance feasible through OpenStack
Integration into OpenStack of the fault-tolerance features and of the I/O Hypervisor support for fault-tolerant VM pairs
Challenge OpenStack manages an entire cloud (as opposed to the single machine that Libvirt and QEMU deal with), and modifications are required to schedule the two VMs, allow them both to run together (even though they are actually running the same guest), and interface with the fault detection code. Moreover, given that the I/O Hypervisor talks to the guest directly over a modified network protocol, working with a fault-tolerant pair requires care with the protocol.
Proposition OpenStack modifications to perform all the setup and management of the fault-tolerant pair and handle the full life cycle of operation including failover. Operation of a fault-tolerant pair is available to the user via OpenStack’s standard management interface. Modifications to the I/O Hypervisor protocol allow it to work with the COLO packet comparison code and Linux firewalling.
Validation Results
  • ORBIT deliverable D4.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source available online for different OpenStack components:

Metro-area zero downtime disaster recovery

ORBIT surveyed the technologies used to build MANs and their characteristics (i.e. distance, latency, and bandwidth), considering also the restoration of MAN connectivity in case of a network failure, and studied the characteristics of MANs in terms of the applications running on them and the traffic they generate. The goal of this analysis was to determine what data and metadata are needed to resume a workload in another cloud while giving the user a perception of near-zero downtime. It was identified that providing disaster recovery using the same techniques as for LAN-based fault tolerance is not possible for large-scale deployments of virtual machines. Hence, to provide a practical solution (as opposed to a theoretical one), a solution has been developed that achieves near-zero downtime through data and state synchronization across data centers. The end-to-end solution has been integrated with OpenStack to allow “protection” and “recovery” actions to be initiated from the OpenStack dashboard (Horizon). The implemented solution supports recovery of VMs, storage devices, and network connectivity.

Achievement 12 – End-to-end Disaster Recovery
Disaster recovery solution across data centers
Challenge A job may consist of a number of interacting VMs, their data volumes, and network topology. These need to be recreated in a secondary data centre with minimum downtime and data loss in case of a failure in the primary data centre.
Proposition An end-to-end scenario of recovery of VMs with volume snapshots and with asynchronous volume replication, plus network topology, including automatic fault detection and re-routing of network traffic.
Validation Results
  • ORBIT deliverable D5.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source (Apache 2.0 license) available online for different components of the solution:

Achievement 13 – Protection of VMs and Data
Protection of a primary datacenter's VMs and data at a backup site
Challenge Implementation of protection mechanisms that reduce the amount of data that needs to be sent and/or the amount of data that may be lost in case of a sudden datacenter failure.
Proposition Mechanisms to protect both VMs and volumes, including basic snapshot-based drivers and more advanced techniques that reduce the impact of the protection actions. For instance, image-copy avoids snapshot interference on running VMs and minimizes the network bandwidth required by the protection action in subsequent calls. Similarly, volume replication minimizes the data loss in case of a failure, as well as the time needed to recreate the volumes during recovery.
Validation Results
  • ORBIT deliverable D5.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source (Apache 2.0 license) available online for different components of the solution:
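
As a point of reference, the simplest snapshot-based protection can already be scripted against the standard Cinder API, as in the hedged sketch below (credentials, names and the interval are placeholders, using legacy v2 authentication); the ORBIT image-copy and replication drivers build on this to reduce interference with running VMs and data loss.

    # Hedged sketch: a naive snapshot-based protection loop with python-cinderclient.
    import time
    from cinderclient import client

    cinder = client.Client("2", "admin", "secret", "admin",
                           "http://controller:5000/v2.0")   # placeholder credentials

    def protect(volume_id, interval=600):
        """Create a protection point for `volume_id` every `interval` seconds."""
        while True:
            snap = cinder.volume_snapshots.create(
                volume_id,
                force=True,                                  # volume may be attached
                name="dr-protect-%d" % int(time.time()))
            print("created protection point", snap.id)
            time.sleep(interval)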

Achievement 14 – Recovery of VMs and Data
Recovery of a primary datacenter's VMs and data at a backup site
Challenge To recover a group of VMs and volumes, it is not enough to have the data to recover from; the VMs and volumes must also be created and connected in the same way they were in the primary datacenter (i.e. including information about network configurations, as well as the relationships between the different components).
Proposition A mechanism to enable the recreation of VMs, volumes, networks and other configurations (such as the relations between VMs, volumes and networks) in a backup site, based on the available (protected) information.
Validation Results
  • ORBIT deliverable D5.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source (Apache 2.0 license) available online for different components of the solution:

Achievement 15 – Disaster Detection
Robust datacenter disaster detection in a geographically distributed environment
Challenge A robust disaster-detection mechanism is required to detect failures that affect a complete datacenter and trigger the needed recovery actions.
Proposition A distributed disaster-detection mechanism is used that detects failures of complete datacenters in a reliable way (i.e. avoiding false positives while being fast enough). This information is communicated to the DR-Orchestration component through new OpenStack APIs, which trigger the required recovery actions.
Validation Results
  • ORBIT deliverable D5.3.2
  • Validation results are also provided in the D3.3.3 deliverable.
Release Model Open source (Apache 2.0 license) available online for different components of the solution:
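
A toy illustration of the reliability aspect (not the ORBIT detector): a site is only declared failed when a quorum of independent observers agrees, which keeps false positives low before recovery is triggered.

    # Illustrative quorum check for datacenter-level failure detection.
    def datacenter_failed(observer_reports, quorum=None):
        """observer_reports maps observer name -> True if the site looks unreachable."""
        quorum = quorum or (len(observer_reports) // 2 + 1)
        unreachable = sum(1 for down in observer_reports.values() if down)
        return unreachable >= quorum

    # Example: two of three monitoring sites report the primary as unreachable,
    # so the DR-Orchestration layer would be asked to start recovery.
    assert datacenter_failed({"site-A": True, "site-B": True, "site-C": False})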

Achievement 16 – Disaster Recovery feasible through OpenStack
Automation and optimization of protection and recovery actions
Challenge To recover a complete datacenter, information needs to be sent from the primary to the backup datacenter from time to time, using the protection mechanisms described above. However, there is a trade-off between the frequency of these updates and the impact that the replication data may have on normal datacenter operation.
Proposition A new component (namely DR-Orchestration) integrated in OpenStack to automate, simplify and optimize the protection and recovery actions when using the DR-Engine. This component is in charge of:

  1. Deciding when new information needs to be replicated to the backup site (i.e. a new call to the DR-Engine protection API)
  2. Making it easier for users to protect their VMs/data, by having to issue just one command/click
  3. Minimizing the impact of the replication flows on normal datacenter operation through a control-theory-based network bandwidth allocation solution (see the sketch below)
  4. Offering an API for the recovery of the protected policy in the backup site

Moreover, since volume replication relies on DRBD to provide the required low-level replication capabilities, DRBD has been integrated into OpenStack so that volumes can be created by Cinder and replicated at the backup site. The DRBD driver manager has been updated to adapt it to the OpenStack Juno release.

Validation Results ORBIT deliverable D5.3.2
Release Model Open source (Apache 2.0 license) available online for different components of the solution:
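
To make item 3 above concrete, the sketch below shows one way a simple proportional controller could cap the replication bandwidth so that production traffic keeps most of the inter-datacenter link; all constants and names are illustrative assumptions, not the ORBIT controller.

    # Illustrative proportional controller for throttling DR replication traffic.
    LINK_CAPACITY_MBPS = 1000.0     # assumed inter-datacenter link capacity
    MIN_REPLICATION_MBPS = 50.0     # never starve replication completely
    GAIN = 0.5                      # proportional gain

    def next_replication_limit(current_limit, measured_production_mbps):
        """One control step: move the replication cap towards the spare capacity."""
        spare = max(LINK_CAPACITY_MBPS - measured_production_mbps, 0.0)
        new_limit = current_limit + GAIN * (spare - current_limit)
        return max(MIN_REPLICATION_MBPS, min(new_limit, LINK_CAPACITY_MBPS))

    # Example: as production traffic ramps up, the replication cap backs off.
    limit = 400.0
    for production in (200.0, 600.0, 900.0):
        limit = next_replication_limit(limit, production)
        print("production=%4.0f Mbps -> replication cap=%3.0f Mbps" % (production, limit))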

Achievement 17 – Flexible Data Dashboard To Report Downtime and other KPIs
Creation of a stand-alone, customizable dashboard for downtime monitoring
Challenge Provide a simple way to track and monitor IT availability, including measurement of newly introduced modules.
Proposition Provide a new way to report the availability of IT resources (occurrence, length and severity of outages vs. availability) for a variety of user groups beyond technical experts/admins.
Validation Results ORBIT deliverable D6.3.3
Release Model Open source available online