
Project Goals and Demos: General overview

Context

This page provides a brief overview of the project. For specific results of the ORBIT project and instructions on using the open-source software, please refer to the respective pages.

ORBIT delivers an enhanced cloud infrastructure, integrating Fault Tolerance and Disaster Recovery capabilities to effectively address the needs of mission-critical services. To this end, the ORBIT project pursued four key goals:

  • Highly Available Consolidation of Virtualized Resources

  • Metro-Area Zero Downtime Disaster Recovery

  • Application Transparent Virtual Machine Fault Tolerance

  • Flexible Status Monitoring for non-technical audiences

 

Highly Available Consolidation of Virtualized Resources

This goal builds on the concept of server virtualization to enable guest VMs to consume remote memory and I/O resources in a consolidated manner. Advances in network fabric technology have significantly reduced network latencies, giving rise to a fundamental innovation introduced by ORBIT: the consolidation of guest Virtual Machine (VM) resources (such as memory and various I/O resources), which, rather than being provided by the host’s local resources, are provided by dedicated remote hosts accessible via modern low-latency networks.

But how does this new paradigm of virtualized resource consolidation aid in the implementation of High Availability (HA) solutions? Existing virtualization-based Fault Tolerance implementations (such as Kemari and Remus) incur major performance penalties due to VM checkpointing and state synchronization of the passive VM in an active-passive topology, while the alternative approach of lockstep processing in a primary-secondary topology offered by VMware has no support for Symmetric Multi-Processing (SMP) guest VMs.

One way of reducing the cost of Kemari’s state synchronization, without impairing the generality of the solution, is to reduce part of the VM state by means of virtualized resource consolidation. Externalization (from the perspective of the guest VM) and consolidation of virtualized I/O is carried out by transferring I/O operations requested by the VMs to a dedicated remote server responsible for executing the requests. Beyond the general advantages of consolidation (such as better hardware utilization), this architecture enables moving part of the memory used as disk cache to the remote server. Externalizing and consolidating the memory of guest virtual machines consists of storing memory contents in the physical memory of remote machines and retrieving them upon access. To externalize guest memory, the goal is to extend the Linux kernel with user-space page fault support, which can be exploited to implement a network-accessible memory pool.
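The idea of storing memory contents remotely and retrieving them upon access can be illustrated with a small simulation. This is only a sketch under stated assumptions, not ORBIT's implementation: the class names `RemotePool` and `GuestMemory`, the `resident_limit` parameter, the LRU eviction policy, and the 4096-byte page size are all hypothetical, and a plain dictionary stands in for the remote host's physical memory and the network in between. The `_fault_in` method plays the role that a user-space page fault handler would play in a real system.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes per page; illustrative value, not taken from ORBIT


class RemotePool:
    """Stands in for the physical memory of a dedicated remote host."""

    def __init__(self):
        self._pages = {}

    def store(self, page_no, data):
        self._pages[page_no] = data

    def fetch(self, page_no):
        # Pages that were never stored read back as zeros.
        return self._pages.get(page_no, bytes(PAGE_SIZE))


class GuestMemory:
    """Keeps at most `resident_limit` pages locally; the rest live remotely."""

    def __init__(self, pool, resident_limit):
        self.pool = pool
        self.resident_limit = resident_limit
        self.resident = OrderedDict()  # page_no -> bytearray, in LRU order
        self.faults = 0

    def _fault_in(self, page_no):
        # Hypothetical stand-in for a user-space page fault handler:
        # evict the least recently used page, then fetch the missing one.
        self.faults += 1
        if len(self.resident) >= self.resident_limit:
            victim, data = self.resident.popitem(last=False)
            self.pool.store(victim, bytes(data))
        self.resident[page_no] = bytearray(self.pool.fetch(page_no))

    def _page(self, page_no):
        if page_no not in self.resident:
            self._fault_in(page_no)
        self.resident.move_to_end(page_no)  # mark as most recently used
        return self.resident[page_no]

    def write(self, addr, value):
        self._page(addr // PAGE_SIZE)[addr % PAGE_SIZE] = value

    def read(self, addr):
        return self._page(addr // PAGE_SIZE)[addr % PAGE_SIZE]

    def local_state_bytes(self):
        # Only resident pages count towards the state held on the host.
        return len(self.resident) * PAGE_SIZE
```

In this toy model, a guest that touches four pages with `resident_limit=2` still reads back every value it wrote, while `local_state_bytes()` never exceeds two pages. That is the property the text relies on: the less state that lives on the host, the less there is to synchronize to the passive VM.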

 

Metro-Area Zero Downtime Disaster Recovery

This goal aims to improve business continuity by distributing resources geographically over a Metropolitan Area Network, whilst maintaining the desired KPIs of instantaneous fail-over with near-zero downtime, so that services keep running even when a major fault brings down an entire site. ORBIT developed the technologies necessary to build a cloud infrastructure for delivering disaster recovery services. In addition to maintaining the cloud spirit (e.g., virtualisation, pay-per-use, scalability), ORBIT addresses high performance of the running services, efficient consumption of inter-site network resources, and fault detection with automatic and semi-automatic means for triggering fail-over, and it elaborates means for handling seamless traffic redirection.

Application Transparent Virtual Machine Fault Tolerance

This goal aims to improve on current solutions, which are either application-specific, limited to uniprocessor workloads, short of the required performance targets, or dependent on proprietary hardware, by providing a software-only solution that can be widely deployed on commodity hardware. None of the existing FT offerings is suitable for wide cloud deployment: some require dedicated hardware, while others lack performance or generality. In the absence of cloud-worthy FT solutions, users cannot move their mission-critical services into the cloud, which impairs wider adoption of the cloud. FT capabilities are also fundamental for addressing the evolution of ICT towards more cost-effective platforms. ORBIT addresses the fundamental technical barriers for VM FT and develops new technologies to overcome them. ORBIT improves on existing software solutions by reducing performance overhead, and it also provides application-agnostic FT, as its design approach does not impair the generality of VM live migration, which serves as a foundation for handling VM state synchronization. Moreover, ORBIT supports commodity hardware: its approach is a software state synchronization solution with no requirement for specialized hardware, in contrast to products such as Stratus ftServer, which requires proprietary hardware to handle the synchronization of non-deterministic I/O events such as clock, board temperature, and fan speed.

 

Flexible Status Monitoring for non-technical audiences

Tools to monitor the availability of an IT set-up already exist. Commonly used solutions are Nagios/Icinga and a range of newer, in-browser solutions. These applications monitor by the second, resulting in a wealth of information. Their goal is clear: avoid or detect outages, performance problems and other issues for a technical audience.

But what about the many other stakeholders in an organization who depend on available IT resources? There is hardly any company that can perform its tasks when the cloud or any other critical system is down. The effects of outages range from inconvenience (not being able to access a certain resource) to severe financial losses (e.g. the ticket system of an airline) to life-threatening safety risks (e.g. a nuclear plant spinning out of control).

The thinking behind the flexible dashboard created in the ORBIT project was to create a simple layer of aggregated information that caters to the non-technical audiences of any organization. The design principle was to simplify set-up and updates, and to make it possible for anybody to collect, display and communicate critical key performance indicators.

Technically, the system was built using AngularJS and simple "widgets" able to collect and display (mostly aggregated) numbers. Ideally, the information gathered can be displayed on big screens to provide quick, easy-to-digest information at a glance.
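The widget idea of collecting raw numbers and displaying a single aggregated figure can be sketched as follows. The actual dashboard was built with AngularJS; this Python sketch is not its code and only illustrates the principle. The `Widget` class, its fields, and the sample readings are all hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List


@dataclass
class Widget:
    """Hypothetical widget: collects samples and shows one aggregated number."""

    title: str
    source: Callable[[], float]  # assumed data source, e.g. a monitoring query
    aggregate: Callable[[List[float]], float] = mean
    unit: str = ""
    samples: List[float] = field(default_factory=list)

    def collect(self):
        # Pull one reading from the data source and remember it.
        self.samples.append(float(self.source()))

    def render(self):
        # Reduce all samples to a single figure that is easy to read at a glance.
        if not self.samples:
            return f"{self.title}: no data"
        return f"{self.title}: {self.aggregate(self.samples):.1f}{self.unit}"


# Usage with made-up readings standing in for a real monitoring backend.
readings = iter([99.0, 100.0, 98.0])
availability = Widget("Availability", lambda: next(readings), unit="%")
for _ in range(3):
    availability.collect()
print(availability.render())
```

Swapping `aggregate` for `max`, `min` or another reducer changes what the widget reports without touching how data is collected, which is one way a single widget type could serve several key performance indicators.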

Creating Widgets on the Fly


More about the dashboard can be found on the specific page for this outcome.