11 research outputs found

    OpenSAF and VMware from the Perspective of High Availability

    The cloud is becoming one of the most popular means of delivering computational services to users who demand services with high availability. Virtualization is one of the key enablers of the cloud infrastructure. The availability of the virtual machines, along with the availability of the hosted software components, is a fundamental ingredient for achieving highly available services in the cloud. Some availability solutions have been developed by virtualization vendors, such as VMware HA and VMware FT. At the same time, the SA Forum specifications and OpenSAF, as a compliant implementation, offer a standards-based open solution for service high availability. Our work aims at comparing VMware's virtualization solutions with OpenSAF from the high availability perspective, and proposes appropriate combinations to take advantage of the strengths of both solutions. To conduct our evaluations, we established metrics, selected a video streaming application, and conducted experiments on different architectures covering OpenSAF in physical and virtual machines, VMware HA, and VMware FT. Based on the analysis of the initial measurements, we proposed other architectures that combine OpenSAF high availability with the virtualization provided by VMware. Our proposal includes architectures targeting two types of hypervisors, non-bare-metal and bare-metal. In both of these proposed architectures, we used OpenSAF to manage the availability of the VM and of the case study application running in the VM. The management of the availability of the VM differs slightly between these architectures because of the hypervisor types. In these architectures we used libraries and mechanisms that are also available in many other hypervisors. Compared to other work on high availability in virtual environments, ours has the important advantage of also covering application/service failures.
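
    The abstract does not detail the hypervisor libraries used; as a rough illustration only, the sketch below shows how an availability manager could poll and restart a guest VM through the libvirt Python bindings on a QEMU/KVM host. The VM name, polling period, and connection URI are assumptions, and this is not the mechanism implemented in the thesis.

# Hypothetical health-check loop for a VM, using the libvirt Python bindings.
# Illustrative only; the domain name and polling period are made-up parameters.
import time
import libvirt   # requires the libvirt-python bindings

VM_NAME = "ha-vm"        # hypothetical name of the protected virtual machine
POLL_INTERVAL = 5        # seconds between health checks (assumed value)

def monitor_and_restart(uri: str = "qemu:///system") -> None:
    conn = libvirt.open(uri)                   # connect to the local hypervisor
    try:
        while True:
            dom = conn.lookupByName(VM_NAME)   # raises libvirtError if undefined
            if not dom.isActive():             # the VM has failed or was stopped
                dom.create()                   # restart the defined, inactive domain
            time.sleep(POLL_INTERVAL)
    finally:
        conn.close()

if __name__ == "__main__":
    monitor_and_restart()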

    Elastic Highly Available Cloud Computing

    High availability and elasticity are two key technical features of cloud computing services. Elasticity is a feature of cloud computing where the provisioning of resources is closely tied to the runtime demand. High availability assures that cloud applications are resilient to failures. Existing cloud solutions focus on providing both features at the level of the virtual resource, through virtual machines, by managing their restart, addition, and removal as needed. These existing solutions map applications to a specific design, which is not suitable for many applications, especially virtualized telecommunication applications that are required to meet carrier-grade standards. Carrier-grade applications typically rely on the underlying platform to manage their availability by monitoring heartbeats, executing recoveries, and attempting repairs to bring the system back to normal. Migrating such applications to the cloud can be particularly challenging, especially if the elasticity policies target the application only, without considering the underlying platform contributing to its high availability (HA). In this thesis, a Network Function Virtualization (NFV) framework is introduced, and the challenges and requirements of its use in mobile networks are discussed. In particular, an architecture for the NFV framework entities in the virtual environment is proposed. In order to reduce signaling traffic congestion and achieve better performance, a criterion to bundle multiple functions of the virtualized evolved packet core in a single physical device, or in a group of adjacent devices, is proposed. The analysis shows that the proposed grouping can reduce the network control traffic by 70 percent. Moreover, a comprehensive framework for the elasticity of highly available applications is proposed that considers the elastic deployment of the platform and the HA placement of the application's components. The approach is applied to an IP Multimedia Subsystem (IMS) application and demonstrates how, within a matter of seconds, the IMS application can be scaled up while maintaining its HA status.

    Openicra: towards a generic model for automated deployment of applications in the cloud

    Cloud computing is an approach that refers to making computing resources available on demand, either over the Internet or privately via a company's internal network, with a pay-per-use billing model. The IaaS (Infrastructure as a Service) paradigm provides users with on-demand, self-service access to a virtualized computing pool, often composed of virtual machines on which users can install, control, and customize their applications. Alternatively, the PaaS (Platform as a Service) service model offers users an immediately available and fully managed programming environment for creating and deploying scalable applications in the cloud without user intervention. Although choosing such an environment may seem quite advantageous, several challenges stand in the way of the effective use of cloud systems. Cloud services are offered at different levels of abstraction, where cloud providers expose access to their services through proprietary APIs. This encourages vendor lock-in and limits the interoperability of cloud services, which constitutes a significant barrier to entry for cloud users. Several solutions based on intermediate layers have been proposed to isolate applications from the variability of some of the services offered by cloud providers. However, these approaches are only a partial solution to the problem when they rely on proprietary technologies, as they risk shifting the lock-in effect from the provider to the deployment tools. The main objective of our research is to design and develop a new generic model for the automated deployment of applications in the cloud, in order to mitigate the effects of these barriers to entry, reduce the complexity of application development, and simplify the process of deploying services in the cloud. Moreover, automatically supporting and deploying applications in the cloud while ensuring elasticity, automatic scaling, and interoperability with all platforms, and while optimizing storage management, are the primary objectives of this thesis. Our proposed model, OpenICRA, implements a layered architecture that hides the implementation details, allowing for a simple deployment process. Furthermore, unlike other available cloud solutions such as Google App Engine, Windows Azure, or Amazon Elastic Beanstalk, the components of the proposed model are open source. This allows us to guarantee the portability of applications to any execution environment, avoid vendor lock-in, and facilitate the automation of applications in the cloud. The redundancy and scaling methods, together with the integration of a distributed file system with the IaaS layer, ensure the high availability, scalability, and extensibility of the model and the applications, as well as optimized storage management of the VMs' virtual hard disks in any cloud environment. We carried out two real case studies to validate the OpenICRA model: the first consists in automating the deployment of the OpenSAF distributed middleware in a cluster of nodes within the cloud environment of the GSN network (Synchromedia, 2010), while the second consists in migrating the ICRA collaborative work application (Cheriet, 2012) to the Amazon EC2 cloud (Amazon, 2012a). Our empirical results demonstrate the effectiveness of the proposed model for deploying different types of applications without any modification to their source code. In addition, we show how our proposal can automate and orchestrate the application deployment process and optimize application execution with respect to performance in heterogeneous cloud environments.
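
    The following sketch only illustrates the kind of provider-agnostic deployment descriptor and driver interface that such a generic deployment model implies; the field names, classes, and deploy() function are hypothetical and are not OpenICRA's actual API.

# Hypothetical, provider-agnostic deployment descriptor and driver interface,
# sketching the kind of abstraction a generic deployment model implies.
# All names and fields here are illustrative, not OpenICRA's real interfaces.
from dataclasses import dataclass
from typing import Protocol

@dataclass
class AppDescriptor:
    name: str
    image: str            # VM or container image holding the unmodified app
    min_instances: int    # lower bound used by the auto-scaling policy
    max_instances: int    # upper bound used by the auto-scaling policy
    shared_storage: str   # mount point backed by a distributed file system

class CloudDriver(Protocol):
    """One implementation per target environment (private cluster, EC2, ...)."""
    def provision(self, descriptor: AppDescriptor, count: int) -> list[str]: ...
    def destroy(self, instance_ids: list[str]) -> None: ...

def deploy(app: AppDescriptor, driver: CloudDriver) -> list[str]:
    # The same descriptor is handed to any driver, so the application code
    # never changes when the target cloud changes.
    return driver.provision(app, app.min_instances)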

    Monitoring Service Level Workload of Highly Available Applications

    Elasticity is a key feature of cloud computing and a major contributor to its popularity. Elasticity is defined as the automatic provisioning/de-provisioning of resources to match workload changes over time. Service high availability (HA) is one of cloud computing's major challenges. HA is defined as providing a minimum of 99.999% service availability. Maintaining service HA while scaling in/out is even more challenging. Recently, an architecture has been proposed for managing HA. Following the proposed architecture, an Elasticity Engine has been introduced that is capable of managing resources based on application-level provisioning or de-provisioning alerts while preserving HA. In contrast to the prevailing monitoring solutions, which provide the workload at the Virtual Machine (VM) level, the Elasticity Engine requires a monitoring solution that monitors the service-level workload and triggers alerts accordingly. In this thesis, we propose an approach and an architecture for monitoring HA applications at the service level. The monitoring approach starts with monitoring the application components in the traditional manner. The workload of each component is mapped to the component's respective service assignments. The resource usage of all the components providing a service is aggregated and mapped to the service-level workload using a distributed client-server architecture. This approach distinguishes between the different HA states (active or standby) that a component can be assigned at runtime, and it adapts to situations where switchovers happen under the control of the SA Forum middleware, for example due to failures. The proposed monitoring architecture has been implemented and integrated with the Elasticity Engine to test its effectiveness and overhead. It has been shown that the implemented and integrated prototypes achieve elasticity in a cluster based on the service-level workload while keeping the monitoring overhead within 5% of the total resources.
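
    As a rough illustration of the mapping from component workload to service-level workload, the sketch below aggregates the CPU usage of the components actively providing each service and raises a scale-out alert above a threshold. The component names, figures, the threshold, and the choice to exclude standby assignments are assumptions, not the thesis prototype.

# Illustrative aggregation of component workload into service-level workload.
# Component names, CPU figures, and the alert threshold are made up; the real
# prototype integrates with the SA Forum middleware and the Elasticity Engine.
from collections import defaultdict

# component -> (service it is assigned to, HA state of that assignment)
assignments = {
    "comp-A1": ("video-service", "active"),
    "comp-A2": ("video-service", "standby"),
    "comp-B1": ("chat-service", "active"),
}

# component -> measured CPU utilisation (percent), e.g. collected per node
cpu_usage = {"comp-A1": 62.0, "comp-A2": 3.0, "comp-B1": 18.0}

SCALE_OUT_THRESHOLD = 60.0  # assumed per-service alert threshold

def service_level_workload() -> dict[str, float]:
    """Sum the usage of the components actively providing each service."""
    workload: dict[str, float] = defaultdict(float)
    for comp, (service, state) in assignments.items():
        if state == "active":               # standby assignments are excluded here
            workload[service] += cpu_usage.get(comp, 0.0)
    return dict(workload)

for service, load in service_level_workload().items():
    if load > SCALE_OUT_THRESHOLD:
        print(f"ALERT scale-out: {service} workload {load:.1f}%")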

    Design and Deployment of AMF Configurations in the Cloud

    With the ever-growing popularity of cloud computing, the trend of deploying applications in the cloud is stronger than ever. The cloud offers computing resources that can be provisioned as required and scaled according to the workload demand. This feature attracts service providers to deploy their applications in the cloud. As users continue to rely more on the services provided by these applications, it is essential to keep the applications running with minimal service outage. The Service Availability Forum (SA Forum) has defined the Availability Management Framework (AMF), which can be used to manage service availability. AMF is agnostic to the services provided by the applications; it manages the service availability of applications by orchestrating their redundant entities through a configuration called the AMF configuration. The design of AMF configurations for a physical cluster based on functional and non-functional requirements, such as a minimum level of service availability, has been proposed in the literature. In these solutions, the number of physical hosts required to deploy an application is given as input and resource utilization is not taken into consideration. However, for deploying applications in the cloud, the number of physical hosts is not fixed and should vary with the workload. Therefore, the issue of minimizing the number of physical hosts while meeting the requested level of service availability arises. In particular, service availability depends not only on the entities involved in providing the service but also on the interference caused by the collocation of entities. To minimize this interference, the collocated entities can be grouped into fault isolation units such as VMs. This in turn may increase the number of resources required. In this thesis, an approach to generate AMF configurations for the cloud is proposed. In this approach, a novel method is used to calculate the number of AMF entities that meets the availability and resource utilization requirements. In addition, a method to estimate service availability is proposed. It aims to predict service availability by considering the potential factors that affect availability, including the interference due to collocation. Furthermore, an approach to deploy AMF applications in the cloud is proposed. As a proof of concept, a prototype that demonstrates the generation and deployment of AMF configurations in an OpenStack cloud has been developed. This prototype includes the existing Monitoring and Elasticity Engine, previously developed in the MAGIC project.
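
    As a back-of-the-envelope illustration of sizing redundancy against an availability target (not the novel method proposed in the thesis), the following computes the smallest number of replicas n such that 1 - (1 - a)^n meets the target, assuming independent failures and ignoring collocation interference.

# Back-of-envelope sizing of redundant service units: NOT the thesis's method,
# just the textbook calculation under the (strong) assumption that replica
# failures are independent and that collocation interference is ignored.
import math

def min_replicas(unit_availability: float, target_availability: float) -> int:
    """Smallest n such that 1 - (1 - unit_availability)**n >= target."""
    if unit_availability >= target_availability:
        return 1
    n = math.log(1.0 - target_availability) / math.log(1.0 - unit_availability)
    return math.ceil(n)

# Example: units that are individually 99.9% available, five-nines target.
print(min_replicas(0.999, 0.99999))   # -> 2 replicas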

    Upgrade of lower layers in a High Availability environment

    Various industries, such as telecommunications and banking, require uninterrupted services throughout the year, including during system maintenance and upgrade operations. The Service Availability Forum (SA Forum) solution enables high availability of services even during maintenance and upgrade operations. This solution also enables the portability of applications across various platforms. The SA Forum defined a service, the Software Management Framework (SMF), that orchestrates the upgrade of an SA Forum managed system. To perform an upgrade, SMF requires an upgrade campaign. The solutions proposed in SMF are applicable only to the application layer, not to lower layers such as the operating system and the virtualization facilities, which include virtual machines (VMs) and virtual machine managers. On the other hand, the work done previously within the MAGIC project for the automatic generation of an upgrade campaign is limited to application entities only. The objective of this thesis is to propose solutions, in the context of SMF, for the upgrade of the lower layers as well, without impacting the availability of services. To accomplish this objective, we proposed three new upgrade steps that properly handle the dependencies between the layers of a machine during the upgrade. We also devised an approach for the automatic generation of an upgrade campaign for the lower layers. The extended SMF is capable of executing the generated upgrade campaign for upgrading the virtualization facilities, including VMs capable of live migration. The upgrade campaign generation approach has been implemented in a prototype tool as an Eclipse plug-in and tested with a case study.
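
    The sketch below illustrates, under simplifying assumptions, the general ordering such lower-layer upgrades follow: evacuate the host's VMs by live migration, upgrade the virtualization layer and the OS, then return the host to service. The classes and method names are hypothetical stand-ins and do not reproduce the three upgrade steps defined in the thesis.

# Illustrative ordering of lower-layer upgrade actions for one host. Classes,
# method names, and the capacity model are hypothetical; the thesis defines
# its own new upgrade steps inside SMF upgrade campaigns.
from dataclasses import dataclass, field

@dataclass
class VM:
    name: str

@dataclass
class Host:
    name: str
    vms: list = field(default_factory=list)
    capacity: int = 4

    def has_capacity(self) -> bool:
        return len(self.vms) < self.capacity

def live_migrate(vm: VM, src: Host, dst: Host) -> None:
    src.vms.remove(vm)                          # the VM keeps serving during the move
    dst.vms.append(vm)

def upgrade_host_lower_layers(host: Host, peers: list) -> None:
    # 1. Evacuate: live-migrate every VM so its services stay available.
    for vm in list(host.vms):
        target = next(p for p in peers if p is not host and p.has_capacity())
        live_migrate(vm, host, target)
    # 2. The host is now empty; upgrade the hypervisor/VMM, then the OS under it.
    print(f"upgrading hypervisor and OS on {host.name}")
    # 3. The host can then be unlocked and VMs rebalanced back onto it.

h1, h2 = Host("node-1", [VM("vm-a"), VM("vm-b")]), Host("node-2")
upgrade_host_lower_layers(h1, [h1, h2])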

    Managing High-Availability and Elasticity in a Cluster Environment

    Cloud computing is becoming popular in the computing industry. Elasticity and availability are two features often associated with cloud computing. Elasticity is defined as the automatic provisioning of resources when needed and the de-provisioning of resources when they are no longer needed. Cloud computing offers users the option of paying only for what they use and guarantees the availability of the virtual infrastructure (i.e., virtual machines). The existing cloud solutions handle both elasticity and availability at the virtual infrastructure level through the manipulation, restart, addition, and removal of virtual machines (VMs) as required. These solutions equate the application and its workload to the VMs that run the application. High-availability applications are typically composed of redundant resources and recover from failures through failovers, mostly managed by a middleware. For such applications, handling elasticity at the virtual infrastructure level through the addition and removal of VMs is not enough, as the availability management at the application level will not make use of the additional resources. This calls for new solutions that manage both elasticity and availability at the application level. In this thesis, we provide a solution to manage the elasticity and availability of applications based on a standard middleware defined by the Service Availability Forum (SA Forum). Our solution manages application-level elasticity through the manipulation of the application configuration used by the middleware to ensure service availability. For this purpose, we introduce a third party, the Elasticity Engine (EE), that manipulates the application configuration used by the SA Forum middleware when the workload changes. This in turn triggers the SA Forum middleware to change the workload distribution in the system while ensuring service availability. We explore the SA Forum middleware configuration attributes that play a role in elasticity management, the constraints applicable to them, and their impact on the load distribution. We propose an overall architecture for availability and elasticity management in an SA Forum system. We design the EE architecture and behavior through a set of elasticity strategies. The proposed EE has been implemented and tested.
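
    A minimal sketch of the kind of decision loop an elasticity engine could run is shown below; the thresholds, the attribute name, and apply_configuration() are placeholders rather than actual SA Forum configuration attributes or the thesis's EE design.

# Minimal sketch of an elasticity decision loop. The thresholds, the attribute
# name "preferred_num_active_units", and apply_configuration() are placeholders,
# not real SA Forum/IMM attribute names or the thesis's Elasticity Engine.
SCALE_OUT_AT = 0.75     # assumed average load per active unit to scale out
SCALE_IN_AT = 0.30      # assumed average load per active unit to scale in

def decide(current_units: int, total_load: float,
           min_units: int = 2, max_units: int = 8) -> int:
    """Return the new preferred number of active units for the middleware."""
    avg = total_load / current_units
    if avg > SCALE_OUT_AT and current_units < max_units:
        return current_units + 1      # middleware redistributes assignments
    if avg < SCALE_IN_AT and current_units > min_units:
        return current_units - 1      # never drop below the HA-required minimum
    return current_units

def apply_configuration(preferred_num_active_units: int) -> None:
    # Placeholder for pushing the new attribute value to the middleware's
    # configuration through its management interface.
    print(f"preferred_num_active_units = {preferred_num_active_units}")

apply_configuration(decide(current_units=3, total_load=2.7))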

    Bayesian Prognostic Framework for High-Availability Clusters

    Critical services from domains as diverse as finance, manufacturing and healthcare are often delivered by complex enterprise applications (EAs). High-availability clusters (HACs) are software-managed IT infrastructures that enable these EAs to operate with minimum downtime. To that end, HACs monitor the health of EA layers (e.g., application servers and databases) and resources (i.e., components), and attempt to reinitialise or restart failed resources swiftly. When this is unsuccessful, HACs try to failover (i.e., relocate) the resource group to which the failed resource belongs to another server. If the resource group failover is also unsuccessful, or when a system-wide critical failure occurs, HACs initiate a complete system failover. Despite the availability of multiple commercial and open-source HAC solutions, these HACs (i) disregard important sources of historical and runtime information, and (ii) have limited reasoning capabilities. Therefore, they may conservatively perform unnecessary resource group or system failovers or delay justified failovers for longer than necessary. This thesis introduces the first HAC taxonomy, uses it to carry out an extensive survey of current HAC solutions, and develops a novel Bayesian prognostic (BP) framework that addresses the significant HAC limitations that are mentioned above and are identified by the survey. The BP framework comprises four modules. The first module is a technique for modelling high availability using a combination of established and new HAC characteristics. The second is a suite of methods for obtaining and maintaining the information required by the other modules. The third is a HAC-independent Bayesian decision network (BDN) that predicts whether resource failures can be managed locally (i.e., without failovers). The fourth is a method for constructing a HAC-specific Bayesian network for the fast prediction of resource group and system failures. Used together, these modules reduce the downtime of HAC-protected EAs significantly. The experiments presented in this thesis show that the BP framework can deliver downtimes between 5.5 and 7.9 times smaller than those obtained with an established open-source HAC.
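
    As a toy illustration of the kind of reasoning such a framework adds on top of a HAC (not the thesis's Bayesian decision network), the snippet below applies Bayes' rule with made-up probabilities to decide whether to keep attempting a local restart or to escalate to a resource-group failover.

# Toy Bayesian decision for "can this failure be handled locally?": NOT the
# thesis's BDN, just Bayes' rule with made-up numbers to illustrate the idea.
P_TRANSIENT = 0.8                  # prior: most resource failures are transient
P_PERSISTENT = 1.0 - P_TRANSIENT

# Likelihood of the observed evidence "two restart attempts already failed"
# under each failure type (assumed values).
P_EVIDENCE_GIVEN_TRANSIENT = 0.1
P_EVIDENCE_GIVEN_PERSISTENT = 0.7

def posterior_transient() -> float:
    num = P_EVIDENCE_GIVEN_TRANSIENT * P_TRANSIENT
    den = num + P_EVIDENCE_GIVEN_PERSISTENT * P_PERSISTENT
    return num / den

p = posterior_transient()
# Keep trying a local restart only if a transient fault is still more likely;
# otherwise escalate to a resource-group failover straight away.
action = "local restart" if p > 0.5 else "resource-group failover"
print(f"P(transient | evidence) = {p:.2f} -> {action}")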

    Automating the Upgrade of IaaS Cloud Systems

    The different resources providing an Infrastructure as a Service (IaaS) cloud service may need to be upgraded several times throughout their life-cycle for different reasons, for instance to fix discovered bugs, to add new features, or to address a security threat. An IaaS cloud provider is committed to each tenant by a service level agreement (SLA) which indicates the terms of commitment, e.g. the level of availability, that have to be respected even during upgrades. However, the service delivered by the IaaS cloud provider may be affected during the upgrade. Subsequently, this may violate the SLA, which in turn will impact other services relying on the IaaS. Our goal in this thesis is to devise an approach and a framework for automating the upgrade of IaaS cloud systems with minimal impact on the services and with respect to the SLAs. The upgrade of IaaS cloud systems under availability constraints inherits all the challenges of the upgrade of traditional clustered systems and faces other cloud-specific challenges. The challenges shared with clustered systems include the potential dependencies between resources, potential incompatibilities along dependencies during the upgrade, potential system configuration inconsistencies due to upgrade failures, and the minimization of the amount of resources used to complete the upgrade. The dependency of the application layer on the IaaS layer is an added challenge that must be handled properly. In addition, the dynamic nature of the cloud environment poses a new challenge: a cloud system evolves, even during the upgrade, according to workload changes by scaling in/out. This mechanism (referred to as autoscaling) may interfere with the upgrade process in different ways. In this thesis, we define an upgrade management framework for the upgrade of IaaS cloud systems under SLA constraints. This framework addresses all the aforementioned challenges in an integrated manner. The proposed framework automatically upgrades an IaaS cloud system from a current configuration to a desired one, according to the upgrade requests specified by the administrator. It consists of two distinct components, one to coordinate the upgrade, and the other to execute the necessary upgrade actions on the infrastructure resources. For the coordination of the upgrade process, we propose a new approach to automatically identify and schedule the appropriate upgrade methods and actions for implementing the upgrade requests in an iterative manner, taking into account the vendors' descriptions of the infrastructure components, the SLAs with the tenants, and the status of the system. This approach is also capable of handling new upgrade requests even during ongoing upgrades, which makes it suitable for continuous delivery. In case of failures, the proposed approach automatically issues localized retry and undo recovery operations as appropriate for the failed upgrade actions, in order to preserve the consistency of the system configuration. To demonstrate the feasibility of the proposed upgrade management framework, we present a proof of concept (PoC) for the upgrade of IaaS compute and its application in an OpenStack cluster. In this PoC, we target the challenge that is new in IaaS clouds compared to clustered systems, i.e. the unexpected interference between the autoscaling and the upgrade processes. In addition, a prototype of the proposed upgrade approach for coordinating the upgrade of all kinds of IaaS resources has been implemented and is discussed in this thesis. We also provide an informal validation and a rigorous analysis of the main properties of our approach. Furthermore, we conduct experiments to evaluate our approach with respect to the SLA constraints of availability and elasticity. The results show that our approach avoids outages at the application level and reduces SLA violations during the upgrade, compared to the traditional upgrade method used by cloud providers.
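
    The following toy loop illustrates the general shape of an iterative, SLA-constrained upgrade with localized retry and undo; the resource model, batch rule, and failure simulation are simplified stand-ins, not the proposed coordination approach or its OpenStack prototype.

# Illustrative iterative upgrade loop under an availability constraint: at most
# (total - min_in_service) compute hosts are taken out of service at a time,
# and failed actions are undone and retried later. A simplified toy only.
import random

def apply_upgrade(res: str) -> None:
    if random.random() < 0.2:                 # simulated upgrade-action failure
        raise RuntimeError(f"upgrade of {res} failed")
    print(f"{res} upgraded")

def undo_upgrade(res: str) -> None:
    print(f"{res} rolled back")               # keep the configuration consistent

def upgrade_iteratively(resources: list, min_in_service: int) -> None:
    # At least one resource is upgraded per iteration (assumption of this toy).
    batch_size = max(1, len(resources) - min_in_service)
    pending = list(resources)
    while pending:
        batch, pending = pending[:batch_size], pending[batch_size:]
        for res in batch:
            try:
                apply_upgrade(res)
            except RuntimeError:
                undo_upgrade(res)
                pending.append(res)           # retry in a later iteration

upgrade_iteratively([f"compute-{i}" for i in range(4)], min_in_service=3)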