74 research outputs found

    Proactive cloud management for highly heterogeneous multi-cloud infrastructures

    Various literature studies have demonstrated that the cloud computing paradigm can help improve the availability and performance of applications subject to software anomalies. Indeed, the cloud resource provisioning model enables users to rapidly access new processing resources, even distributed over different geographical regions, that can be promptly used in the case of, e.g., crashes or hangs of running machines, as well as to balance the load in the case of overloaded machines. Nevertheless, managing a complex, geographically distributed cloud deployment can be a difficult and time-consuming task. The Autonomic Cloud Manager (ACM) Framework is an autonomic framework for supporting proactive management of applications deployed over multiple cloud regions. It uses machine learning models to predict failures of virtual machines and to proactively redirect the load to healthy machines and cloud regions. In this paper, we study different policies to perform efficient proactive load balancing across cloud regions in order to mitigate the effect of software anomalies. These policies use predictions of the mean time to failure of virtual machines. We consider the case of heterogeneous cloud regions, i.e., regions with different amounts of resources, and we provide an experimental assessment of these policies in the context of the ACM Framework.
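
    As a rough illustration of what an MTTF-driven balancing policy might look like, the sketch below splits incoming requests across regions in proportion to the summed predicted mean time to failure of each region's virtual machines. This is an assumed policy for illustration only, not one of the policies evaluated in the paper; the function name `split_load` and the region names are hypothetical.

```python
def split_load(region_mttf, total_requests):
    """Split total_requests across regions in proportion to the summed
    predicted MTTF (in seconds) of each region's virtual machines.

    region_mttf maps a region name to a list of per-VM MTTF predictions.
    Summing per-VM values means regions with more (or healthier) VMs
    receive a larger share, which is one simple way to account for
    heterogeneous regions. This weighting is an assumption, not the
    paper's method.
    """
    weights = {region: sum(mttfs) for region, mttfs in region_mttf.items()}
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("no region has positive predicted MTTF")

    # Integer share per region, rounded down.
    shares = {
        region: int(total_requests * weight / total_weight)
        for region, weight in weights.items()
    }

    # Hand out requests lost to rounding, highest-weight regions first.
    leftover = total_requests - sum(shares.values())
    for region in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        shares[region] += 1
    return shares


predictions = {"eu": [300.0, 300.0], "us": [200.0]}
shares = split_load(predictions, 10)
```

    With these hypothetical predictions, the two-VM region receives the larger share of the ten requests because its summed MTTF is three times that of the single-VM region.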

    A Machine Learning-based Framework for Building Application Failure Prediction Models

    In this paper, we present the Framework for building Failure Prediction Models (F2PM), a Machine Learning-based framework to build models for predicting the Remaining Time to Failure (RTTF) of applications in the presence of software anomalies. F2PM uses measurements of a number of system features in order to create a knowledge base, which is then used to build prediction models. F2PM is application-independent, i.e., it solely exploits measurements of system-level features. Thus, it can be used in different contexts, without the need for any manual modification of or intervention on the running applications. To generate optimized models, F2PM can perform feature selection to identify, among all the measured system features, those that have a major impact on the prediction of the RTTF. This makes it possible to produce different models, which use different sets of input features. The generated models can be compared by the user through a set of metrics produced by F2PM, which relate to model prediction accuracy as well as to model building time. We also present experimental results of a successful application of F2PM, using the standard TPC-W e-commerce benchmark.
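
    The knowledge-base and feature-selection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not F2PM's actual implementation: samples from one monitored run are labeled with RTTF as the failure time minus the sample timestamp, and features are ranked by absolute Pearson correlation with RTTF, which stands in here for whatever selection technique F2PM uses. The function names and the feature names `mem_used` and `cpu` are hypothetical.

```python
from math import sqrt


def label_rttf(samples, fail_time):
    """Turn (timestamp, features) samples from one run into
    (features, rttf) pairs, with rttf = fail_time - timestamp."""
    return [(features, fail_time - t) for t, features in samples]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(labeled):
    """Rank feature names by |correlation with RTTF|, strongest first.
    Correlation ranking is an assumed stand-in for feature selection."""
    names = list(labeled[0][0].keys())
    rttf = [r for _, r in labeled]
    scores = {
        name: abs(pearson([features[name] for features, _ in labeled], rttf))
        for name in names
    }
    return sorted(names, key=scores.get, reverse=True)


# Hypothetical run: memory use grows until a failure at t = 30 s.
run = [
    (0, {"mem_used": 10.0, "cpu": 5.0}),
    (10, {"mem_used": 20.0, "cpu": 5.0}),
    (20, {"mem_used": 30.0, "cpu": 5.0}),
]
labeled = label_rttf(run, fail_time=30)
ranking = rank_features(labeled)
```

    In this toy run the steadily growing memory figure tracks RTTF closely, while the constant CPU figure carries no signal, so a model built on the top-ranked feature alone would use fewer inputs.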

    Autonomic Rejuvenation of Cloud Applications as a Countermeasure to Software Anomalies

    Failures in computer systems can often be traced back to software anomalies of various kinds. In many scenarios, it may be difficult, unfeasible, or unprofitable to carry out extensive debugging to find the causes of anomalies and remove them. In other cases, taking corrective actions may lead to undesirable service downtime. In this article, we propose an alternative approach to cope with the problem of software anomalies in cloud-based applications, and we present the design of a distributed autonomic framework that implements our approach. It exploits the elastic capabilities of cloud infrastructures, and relies on machine learning models, proactive rejuvenation techniques, and a new load balancing approach. By putting all these elements together, we show that it is possible to improve both the availability and the performance of applications deployed over heterogeneous cloud regions and subject to frequent failures. Overall, our study demonstrates the viability of our approach, thus opening the way towards its adoption, and encouraging further studies and practical experiences to evaluate and improve it.

    Machine Learning for Achieving Self-* Properties and Seamless Execution of Applications in the Cloud

    Software anomalies are recognized as a major problem affecting the performance and availability of many computer systems. Accumulation of anomalies of different kinds, such as memory leaks and unterminated threads, may cause the system either to fail or to work at suboptimal performance levels. This problem particularly affects web servers, where hosted applications are typically intended to run continuously, thus increasing the probability, and therefore the associated effects, of anomaly accumulation. Given the unpredictability of the occurrence of anomalies, continuous system monitoring is required to detect possible system failures and/or excessive performance degradation, so that a recovery procedure can be started in time. In this paper, we present a Machine Learning-based framework for proactive management of client-server applications in the cloud. Through optimized Machine Learning models and continuous measurement of system features, the framework predicts the remaining time to the occurrence of some unexpected event (system failure, service level agreement violation, etc.) of a virtual machine hosting a server instance of the application. The framework is able to manage virtual machines in the presence of different types of anomalies and with different anomaly occurrence patterns. We show the effectiveness of the proposed solution by presenting the results of a set of experiments we carried out in the context of a real-world-inspired scenario.

    Proactive Scalability and Management of Resources in Hybrid Clouds via Machine Learning

    In this paper, we present a novel framework for supporting the management and optimization of applications subject to software anomalies and deployed on large-scale cloud architectures composed of different geographically distributed cloud regions. The framework uses machine learning models for predicting failures caused by the accumulation of anomalies. It introduces a novel workload balancing approach and a proactive system scale-up/scale-down technique. We developed a prototype of the framework and present some experiments validating the applicability of the proposed approach.

    Self-Assembly of Polyhedral Hybrid Colloidal Particles

    We have developed a new method to produce hybrid particles with polyhedral shapes in very high yield (liter quantities at up to 70% purity) using a combination of emulsion polymerization and inorganic surface chemistry. The procedure has been generalized to create complex geometries, including hybrid line segments, triangles, tetrahedra, octahedra, and more. The optical properties of these particles are tailored for studying their dynamics and self-assembly. For example, we produce systems that consist of index-matched spheres, allowing us to define the position of each elementary particle in three-dimensional space. We present some preliminary studies on the self-assembly of these complex-shaped systems based on electron and optical microscopy.
    Engineering and Applied Sciences; Physics

    The DNA Damage Response Pathway Contributes to the Stability of Chromosome III Derivatives Lacking Efficient Replicators

    In eukaryotic chromosomes, DNA replication initiates at multiple origins. Large inter-origin gaps arise when several adjacent origins fail to fire. Little is known about how cells cope with this situation. We created a derivative of Saccharomyces cerevisiae chromosome III lacking all efficient origins, the 5ORIΔ-ΔR fragment, as a model for chromosomes with large inter-origin gaps. We used this construct in a modified synthetic genetic array screen to identify genes whose products facilitate replication of long inter-origin gaps. Genes identified are enriched in components of the DNA damage and replication stress signaling pathways. Mrc1p is activated by replication stress and mediates transduction of the replication stress signal to downstream proteins; however, the response-defective mrc1AQ allele did not affect 5ORIΔ-ΔR fragment maintenance, indicating that this pathway does not contribute to its stability. Deletions of genes encoding the DNA-damage-specific mediator, Rad9p, and several components shared between the two signaling pathways preferentially destabilized the 5ORIΔ-ΔR fragment, implicating the DNA damage response pathway in its maintenance. We found unexpected differences between contributions of components of the DNA damage response pathway to maintenance of ORIΔ chromosome derivatives and their contributions to DNA repair. Of the effector kinases encoded by RAD53 and CHK1, Chk1p appears to be more important in wild-type cells for reducing chromosomal instability caused by origin depletion, while Rad53p becomes important in the absence of Chk1p. In contrast, RAD53 plays a more important role than CHK1 in cell survival and replication fork stability following treatment with DNA damaging agents and hydroxyurea. Maintenance of ORIΔ chromosomes does not depend on homologous recombination. These observations suggest that a DNA-damage-independent mechanism enhances ORIΔ chromosome stability. 
    Thus, components of the DNA damage response pathway contribute to genome stability, not simply by detecting and responding to DNA template damage, but also by facilitating replication of large inter-origin gaps.

    Machine learning-based management of cloud applications in hybrid clouds: a Hadoop case study

    This paper describes the effort to integrate Hadoop applications with a machine learning-based framework that can predict the remaining time to failure of computing nodes. This work is part of a larger effort targeting the development of a cloud-oriented autonomic framework to increase the availability of applications subject to software anomalies, and to jointly improve their performance. The framework uses machine learning, software rejuvenation, and load distribution techniques to proactively prevent failures. We believe that this work sets out a possible path towards the definition of best practices for the development of systems that support autonomic management of cloud applications, and illustrates the issues that the research community should address. Indeed, given the scale and complexity of modern computing infrastructures, effective autonomic management approaches for cloud applications are becoming mandatory.