
    Recovery for sporadic operations on cloud applications

    Cloud-based systems change more frequently than traditional systems. These frequent changes involve sporadic operations such as installation and upgrade. Sporadic operations on the cloud manipulate cloud resources and are prone to unpredictable yet inevitable failures, largely due to cloud uncertainty. Recovering from failures in sporadic operations requires cloud operational recovery strategies. Existing operational recovery methods for the cloud have several drawbacks, such as the poor generalizability of exception-handling mechanisms and the coarse-grained recovery of rollback mechanisms. Hence, this thesis proposes a novel recovery approach, called POD-Recovery, for sporadic operations on the cloud. One novelty of POD-Recovery is that it is based on eight cloud operational recovery requirements formulated by us (e.g., recovery time objective satisfaction and recovery generalizability). Another novelty is that POD-Recovery is non-intrusive: it does not modify the code that implements the sporadic operation. POD-Recovery works as follows: it first treats a sporadic operation as a process that provides the workflow of the operation and the contextual information for each operational step. It then identifies the recovery points (where failure detection and recovery should be performed) inside the sporadic operation, determines the unified resource space (the resource types required and manipulated by the operation), and generates the expected resource state templates (the abstraction level of resource states) for all operational steps. For a given recovery point, POD-Recovery first filters the applicable recovery patterns from the eight recovery patterns it supports and then automatically generates recovery actions for them. Next, it evaluates the generated recovery actions using the metrics of Recovery Time, Recovery Cost, and Recovery Impact; this quantitative evaluation leads to the selection of an acceptable recovery action to execute at that recovery point. We implement POD-Recovery and evaluate it by recovering from faults injected into five representative types of sporadic operations on the cloud. The experimental results show that POD-Recovery performs operational recovery while satisfying all eight recovery requirements and improves on existing recovery methods for cloud operations.
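
    The abstract names the three selection metrics but not how they are combined. As a minimal sketch only, assuming a weighted-sum ranking with a recovery-time-objective (RTO) cutoff, the following Python illustrates the kind of quantitative selection step described; all names, weights, and numbers are hypothetical, not the thesis's actual method.

    # Hypothetical sketch of a quantitative recovery-action selection step.
    # Weights and the weighted-sum scoring are assumptions for illustration.
    from dataclasses import dataclass

    @dataclass
    class RecoveryAction:
        pattern: str        # e.g. "rollback", "retry step"
        time_s: float       # estimated Recovery Time in seconds
        cost: float         # estimated Recovery Cost (normalized 0..1)
        impact: float       # estimated Recovery Impact (normalized 0..1)

    def select_action(actions, rto_s, w_time=0.5, w_cost=0.3, w_impact=0.2):
        """Pick the lowest-scoring action whose time satisfies the RTO."""
        feasible = [a for a in actions if a.time_s <= rto_s]
        if not feasible:
            return None  # no action meets the recovery time objective
        def score(a):
            return w_time * (a.time_s / rto_s) + w_cost * a.cost + w_impact * a.impact
        return min(feasible, key=score)

    actions = [
        RecoveryAction("rollback", time_s=120, cost=0.6, impact=0.8),
        RecoveryAction("retry step", time_s=30, cost=0.2, impact=0.1),
    ]
    print(select_action(actions, rto_s=90).pattern)  # -> "retry step"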

    Technical Report on Deploying a highly secured OpenStack Cloud Infrastructure using BradStack as a Case Study

    Cloud computing has emerged as a popular paradigm and an attractive model for providing reliable distributed computing, and it is attracting increasing attention in both academic research and industrial initiatives. Cloud deployments are paramount for institutions and organizations of all scales. The availability of a flexible, free, open-source cloud platform designed with no proprietary software, and its ability to integrate with legacy systems and third-party applications, are fundamental. OpenStack is free and open-source software released under the terms of the Apache license, with a fragmented and distributed architecture that makes it highly flexible. This project was initiated to design a secured cloud infrastructure called BradStack, built on OpenStack in the Computing Laboratory at the University of Bradford. In this report, we present and discuss the steps required to deploy a secured BradStack multi-node cloud infrastructure and to conduct penetration testing on OpenStack services to validate the effectiveness of the security controls on the BradStack platform. This report serves as a practical guideline, focusing on security and practical infrastructure-related issues, and as a reference for institutions considering a secured cloud solution.
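
    The report's actual penetration tests are not reproduced here. As one small illustration of the kind of security-control check it describes, the sketch below probes whether OpenStack API endpoints enforce HTTPS with valid certificates, using only the standard requests library; the endpoint URLs are placeholders, not BradStack's real addresses.

    # Hypothetical TLS smoke test for OpenStack service endpoints.
    # Ports follow OpenStack defaults (keystone 5000, nova 8774, glance 9292);
    # hostnames are illustrative only.
    import requests

    ENDPOINTS = {
        "keystone (identity)": "https://cloud.example.org:5000/v3",
        "nova (compute)":      "https://cloud.example.org:8774/v2.1",
        "glance (image)":      "https://cloud.example.org:9292",
    }

    def check_endpoint(name, url):
        try:
            r = requests.get(url, timeout=5)  # certificate verification is on by default
            print(f"[ok]   {name}: HTTP {r.status_code}, certificate valid")
        except requests.exceptions.SSLError:
            print(f"[FAIL] {name}: TLS certificate rejected")
        except requests.exceptions.ConnectionError:
            print(f"[FAIL] {name}: not reachable over HTTPS")

    for name, url in ENDPOINTS.items():
        check_endpoint(name, url)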

    CloudOps: Towards the Operationalization of the Cloud Continuum: Concepts, Challenges and a Reference Framework

    The current trend of developing highly distributed, context-aware, compute-intensive, and data-sensitive applications is changing the boundaries of cloud computing. Encouraged by the growing IoT paradigm and the availability of flexible edge devices, an ecosystem of resources, ranging from high-density compute and storage to very lightweight embedded computers running on batteries or solar power, is available to DevOps teams in what is known as the Cloud Continuum. In this dynamic context, manageability is key, as are controlled operations and resource monitoring for handling anomalies. Unfortunately, the operation and management of such heterogeneous computing environments (including edge, cloud, and network services) is complex, and operators face challenges such as the continuous optimization and autonomous (re-)deployment of context-aware stateless and stateful applications, where they must ensure service continuity while anticipating potential failures in the underlying infrastructure. In this paper, we propose a novel CloudOps workflow (extending the traditional DevOps pipeline) with techniques and methods for application operators to fully embrace the possibilities of the Cloud Continuum. Our approach supports DevOps teams in the operationalization of the Cloud Continuum. We also provide an extensive explanation of the scope, possibilities, and future of CloudOps. This research was funded by the European project PIACERE (Horizon 2020 Research and Innovation Programme, under grant agreement No. 101000162).
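
    The paper defines the CloudOps workflow conceptually rather than as code. Purely as an illustration of the monitor-and-redeploy loop it motivates, here is a minimal sketch in which the service names, the latency SLO, and the simulated metrics are all made up.

    # Hypothetical control loop in the spirit of a CloudOps workflow:
    # monitor services across the continuum and move them between edge
    # and cloud when a (made-up) latency SLO is violated.
    import random, time

    class Service:
        def __init__(self, name, tier):
            self.name, self.tier = name, tier   # tier: "edge" or "cloud"

        def latency_ms(self):
            # stand-in for a real monitoring query (e.g. to a metrics backend)
            return random.uniform(20, 200)

    def redeploy(service, target_tier):
        print(f"redeploying {service.name}: {service.tier} -> {target_tier}")
        service.tier = target_tier

    def cloudops_iteration(services, latency_slo_ms=100):
        """One monitor/analyze/plan/execute pass over all services."""
        for s in services:
            if s.latency_ms() > latency_slo_ms:        # monitor + analyze
                target = "cloud" if s.tier == "edge" else "edge"
                redeploy(s, target)                    # plan + execute

    services = [Service("video-analytics", "edge"), Service("billing", "cloud")]
    for _ in range(3):
        cloudops_iteration(services)
        time.sleep(1)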

    Software Technologies - 8th International Joint Conference, ICSOFT 2013: Revised Selected Papers

    A Runtime Verification and Validation Framework for Self-Adaptive Software

    The concepts that make self-adaptive software attractive also make it more difficult for users to gain confidence that these systems will consistently meet their goals in uncertain contexts. To improve user confidence in self-adaptive behavior, machine-readable conceptual models have been developed to instrument the adaptation behavior of the target software system and its primary feedback loop. By comparing these machine-readable models to the self-adaptive system, runtime verification and validation can be introduced as another method to increase confidence in self-adaptive systems; however, the existing conceptual models do not provide the semantics needed to institute such runtime verification or validation. This research confirms that introducing runtime verification and validation for self-adaptive systems requires expanding existing conceptual models with quality-of-service metrics, a hierarchy of goals, and states with temporal transitions. Based on these expanded semantics, runtime verification and validation was introduced as a second-level feedback loop to improve the performance of the primary feedback loop and to quantitatively measure the quality of service achieved in a state-based, self-adaptive system. A web-based purchasing application running in a cloud-based environment was the focus of experimentation. To meet changing customer purchasing demand, the self-adaptive system monitored external context changes and increased or decreased the number of available application servers. The runtime verification and validation system operated as a second-level feedback loop that monitored quality-of-service goals based on internal context and corrected self-adaptive behavior when goals were violated. Two competing quality-of-service goals were introduced: maintaining customer satisfaction while minimizing cost. The research demonstrated that the addition of a second-level runtime verification and validation feedback loop quantitatively improved self-adaptive system performance, even with simple, static monitoring rules.
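
    The study's actual monitoring rules are not given in the abstract. As a minimal sketch of the two-level arrangement it describes, assuming simple static thresholds, the Python below has a primary loop that provisions servers for demand and a second-level verification-and-validation loop that flags corrections when either of two competing goals (response time vs. cost) is violated; all thresholds and numbers are illustrative.

    # Hypothetical two-level feedback sketch: primary adaptation plus a
    # second-level V&V loop over two competing QoS goals.
    def primary_loop(demand, per_server_capacity=100):
        """Naive primary adaptation: provision enough servers for demand."""
        return max(1, -(-demand // per_server_capacity))  # ceiling division

    def vv_loop(servers, avg_response_ms, cost_per_server,
                response_goal_ms=200, cost_budget=50.0):
        """Second-level V&V: propose corrections when a goal is violated."""
        corrections = []
        if avg_response_ms > response_goal_ms:
            corrections.append(("scale up", servers + 1))   # satisfaction goal
        if servers * cost_per_server > cost_budget and servers > 1:
            corrections.append(("scale down", servers - 1)) # cost goal
        return corrections  # both may fire: the goals compete

    servers = primary_loop(demand=450)                       # -> 5 servers
    print(vv_loop(servers, avg_response_ms=250, cost_per_server=12.0))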

    EC-CENTRIC: An Energy- and Context-Centric Perspective on IoT Systems and Protocol Design

    The radio transceiver of an IoT device is often where most of the energy is consumed. For this reason, most research so far has focused on low-power circuits and energy-efficient physical-layer designs, with the goal of reducing the average energy per information bit required for communication. While these efforts are valuable per se, their actual effectiveness can be partially neutralized by ill-designed network, processing, and resource-management solutions, which can become a primary factor of performance degradation in terms of throughput, responsiveness, and energy efficiency. The objective of this paper is to describe an energy-centric and context-aware optimization framework that accounts for the energy impact of the fundamental functionalities of an IoT system and that proceeds along three main technical thrusts: 1) balancing signal-dependent processing techniques (compression and feature extraction) against communication tasks; 2) jointly designing channel access and routing protocols to maximize the network lifetime; 3) providing self-adaptability to different operating conditions through the adoption of suitable learning architectures and of flexible/reconfigurable algorithms and protocols. After discussing this framework, we present some preliminary results that validate the effectiveness of our proposed line of action and show how the use of adaptive signal processing and channel-access techniques allows an IoT network to dynamically trade lifetime for signal distortion, according to the requirements dictated by the application.
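
    The first thrust, balancing on-node processing against communication, admits a simple back-of-the-envelope model: compress only when the processing energy plus the energy to transmit the smaller payload beats transmitting raw data. The sketch below illustrates this; the per-bit energy constants are assumed placeholders, not measurements from the paper.

    # Hypothetical energy-balance model for the compress-vs-transmit decision.
    E_TX_PER_BIT_J  = 50e-9   # radio energy per transmitted bit (assumed)
    E_CPU_PER_BIT_J = 5e-9    # compression energy per input bit (assumed)

    def should_compress(raw_bits, compression_ratio):
        """Return True if compressing before transmitting saves energy."""
        e_raw = raw_bits * E_TX_PER_BIT_J
        e_compressed = (raw_bits * E_CPU_PER_BIT_J
                        + raw_bits / compression_ratio * E_TX_PER_BIT_J)
        return e_compressed < e_raw

    # With a 4:1 ratio compression wins; at 1.05:1 it barely loses.
    print(should_compress(8_000_000, 4.0))    # True
    print(should_compress(8_000_000, 1.05))   # False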

    Energy-Efficient and Reliable Computing in Dark Silicon Era

    Dark silicon denotes the phenomenon that, due to thermal and power constraints, the fraction of transistors that can operate at full frequency decreases with each technology generation. Moore's law and Dennard scaling were coupled appropriately for five decades, bringing commensurate exponential performance via single-core and, later, multi-core designs. However, recalculating Dennard scaling for recent small technology nodes shows that the ongoing multi-core growth demands exponentially increasing thermal design power to achieve a linear performance increase. This process hits a power wall, which raises the amount of dark or dim silicon on future multi-/many-core chips more and more. Furthermore, with the increasing number of transistors on a single chip and its susceptibility to internal defects, alongside aging phenomena that are exacerbated by high thermal density, monitoring and managing chip reliability before and after activation is becoming a necessity. The approaches and experimental investigations proposed in this thesis focus on two main tracks, 1) power awareness and 2) reliability awareness in the dark silicon era, which are later combined. In the first track, the main goal is to increase the returns in terms of the most important features of chip design, such as performance and throughput, while the maximum power limit is honored. In fact, we show that by managing power in the presence of dark silicon, all the traditional benefits of proceeding along Moore's law can still be achieved in the dark silicon era, albeit to a lesser extent. Via the track of reliability awareness, we show that dark silicon can be treated as an opportunity to be exploited for different benefits, namely lifetime increase and online testing. We discuss how dark silicon can be exploited to guarantee that the system lifetime stays above a target value and how it can be exploited to apply low-cost, non-intrusive online testing to the cores. After demonstrating power and reliability awareness in the presence of dark silicon, two approaches are discussed as case studies in which the two are combined. The first demonstrates how chip reliability can be used as a supplementary metric for power-reliability management, while the second trades off workload performance against system reliability by simultaneously honoring the given power budget and a target reliability.
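
    To make the power-awareness track concrete: the core decision is which cores to keep lit, and which to leave dark, so that total power stays under the chip's budget while useful work is maximized. The sketch below uses a greedy throughput-per-watt heuristic as a stand-in for whatever policy the thesis actually employs; core names and numbers are invented.

    # Hypothetical power-budgeted core activation (greedy, illustrative only).
    def light_cores(cores, power_budget_w):
        """cores: list of (name, power_w, throughput); returns cores to enable."""
        lit, used = [], 0.0
        # favor the most energy-efficient cores first
        for name, p, thr in sorted(cores, key=lambda c: c[2] / c[1], reverse=True):
            if used + p <= power_budget_w:
                lit.append(name)
                used += p
        return lit  # everything not returned stays dark

    cores = [("big0", 4.0, 8.0), ("big1", 4.0, 8.0),
             ("little0", 1.0, 3.0), ("little1", 1.0, 3.0)]
    print(light_cores(cores, power_budget_w=6.0))  # both littles plus one big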

    Collaborative, Intelligent, and Adaptive Systems for the Low-Power Internet of Things

    With the emergence of the Internet of Things (IoT), more and more devices are being equipped with communication capabilities, often via wireless radios. Their deployments pave the way for new and mission-critical applications: cars will communicate with nearby vehicles to coordinate at intersections; industrial wireless closed-loop systems will improve operational safety in factories; and swarms of drones will coordinate to plan collision-free trajectories. To achieve these goals, IoT devices will need to communicate, coordinate, and collaborate over the wireless medium. However, these envisioned applications demand characteristics that current solutions and protocols cannot fulfill: IoT devices require consistency guarantees from their communication and adaptive behavior in complex and dynamic environments. In this thesis, we design, implement, and evaluate systems and mechanisms to enable safe coordination and adaptivity for the smallest IoT devices. To ensure consistent coordination, we bring fault-tolerant consensus to low-power wireless communication and introduce Wireless Paxos, a flavor of the Paxos algorithm specifically tailored to low-power IoT. We then present STARC, a wireless coordination mechanism for intersection management that combines commit semantics with synchronous transmissions. To enable adaptivity in the wireless networking stack, we introduce Dimmer and eAFH. Dimmer combines reinforcement learning and multi-armed bandits to adapt its communication parameters, counteracting the adverse effects of wireless interference at runtime while optimizing energy consumption under normal conditions. eAFH provides dynamic channel management in Bluetooth Low Energy by excluding and dynamically re-including channels in scenarios with mobility. Finally, we demonstrate with BlueSeer that a device can classify its environment, i.e., recognize whether it is located in a home, office, street, or transport, solely from received Bluetooth Low Energy signals fed into an embedded machine learning model. BlueSeer thereby increases the intelligence of the smallest IoT devices, allowing them to adapt their behavior to their current surroundings.
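
    Dimmer is described as combining reinforcement learning with multi-armed bandits. As a generic illustration of that family of techniques only, not Dimmer's actual algorithm, the sketch below uses an epsilon-greedy bandit to choose a retransmission count that balances delivery probability against energy under a simulated channel; the reward weights and delivery model are invented.

    # Hypothetical epsilon-greedy bandit over radio parameters.
    import random

    ARMS = [1, 2, 3, 4]                 # candidate retransmission counts
    counts = {a: 0 for a in ARMS}
    values = {a: 0.0 for a in ARMS}     # running average reward per arm

    def choose(epsilon=0.1):
        if random.random() < epsilon:                      # explore
            return random.choice(ARMS)
        return max(ARMS, key=lambda a: values[a])          # exploit

    def update(arm, delivered, energy_cost):
        # reward delivery, penalize energy; the 0.1 weight is illustrative
        reward = (1.0 if delivered else 0.0) - 0.1 * energy_cost
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

    for _ in range(1000):
        arm = choose()
        # stand-in for a real transmission: more retries -> higher delivery odds
        delivered = random.random() < min(0.95, 0.4 + 0.15 * arm)
        update(arm, delivered, energy_cost=arm)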