
    An analysis of software aging in cloud environment

    Cloud computing is an environment in which several virtual machines (VMs) run concurrently on physical machines. The cloud infrastructure hosts multiple cloud service segments that communicate with each other through interfaces, creating a distributed computing environment. During operation, software systems accumulate errors or garbage that can lead to system failure and other hazardous consequences; this condition is called software aging. Software aging arises from memory fragmentation, large-scale resource consumption, and the accumulation of numerical errors. It degrades performance and may result in system failure through premature resource exhaustion. The problem cannot be detected during the software testing phase because it stems from the dynamic nature of operation. The errors that cause software aging are of a special type: they do not disturb the software's functionality but affect its response time and environment, so they can only be addressed at run time. To alleviate the impact of software aging, software rejuvenation is used. Rejuvenation reboots the system or restarts the software, thereby avoiding faults and failures: it removes accumulated error conditions, frees up deadlocks, and defragments operating system resources such as memory, preventing future system failures that software aging would otherwise cause. Since service availability is crucial, rejuvenation should be carried out on defined schedules without disrupting the service. Software rejuvenation techniques can make software systems more trustworthy, and software designers use the concept to improve software quality and reliability. Software aging and rejuvenation have generated considerable research interest in recent years. This work reviews research related to the detection of software aging and identifies research gaps.
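
    As a rough illustration of the rejuvenation loop described above, the sketch below monitors one common aging indicator (memory utilization) and triggers a rejuvenation action once a threshold is crossed. The threshold, probe interval, and rejuvenate() placeholder are illustrative assumptions, not a procedure taken from any of the surveyed papers.

```python
# Minimal sketch of threshold-based software rejuvenation (hypothetical
# names and values throughout; not a method from the surveyed works).
import time
import psutil  # third-party: cross-platform system metrics

MEMORY_THRESHOLD = 90.0   # percent of physical memory treated as "aged"
CHECK_INTERVAL = 60       # seconds between health probes


def is_aged() -> bool:
    """Treat sustained memory exhaustion as an aging symptom."""
    return psutil.virtual_memory().percent >= MEMORY_THRESHOLD


def rejuvenate() -> None:
    """Placeholder for the actual action, e.g. restarting the aged
    service or rebooting the VM in a scheduled maintenance window."""
    print("aging detected: triggering scheduled rejuvenation")


# Daemon-style monitoring loop.
while True:
    if is_aged():
        rejuvenate()
    time.sleep(CHECK_INTERVAL)
```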

    Software aging prediction – a new approach

    To meet users' increasingly diverse requirements, computing infrastructure has become complex; cloud-based systems are one example. In the long run, these systems suffer from resource exhaustion, which leads to performance degradation, a phenomenon called software aging. Software aging must be predicted in advance so that pre-emptive rejuvenation, the technique that refreshes the system and returns it to a healthy state, can be triggered to improve service availability. In this work, a new approach based on the k-nearest neighbor (k-NN) algorithm is used to identify a virtual machine's status and to predict its resource exhaustion time. The proposed prediction model uses both static and adaptive thresholding methods. Comparing the algorithms' performance shows that k-NN classifies comparatively better, with an accuracy of 97.6%, against 96.0% for naïve Bayes and 92.8% for a decision tree. A comparison of the proposed work with similar previous works is also discussed.
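
    A minimal sketch of the classification-plus-extrapolation idea follows, assuming scikit-learn and synthetic metrics: a k-NN classifier labels VM health from resource readings, and a simple linear fit stands in for the paper's thresholding when estimating time to exhaustion. The feature choices, labels, and extrapolation rule are assumptions for illustration only.

```python
# Hedged sketch: k-NN VM-status classification plus a naive
# time-to-exhaustion estimate. Data and features are synthetic.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Each sample: [memory_used_%, cpu_%, swap_used_%]; label 1 = aged.
X_train = np.array([[35, 20, 0], [45, 30, 5], [88, 75, 40], [93, 80, 55]])
y_train = np.array([0, 0, 1, 1])

clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print(clf.predict([[90, 70, 50]]))  # -> [1], i.e. aging suspected

# Naive exhaustion-time estimate: fit a line to the memory series and
# solve for when it crosses 100% (a stand-in for the thresholding step).
t = np.arange(6)                          # observation times (hours)
mem = np.array([40, 48, 55, 63, 70, 78])  # memory utilization samples
slope, intercept = np.polyfit(t, mem, 1)
print("hours to exhaustion:", (100 - intercept) / slope)
```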

    Reboot-based Recovery of Performance Anomalies in Adaptive Bitrate Video-Streaming Services

    Performance anomalies are a common type of failure in Internet servers. Overcoming these failures without introducing server downtime is of the utmost importance in video-streaming services, which incur large user-abandonment costs when failures occur after users have watched a significant part of a video. Reboot is the most popular and effective technique for overcoming performance anomalies, but it takes several minutes from start until the server is warmed up again and running at full capacity; during that period, the server is unavailable or offers limited capacity to process end-users' requests. This paper presents a recovery technique for performance anomalies in HTTP streaming services that relies on container-based virtualization to implement an efficient multi-phase server reboot minimizing service downtime. The recovery process includes an analysis of variance of request-response times to delimit the server warm-up period, after which the server runs at full capacity. Experimental results show that the virtual-container recovery process completes in 72 seconds, in contrast with the 434 seconds required for a full operating-system recovery. Both recovery types generate service downtimes imperceptible to end-users.
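
    The warm-up delimitation step lends itself to a short sketch: treat the server as warmed up once the variance of a sliding window of request-response times drops below a stability threshold. The window size and threshold below are assumed values, not those used in the paper's experiments.

```python
# Sketch of warm-up detection via response-time variance. The server is
# considered warmed up once recent latencies stabilize. Values assumed.
from collections import deque
from statistics import variance

WINDOW = 30            # number of recent responses to inspect
VAR_THRESHOLD = 25.0   # ms^2; stable latencies signal full capacity

recent = deque(maxlen=WINDOW)


def record(response_time_ms: float) -> bool:
    """Record one response time; return True once warm-up appears over."""
    recent.append(response_time_ms)
    if len(recent) < WINDOW:
        return False          # not enough samples to judge stability yet
    return variance(recent) < VAR_THRESHOLD
```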

    A framework for Model-Driven Engineering of resilient software-controlled systems

    Emergent paradigms of Industry 4.0 and the Industrial Internet of Things expect cyber-physical systems to reliably provide services, overcoming disruptions in operating conditions and adapting to changes in architectural and functional requirements. In this paper, we describe a hardware/software framework that supports the operation and maintenance of software-controlled systems and enhances resilience by promoting a Model-Driven Engineering (MDE) process to automatically derive structural configurations and failure models from reliability artifacts. Specifically, a reflective architecture built around digital twins enables representation and control of system Configuration Items derived from SysML Block Definition Diagrams, providing support for variation. In addition, a plurality of distributed analytic agents performing qualitative evaluation over executable failure models gives the system runtime self-assessment and dynamic adaptation capabilities. We describe the framework architecture, outlining roles and responsibilities from a System of Systems perspective and providing salient design traits of the digital twins and data-analytic agents used for failure propagation modeling and analysis. We discuss a prototype implementation following the MDE approach, highlighting self-recovery and self-adaptation properties on a real cyber-physical system for vehicle access control in Limited Traffic Zones.
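
    As a toy illustration of the kind of qualitative failure propagation the analytic agents evaluate, the sketch below walks a component dependency graph and marks every component affected by a failure. The graph, component names, and propagation rule are invented for brevity and do not reproduce the paper's executable failure models.

```python
# Toy qualitative failure propagation over a dependency graph.
# Components and rule are hypothetical, loosely inspired by the
# vehicle-access-control case study.
DEPENDS_ON = {
    "gate_controller": ["camera", "plate_recognizer"],
    "plate_recognizer": ["camera"],
    "camera": [],
}


def propagate(failed: set) -> set:
    """A component fails if any component it depends on has failed."""
    affected = set(failed)
    changed = True
    while changed:
        changed = False
        for comp, deps in DEPENDS_ON.items():
            if comp not in affected and any(d in affected for d in deps):
                affected.add(comp)
                changed = True
    return affected


print(propagate({"camera"}))  # camera failure cascades to both consumers
```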

    Phase-based adaptive recompilation in a JVM


    DEPENDABILITY IN CLOUD COMPUTING

    The technological advances and success of Service-Oriented Architectures and the Cloud computing paradigm have produced a revolution in Information and Communications Technology (ICT). Today, a wide range of services is provisioned to users in a flexible and cost-effective manner, thanks to the encapsulation of several technologies with modern business models. These services offer not only high-level software functionalities such as social networks or e-commerce, but also middleware tools that simplify application development, as well as low-level data storage, processing, and networking resources. Hence, with the advent of the Cloud computing paradigm, today's ICT allows users to completely outsource their IT infrastructure and benefit significantly from economies of scale.

    At the same time, with the widespread use of ICT, the amount of data being generated, stored, and processed by private companies, public organizations, and individuals is rapidly increasing. In-house management of data and applications is proving highly cost-intensive, and Cloud computing is becoming the destination of choice for an increasing number of users. As a consequence, Cloud computing services are being used to realize a wide range of applications, each having unique dependability and Quality-of-Service (QoS) requirements. For example, a small enterprise may use a Cloud storage service as a simple backup solution, requiring high data availability, while a large government organization may execute a real-time, mission-critical application using the Cloud compute service, requiring high levels of dependability (e.g., reliability, availability, security) and performance. Service providers are presently able to offer sufficient resource heterogeneity, but they are failing to satisfy users' dependability requirements, mainly because failures and vulnerabilities in Cloud infrastructures are the norm rather than the exception.

    This thesis provides a comprehensive solution for improving the dependability of Cloud computing, so that users can justifiably trust Cloud computing services for building, deploying, and executing their applications. A number of approaches, ranging from the use of trustworthy hardware to secure application design, have been proposed in the literature. The proposed solution consists of three inter-operable yet independent modules, each designed to improve dependability under a different system context and/or use case. A user can selectively apply a single module or combine them suitably to improve the dependability of her applications, both at design time and at runtime. Depending on the modules applied, the overall solution can increase dependability at three distinct levels. In the following, we briefly describe each module.

    The first module comprises a set of assurance techniques that validates whether a given service supports a specified dependability property with a given level of assurance and, accordingly, awards it a machine-readable certificate. To achieve this, we define a hierarchy of dependability properties, where a property represents the dependability characteristics of the service and its specific configuration. A model of the service is also used to verify the validity of the certificate using runtime monitoring, complementing the dynamic nature of the Cloud computing infrastructure and making the certificate usable both at discovery time and at runtime. This module also extends the service registry to allow users to select services with a set of certified dependability properties, hence offering the basic support required to implement dependable applications. We note that this module directly considers services implemented by service providers and provides awareness tools that let users know the QoS offered by potential partner services. We denote this passive technique as the solution offering the first level of dependability in this thesis.

    Service providers typically implement a standard set of dependability mechanisms that satisfy the basic needs of most users. Since each application has unique dependability requirements, assurance techniques are not always effective, and a pro-active approach to dependability management is also required. The second module of our solution advocates the innovative approach of offering dependability as a service to users' applications and realizes a framework containing all the mechanisms required to achieve this. This approach relieves users from implementing low-level dependability mechanisms and system management procedures during application development, and it satisfies the specific dependability goals of each application. We denote the module offering dependability as a service as the solution offering the second level of dependability in this thesis.

    The third and last module of our solution concerns secure application execution. This module considers complex applications and presents advanced resource management schemes that deploy applications with improved optimality compared to the algorithms of the second module. It improves the dependability of a given application by minimizing its exposure to existing vulnerabilities, while being subject to the same dependability policies and resource allocation conditions as in the second module. Our approach to secure application deployment and execution denotes the third level of dependability offered in this thesis.

    The contributions of this thesis can be summarized as follows.

    • With respect to assurance techniques, our contributions are: i) definition of a hierarchy of dependability properties, an approach to service modeling, and a model transformation scheme; ii) definition of a dependability certification scheme for services; iii) an approach to service selection that considers users' dependability requirements; iv) definition of a solution to dependability certification of composite services, where the dependability properties of a composite service are calculated from the dependability certificates of its component services.

    • With respect to offering dependability as a service, our contributions are: i) definition of a delivery scheme that functions transparently on users' applications and satisfies their dependability requirements; ii) design of a framework that encapsulates all the components necessary to offer dependability as a service to users; iii) an approach to translate high-level user requirements into low-level dependability mechanisms; iv) formulation of constraints that allow enforcement of deployment conditions inherent to dependability mechanisms, and an approach to satisfy such constraints during resource allocation; v) a resource management scheme that masks the effect of system changes by adapting the current allocation of the application.

    • With respect to security management, our contributions are: i) an approach that deploys users' applications in the Cloud infrastructure such that their exposure to vulnerabilities is minimized; ii) an approach to build interruptible elastic algorithms whose optimality improves as processing time increases, eventually converging to an optimal solution.
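
    To make the first level concrete, here is a hedged sketch of certificate-aware service selection: a user's required dependability properties are matched against the machine-readable certificates of candidate services, and only services whose certificates cover every requirement are eligible. All names and the flat property encoding are invented; the thesis defines a richer hierarchy of properties.

```python
# Hedged sketch of certificate-aware service selection. Property strings,
# catalog entries, and the flat-set encoding are invented for illustration.
REQUIRED = {"availability>=99.9", "encrypted_storage"}

# Candidate services mapped to their certified dependability properties.
CATALOG = {
    "storage_a": {"availability>=99.9", "encrypted_storage", "geo_replication"},
    "storage_b": {"availability>=99.0", "encrypted_storage"},
}

# Keep only services whose certificates cover every requirement.
eligible = [name for name, certified in CATALOG.items()
            if REQUIRED <= certified]   # subset test
print(eligible)  # -> ['storage_a']
```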