138 research outputs found

    EXPLOITING THE SYNERGY BETWEEN SCHEDULING AND LOAD SHEDDING TO FACILITATE DIFFERENTIATED LEVELS OF SERVICE FOR CONTINUOUS QUERIES

    Get PDF
    Data Stream Management Systems (DSMSs) offer the most effective solution for processing data streams by efficiently executing continuous queries (CQs) over the incoming data. CQs inherently have different levels of criticality and hence different levels of expected quality of service (QoS) and quality of data (QoD). Adhering to such expected QoS/QoD metrics is even more important in cases of multi-tenant data stream management services. In this dissertation, we propose DILoS, a framework that supports differentiated QoS and QoD for multiple classes of CQs by tightly integrating priority-based scheduling and load shedding. Unlike existing works that consider scheduling and load shedding separately, DILoS is a novel unified framework that exploits the synergy between them. For the realization of DILoS, we propose ALoMa and SEaMLeSS, two general, adaptive load managers. Our load managers can also be used standalone and outperform the state-of-the-art in three dimensions: (1)they automatically tune the headroom factor, (2) they honor the delay target, and (3) they are applicable to complex query networks with shared operators. We implemented DILoS, ALoMa and SEaMLeSS in our real DSMS prototype system (AQSIOS) and systematically evaluate their performance using real and synthetic workloads.Our experimental evaluation of ALoMa and SEaMLeSS verified their advantages over the state-of-the-art approaches. Our evaluation of DILoS showed that it (a) allows the scheduler and load shedder to consistently honor CQs’ priorities, (b) significantly increases system capacity utilization by exploiting batch processing, and (c) enables operator sharing among query classes of different priorities while avoiding priority inversion. To further support differentiated QoS and QoD for CQs in distributed DSMSs, we propose ARMaDILoS, a conceptual framework for large scale adaptive resource management using DILoS. A fundamental component in ARMaDILoS is CQ migration. For this reason, we propose and implement UniMiCo, a protocol to migrate CQs without interrupting the execution of the queries. Our experiments showed that UniMiCo produced correct outputs and did not introduce any hiccup in the response time of the queries

    A PROCRUSTEAN APPROACH TO STREAM PROCESSING

    Get PDF
    The increasing demand for real-time data processing and the constantly growing data volume have contributed to the rapid evolution of Stream Processing Engines (SPEs), which are designed to continuously process data as it arrives. Low operational cost and timely delivery of results are both objectives of paramount importance for SPEs. Given the volatile and uncharted nature of data streams, achieving the aforementioned goals under fixed resources is a challenge. This calls for adaptable SPEs, which can react to fluctuations in processing demands. In the past, three techniques have been developed for improving an SPE’s ability to adapt. Those techniques are classified based on applications’ requirements on exact or approximate results: stream partitioning, and re-partitioning target exact, and load shedding targets approximate processing. Stream partitioning strives to balance load among processors, and previous techniques neglected hidden costs of distributed execution. Load Shedding lowers the accuracy of results by dropping part of the input, and previous techniques did not cope with evolving streams. Stream re-partitioning is used to reconfigure execution while processing takes place, and previous techniques did not fully utilize window semantics. In this dissertation, we put stream processing in a procrustean bed, in terms of the manner and the degree that processing takes place. To this end, we present new approaches, for window-based aggregate operators, which are applicable to both exact and approximate stream processing in modern SPEs. Our stream partitioning, re-partitioning, and load shedding solutions offer improvements in performance and accuracy on real-world data by exploiting the semantics of both data and operations. In addition, we present SPEAr, the design of an SPE that accelerates processing by delivering approximate results with accuracy guarantees and avoiding unnecessary load. Finally, we contribute a hybrid technique, ShedPart, which can further improve load balance and performance of an SPE

    Automatic Generation of Distributed Runtime Infrastructure for Internet of Things

    Get PDF
    Ph. D. ThesisThe Internet of Things (IoT) represents a network of connected devices that are able to cooperate and interact with each other in order to reach a particular goal. To attain this, the devices are equipped with identifying, sensing, networking and processing capabilities. Cloud computing, on the other hand, is the delivering of on-demand computing services – from applications, to storage, to processing power – typically over the internet. Clouds bring a number of advantages to distributed computing because of highly available pool of virtualized computing resource. Due to the large number of connected devices, real-world IoT use cases may generate overwhelmingly large amounts of data. This prompts the use of cloud resources for processing, storage and analysis of the data. Therefore, a typical IoT system comprises of a front-end (devices that collect and transmit data), and back-end – typically distributed Data Stream Management Systems (DSMSs) deployed on the cloud infrastructure, for data processing and analysis. Increasingly, new IoT devices are being manufactured to provide limited execution environment on top of their data sensing and transmitting capabilities. This consequently demands a change in the way data is being processed in a typical IoT-cloud setup. The traditional, centralised cloud-based data processing model – where IoT devices are used only for data collection – does not provide an efficient utilisation of all available resources. In addition, the fundamental requirements of real-time data processing such as short response time may not always be met. This prompts a new processing model which is based on decentralising the data processing tasks. The new decentralised architectural pattern allows some parts of data streaming computation to be executed directly on edge devices – closer to where the data is collected. Extending the processing capabilities to the IoT devices increases the robustness of applications as well as reduces the communication overhead between different components of an IoT system. However, this new pattern poses new challenges in the development, deployment and management of IoT applications. Firstly, there exists a large resource gap between the two parts of a typical IoT system (i.e. clouds and IoT devices); hence, prompting a new approach for IoT applications deployment and management. Secondly, the new decentralised approach necessitates the deployment of DSMS on distributed clusters of heterogeneous nodes resulting in unpredictable runtime performance and complex fault characteristics. Lastly, the environment where DSMSs are deployed is very dynamic due to user or device mobility, workload variation, and resource availability. In this thesis we present solutions to address the aforementioned challenges. We investigate how a high-level description of a data streaming computation can be used to automatically generate a distributed runtime infrastructure for Internet of Things. Subsequently, we develop a deployment and management system capable of distributing different operators of a data streaming computation onto different IoT gateway devices and cloud infrastructure. To address the other challenges, we propose a non-intrusive approach for performance evaluation of DSMSs and present a protocol and a set of algorithms for dynamic migration of stateful data stream operators. To improve our migration approach, we provide an optimisation technique which provides minimal application downtime and improves the accuracy of a data stream computation

    Cloud Computing cost and energy optimization through Federated Cloud SoS

    Get PDF
    2017 Fall.Includes bibliographical references.The two most significant differentiators amongst contemporary Cloud Computing service providers have increased green energy use and datacenter resource utilization. This work addresses these two issues from a system's architectural optimization viewpoint. The proposed approach herein, allows multiple cloud providers to utilize their individual computing resources in three ways by: (1) cutting the number of datacenters needed, (2) scheduling available datacenter grid energy via aggregators to reduce costs and power outages, and lastly by (3) utilizing, where appropriate, more renewable and carbon-free energy sources. Altogether our proposed approach creates an alternative paradigm for a Federated Cloud SoS approach. The proposed paradigm employs a novel control methodology that is tuned to obtain both financial and environmental advantages. It also supports dynamic expansion and contraction of computing capabilities for handling sudden variations in service demand as well as for maximizing usage of time varying green energy supplies. Herein we analyze the core SoS requirements, concept synthesis, and functional architecture with an eye on avoiding inadvertent cascading conditions. We suggest a physical architecture that diminishes unwanted outcomes while encouraging desirable results. Finally, in our approach, the constituent cloud services retain their independent ownership, objectives, funding, and sustainability means. This work analyzes the core SoS requirements, concept synthesis, and functional architecture. It suggests a physical structure that simulates the primary SoS emergent behavior to diminish unwanted outcomes while encouraging desirable results. The report will analyze optimal computing generation methods, optimal energy utilization for computing generation as well as a procedure for building optimal datacenters using a unique hardware computing system design based on the openCompute community as an illustrative collaboration platform. Finally, the research concludes with security features cloud federation requires to support to protect its constituents, its constituents tenants and itself from security risks

    Fourth NASA Goddard Conference on Mass Storage Systems and Technologies

    Get PDF
    This report contains copies of all those technical papers received in time for publication just prior to the Fourth Goddard Conference on Mass Storage and Technologies, held March 28-30, 1995, at the University of Maryland, University College Conference Center, in College Park, Maryland. This series of conferences continues to serve as a unique medium for the exchange of information on topics relating to the ingestion and management of substantial amounts of data and the attendant problems involved. This year's discussion topics include new storage technology, stability of recorded media, performance studies, storage system solutions, the National Information infrastructure (Infobahn), the future for storage technology, and lessons learned from various projects. There also will be an update on the IEEE Mass Storage System Reference Model Version 5, on which the final vote was taken in July 1994

    Adaptation-Aware Architecture Modeling and Analysis of Energy Efficiency for Software Systems

    Get PDF
    This thesis presents an approach for the design time analysis of energy efficiency for static and self-adaptive software systems. The quality characteristics of a software system, such as performance and operating costs, strongly depend upon its architecture. Software architecture is a high-level view on software artifacts that reflects essential quality characteristics of a system under design. Design decisions made on an architectural level have a decisive impact on the quality of a system. Revising architectural design decisions late into development requires significant effort. Architectural analyses allow software architects to reason about the impact of design decisions on quality, based on an architectural description of the system. An essential quality goal is the reduction of cost while maintaining other quality goals. Power consumption accounts for a significant part of the Total Cost of Ownership (TCO) of data centers. In 2010, data centers contributed 1.3% of the world-wide power consumption. However, reasoning on the energy efficiency of software systems is excluded from the systematic analysis of software architectures at design time. Energy efficiency can only be evaluated once the system is deployed and operational. One approach to reduce power consumption or cost is the introduction of self-adaptivity to a software system. Self-adaptive software systems execute adaptations to provision costly resources dependent on user load. The execution of reconfigurations can increase energy efficiency and reduce cost. If performed improperly, however, the additional resources required to execute a reconfiguration may exceed their positive effect. Existing architecture-level energy analysis approaches offer limited accuracy or only consider a limited set of system features, e.g., the used communication style. Predictive approaches from the embedded systems and Cloud Computing domain operate on an abstraction that is not suited for architectural analysis. The execution of adaptations can consume additional resources. The additional consumption can reduce performance and energy efficiency. Design time quality analyses for self-adaptive software systems ignore this transient effect of adaptations. This thesis makes the following contributions to enable the systematic consideration of energy efficiency in the architectural design of self-adaptive software systems: First, it presents a modeling language that captures power consumption characteristics on an architectural abstraction level. Second, it introduces an energy efficiency analysis approach that uses instances of our power consumption modeling language in combination with existing performance analyses for architecture models. The developed analysis supports reasoning on energy efficiency for static and self-adaptive software systems. Third, to ease the specification of power consumption characteristics, we provide a method for extracting power models for server environments. The method encompasses an automated profiling of servers based on a set of restrictions defined by the user. A model training framework extracts a set of power models specified in our modeling language from the resulting profile. The method ranks the trained power models based on their predicted accuracy. Lastly, this thesis introduces a systematic modeling and analysis approach for considering transient effects in design time quality analyses. The approach explicitly models inter-dependencies between reconfigurations, performance and power consumption. We provide a formalization of the execution semantics of the model. Additionally, we discuss how our approach can be integrated with existing quality analyses of self-adaptive software systems. We validated the accuracy, applicability, and appropriateness of our approach in a variety of case studies. The first two case studies investigated the accuracy and appropriateness of our modeling and analysis approach. The first study evaluated the impact of design decisions on the energy efficiency of a media hosting application. The energy consumption predictions achieved an absolute error lower than 5.5% across different user loads. Our approach predicted the relative impact of the design decision on energy efficiency with an error of less than 18.94%. The second case study used two variants of the Spring-based community case study system PetClinic. The case study complements the accuracy and appropriateness evaluation of our modeling and analysis approach. We were able to predict the energy consumption of both variants with an absolute error of no more than 2.38%. In contrast to the first case study, we derived all models automatically, using our power model extraction framework, as well as an extraction framework for performance models. The third case study applied our model-based prediction to evaluate the effect of different self-adaptation algorithms on energy efficiency. It involved scientific workloads executed in a virtualized environment. Our approach predicted the energy consumption with an error below 7.1%, even though we used coarse grained measurement data of low accuracy to train the input models. The fourth case study evaluated the appropriateness and accuracy of the automated model extraction method using a set of Big Data and enterprise workloads. Our method produced power models with prediction errors below 5.9%. A secondary study evaluated the accuracy of extracted power models for different Virtual Machine (VM) migration scenarios. The results of the fifth case study showed that our approach for modeling transient effects improved the prediction accuracy for a horizontally scaling application. Leveraging the improved accuracy, we were able to identify design deficiencies of the application that otherwise would have remained unnoticed

    Understanding Security Threats in Cloud

    Get PDF
    As cloud computing has become a trend in the computing world, understanding its security concerns becomes essential for improving service quality and expanding business scale. This dissertation studies the security issues in a public cloud from three aspects. First, we investigate a new threat called power attack in the cloud. Second, we perform a systematical measurement on the public cloud to understand how cloud vendors react to existing security threats. Finally, we propose a novel technique to perform data reduction on audit data to improve system capacity, and hence helping to enhance security in cloud. In the power attack, we exploit various attack vectors in platform as a service (PaaS), infrastructure as a service (IaaS), and software as a service (SaaS) cloud environments. to demonstrate the feasibility of launching a power attack, we conduct series of testbed based experiments and data-center-level simulations. Moreover, we give a detailed analysis on how different power management methods could affect a power attack and how to mitigate such an attack. Our experimental results and analysis show that power attacks will pose a serious threat to modern data centers and should be taken into account while deploying new high-density servers and power management techniques. In the measurement study, we mainly investigate how cloud vendors have reacted to the co-residence threat inside the cloud, in terms of Virtual Machine (VM) placement, network management, and Virtual Private Cloud (VPC). Specifically, through intensive measurement probing, we first profile the dynamic environment of cloud instances inside the cloud. Then using real experiments, we quantify the impacts of VM placement and network management upon co-residence, respectively. Moreover, we explore VPC, which is a defensive service of Amazon EC2 for security enhancement, from the routing perspective. Advanced Persistent Threat (APT) is a serious cyber-threat, cloud vendors are seeking solutions to ``connect the suspicious dots\u27\u27 across multiple activities. This requires ubiquitous system auditing for long period of time, which in turn causes overwhelmingly large amount of system audit logs. We propose a new approach that exploits the dependency among system events to reduce the number of log entries while still supporting high quality forensics analysis. In particular, we first propose an aggregation algorithm that preserves the event dependency in data reduction to ensure high quality of forensic analysis. Then we propose an aggressive reduction algorithm and exploit domain knowledge for further data reduction. We conduct a comprehensive evaluation on real world auditing systems using more than one-month log traces to validate the efficacy of our approach

    Test, Control and Monitor System (TCMS) operations plan

    Get PDF
    The purpose is to provide a clear understanding of the Test, Control and Monitor System (TCMS) operating environment and to describe the method of operations for TCMS. TCMS is a complex and sophisticated checkout system focused on support of the Space Station Freedom Program (SSFP) and related activities. An understanding of the TCMS operating environment is provided and operational responsibilities are defined. NASA and the Payload Ground Operations Contractor (PGOC) will use it as a guide to manage the operation of the TCMS computer systems and associated networks and workstations. All TCMS operational functions are examined. Other plans and detailed operating procedures relating to an individual operational function are referenced within this plan. This plan augments existing Technical Support Management Directives (TSMD's), Standard Practices, and other management documentation which will be followed where applicable
    • …
    corecore