17 research outputs found

    Exploiting cost-performance tradeoffs for modern cloud systems

    The trade-off between cost and performance is a fundamental challenge for modern cloud systems. This thesis explores cost-performance tradeoffs for three types of systems that permeate today's clouds, namely (1) storage, (2) virtualization, and (3) computation. A distributed key-value storage system must choose between the cost of keeping replicas synchronized (consistency) and the performance (latency) of read/write operations. A cloud-based disaster recovery system can reduce the cost of managing a group of VMs as a single unit for recovery by implementing this abstraction in software (instead of hardware), at the risk of impacting application availability. As another example, the run-time performance of graph analytics jobs sharing a multi-tenant cluster can be improved by trading off the cost of replicating the input graph dataset stored in the associated distributed file system. Today, cloud system providers have to manually tune their systems to meet desired trade-offs. This can be challenging, since the optimal trade-off between cost and performance may vary with network and workload conditions. Our hypothesis is therefore that it is feasible to imbue a wide variety of cloud systems with adaptive and opportunistic mechanisms that efficiently navigate the cost-performance tradeoff space to meet desired tradeoffs. The types of cloud systems considered in this thesis include key-value stores, cloud-based disaster recovery systems, and multi-tenant graph computation engines. Our first contribution, PCAP, is an adaptive distributed storage system. The foundation of the PCAP system is a probabilistic variation of the classical CAP theorem, which quantifies the (un-)achievable envelope of probabilistic consistency and latency under different network conditions characterized by a probabilistic partition model. Our PCAP system proposes adaptive mechanisms for tuning control knobs to meet desired consistency-latency tradeoffs expressed as service-level agreements (SLAs). Our second system, GeoPCAP, is a geo-distributed extension of PCAP. In GeoPCAP, we propose generalized probabilistic composition rules for composing consistency-latency tradeoffs across geo-distributed instances of distributed key-value stores, each running in a separate data center. GeoPCAP also includes a geo-distributed adaptive control system that adapts new control knobs to meet SLAs across geo-distributed data centers. Our third system, GCVM, proposes a lightweight hypervisor-managed mechanism for taking crash-consistent snapshots across VMs distributed over servers. This mechanism enables us to move the consistency group abstraction from hardware to software, and thus lowers reconfiguration cost while incurring modest VM pause times that impact application availability. Finally, our fourth contribution is a new opportunistic graph processing system called OPTiC for efficiently scheduling multiple graph analytics jobs sharing a multi-tenant cluster. By opportunistically creating at most one additional replica in the distributed file system (thus incurring cost), we show up to a 50% reduction in median job completion time for graph processing jobs under realistic network and workload conditions. Thus, with a modest increase in storage and disk bandwidth cost, we can reduce job completion time (improve performance). For the first two systems (PCAP and GeoPCAP), we exploit the cost-performance tradeoff by navigating the tradeoff space efficiently to meet SLAs and perform close to the optimal tradeoff. For the third (GCVM) and fourth (OPTiC) systems, we move from one solution point to another in the tradeoff space. For the last two systems, explicitly mapping out the tradeoff space allows us to consider new design tradeoffs for these systems.
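
    To make the adaptive-mechanism idea concrete, the following minimal sketch shows a PCAP-style control loop in miniature: a single "read delay" knob is raised when too many reads return stale values and lowered when the latency target is at risk. The simulated store, the metric names, the SLA targets, and the step size are all illustrative assumptions for this sketch, not the controller described in the thesis.

        import random
        from dataclasses import dataclass

        @dataclass
        class KVStoreSim:
            # Toy stand-in for a key-value store: a larger read delay lowers the
            # chance of a stale read but raises the chance of a slow read.
            read_delay_ms: float = 0.0

            def sample_window(self, n=1000):
                p_stale = max(0.0, 0.20 - 0.02 * self.read_delay_ms)
                p_slow = min(1.0, 0.01 + 0.015 * self.read_delay_ms)
                stale = sum(random.random() < p_stale for _ in range(n))
                slow = sum(random.random() < p_slow for _ in range(n))
                return stale / n, slow / n

        def adapt(store, sla_stale=0.05, sla_slow=0.10, step_ms=1.0):
            # One iteration of the control loop: measure, then nudge the knob.
            stale, slow = store.sample_window()
            if stale > sla_stale:
                store.read_delay_ms += step_ms      # favour consistency
            elif slow > sla_slow and store.read_delay_ms >= step_ms:
                store.read_delay_ms -= step_ms      # give latency back
            return stale, slow

        store = KVStoreSim()
        for _ in range(20):
            adapt(store)
        print(f"converged read delay: {store.read_delay_ms:.1f} ms")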

    Service Level Agreement-based adaptation management for Internet Service Provider (ISP) using Fuzzy Q-learning

    Internet access is a vital catalyst for online users, and the number of mobile subscribers is predicted to grow dramatically in the next few years. This huge demand is the main issue facing Internet Service Providers (ISPs), who need to meet users' expectations with their current resources. An adaptive mechanism within the ISP architecture is a promising solution to such situations. A Service Level Agreement (SLA) is the legal instrument for monitoring contract violations between end users and ISPs and is embedded within a Quality of Service (QoS) framework. It strengthens and advances the quality of control over users' applications and network resources, and can be further stretched to fulfill the QoS terms through negotiation and re-negotiation. Moreover, the present literature does not address the combination of rule-based approaches and adaptation for updating the established learning repository. Therefore, the main aim of this research, in the context of SLAs, is to fill this gap by addressing the combination of rule-based uncertainties and iterative learning. The key to the proposed architecture is the use of self-* capabilities designed to provide self-management over uncertainties and self-adaptive interactions. Thus, the Monitor, Analyse, Plan, Execute and Knowledge Base (MAPE-K) approach is able to deal with this problem, together with the integration of fuzzy logic and Q-Learning algorithms. The proposed architecture sits in the context of autonomic computing. An adaptation manager is the main proposed component; it updates admission control over the ISP's current resources and manages SLAs. A general type-2 fuzzy logic methodology is applied to ensure that uncertainties and precise decision-making are well addressed in this research. The proposed solution demonstrates that Q-Learning works adaptively with QoS parameters such as latency, availability, and packet loss. With the combination of fuzzy logic and Q-Learning, we demonstrate that the proposed adaptation manager is able to handle both uncertainties and learning. Q-Learning is able to identify the initial state from various ISP iterations and update it with appropriate actions, reflecting the reward configurations. The more iterations the process performs, the greater the learning ability, rewards, and exploration probability. The research outcomes benefit the SLA framework by incorporating the information for SLA policies and Service Level Objectives (SLOs). Lastly, an important contribution is the demonstration that the MAPE-K approach is a contender for ISP SLA-based frameworks for QoS provision.
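
    As a rough illustration of the learning half of such an adaptation manager, the sketch below applies a standard Q-Learning update to coarse QoS states derived from latency, availability, and packet loss. The state encoding, action set, reward values, and thresholds are assumptions made for this sketch; in particular, the crisp binning stands in for the type-2 fuzzy inference used in the proposed architecture.

        import random
        from collections import defaultdict

        ACTIONS = ["admit", "throttle", "renegotiate_sla"]   # illustrative action set

        def qos_state(latency_ms, availability, packet_loss):
            # Crude stand-in for fuzzification: bin each QoS parameter against an SLO.
            return (latency_ms > 100, availability < 0.99, packet_loss > 0.01)

        def reward(latency_ms, availability, packet_loss):
            # Positive reward when all SLOs hold, negative per violated SLO.
            violations = (latency_ms > 100) + (availability < 0.99) + (packet_loss > 0.01)
            return 1.0 if violations == 0 else -float(violations)

        Q = defaultdict(float)                 # (state, action) -> estimated value
        alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

        def choose_action(state):
            if random.random() < epsilon:                      # explore
                return random.choice(ACTIONS)
            return max(ACTIONS, key=lambda a: Q[(state, a)])   # exploit

        def update(state, action, r, next_state):
            # Standard Q-Learning temporal-difference update.
            best_next = max(Q[(next_state, a)] for a in ACTIONS)
            Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])

        # Example of one MAPE-K cycle with made-up QoS readings:
        s = qos_state(150, 0.995, 0.02)
        a = choose_action(s)
        s2 = qos_state(90, 0.999, 0.005)       # QoS after the action took effect
        update(s, a, reward(90, 0.999, 0.005), s2)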

    Edge/Fog Computing Technologies for IoT Infrastructure

    The prevalence of smart devices and cloud computing has led to an explosion in the amount of data generated by IoT devices. Moreover, emerging IoT applications, such as augmented and virtual reality (AR/VR), intelligent transportation systems, and smart factories, require ultra-low latency for data communication and processing. Fog/edge computing is a new computing paradigm in which fully distributed fog/edge nodes located near end devices provide computing resources. By analyzing, filtering, and processing data at local fog/edge resources, instead of transferring tremendous amounts of data to centralized cloud servers, fog/edge computing can significantly reduce processing delay and network traffic. With these advantages, fog/edge computing is expected to be one of the key enabling technologies for building the IoT infrastructure. Aiming to explore recent research and development on fog/edge computing technologies for building an IoT infrastructure, this book collects 10 articles. The selected articles cover diverse topics such as resource management, service provisioning, task offloading and scheduling, container orchestration, and security on edge/fog computing infrastructure, and can help readers grasp recent trends as well as state-of-the-art algorithms in fog/edge computing technologies.
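
    As a back-of-the-envelope illustration of the delay argument above, the toy model below compares processing a request on the end device, at a nearby fog/edge node, and at a distant cloud. The linear transfer-plus-compute model and every number in it are assumptions chosen only to make the comparison concrete.

        def completion_time_ms(data_mb, bandwidth_mbps, rtt_ms, work_per_mb, node_speed):
            # Transfer time (RTT plus serialization) plus compute time; work_per_mb is
            # milliseconds of compute per MB on the end device, node_speed is the
            # node's speed relative to the device. bandwidth_mbps = 0 means "no transfer".
            transfer = (rtt_ms + data_mb * 8 / bandwidth_mbps * 1000) if bandwidth_mbps else 0.0
            compute = data_mb * work_per_mb / node_speed
            return transfer + compute

        data = 10  # MB of sensor / AR frames per request (illustrative)
        local = completion_time_ms(data, bandwidth_mbps=0,   rtt_ms=0,  work_per_mb=100, node_speed=1)
        edge  = completion_time_ms(data, bandwidth_mbps=100, rtt_ms=5,  work_per_mb=100, node_speed=20)
        cloud = completion_time_ms(data, bandwidth_mbps=50,  rtt_ms=80, work_per_mb=100, node_speed=100)
        print(f"local={local:.0f} ms  edge={edge:.0f} ms  cloud={cloud:.0f} ms")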

    Systems Support for Trusted Execution Environments

    Cloud computing has become a default choice for data processing by both large corporations and individuals due to its economy of scale and ease of system management. However, the question of trust and trustworthy computing inside cloud environments has long been neglected in practice and is further exacerbated by the proliferation of AI and its use for processing sensitive user data. Attempts to implement mechanisms for trustworthy computing in the cloud have previously remained theoretical due to the lack of hardware primitives in commodity CPUs, while the combination of Secure Boot, TPMs, and virtualization has seen only limited adoption. The situation changed in 2016, when Intel introduced Software Guard Extensions (SGX) and its enclaves to the x86 ISA: for the first time, it became possible to build trustworthy applications relying on a commonly available technology. However, Intel SGX posed challenges to practitioners, who discovered the limitations of this technology, from the limited support for legacy applications and the integration of SGX enclaves into existing systems, to performance bottlenecks in communication, startup, and memory utilization. In this thesis, our goal is to enable trustworthy computing in the cloud by relying on these imperfect SGX primitives. To this end, we develop and evaluate solutions to issues stemming from the limited systems support for Intel SGX: we investigate mechanisms for runtime support of POSIX applications with SCONE, an efficient SGX runtime library developed with the performance limitations of SGX in mind. We further develop this topic with FFQ, a concurrent queue for SCONE's asynchronous system call interface. ShieldBox is our study of the interplay of kernel-bypass and trusted execution technologies for NFV, which also tackles the problem of low-latency clocks inside an enclave. The last two systems, Clemmys and T-Lease, are built on the more recent SGXv2 ISA extension. In Clemmys, SGXv2 allows us to significantly reduce the startup time of SGX-enabled functions inside a Function-as-a-Service platform. Finally, in T-Lease we solve the problem of trusted time by introducing a trusted lease primitive for distributed systems. We evaluate all of these systems and show that they can be practically deployed in existing systems with minimal overhead, and can be combined with both legacy systems and other SGX-based solutions. In the course of the thesis, we enable trusted computing for individual applications, high-performance network functions, and a distributed computing framework, making the vision of trusted cloud computing a reality.
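
    As a conceptual illustration of the asynchronous system call pattern mentioned above, the sketch below lets "enclave" threads enqueue syscall requests for untrusted worker threads to execute, instead of exiting the enclave on every call. It is not SCONE's or FFQ's implementation (FFQ is a lock-free queue, whereas Python's queue.Queue locks internally); it only shows the shape of the interface.

        import os, queue, threading

        requests = queue.Queue()   # shared request queue between "inside" and "outside"

        def untrusted_syscall_worker():
            # Runs outside the "enclave": pops requests, performs the real call,
            # and posts the result back on the per-request reply queue.
            while True:
                fn, args, reply = requests.get()
                reply.put(fn(*args))

        def enclave_syscall(fn, *args):
            # Called by "enclave" threads: enqueue the request instead of exiting
            # the enclave, then wait for the worker's result.
            reply = queue.Queue(maxsize=1)
            requests.put((fn, args, reply))
            return reply.get()

        threading.Thread(target=untrusted_syscall_worker, daemon=True).start()
        print(enclave_syscall(os.getpid))   # e.g. getpid serviced by the outside worker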

    Advances in Information Security and Privacy

    With the recent pandemic emergency, many people have been spending their days in smart working and have increased their use of digital resources for both work and entertainment. As a result, the amount of digital information handled online has increased dramatically, and we can observe a significant increase in the number of attacks, breaches, and hacks. This Special Issue aims to establish the state of the art in protecting information by mitigating information risks. This objective is reached by presenting both surveys on specific topics and original approaches and solutions to specific problems. In total, 16 papers have been published in this Special Issue.

    Semantic discovery and reuse of business process patterns

    Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting reusability of recurrent generalized models in the very early stages of development. As a statement of research in progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.