
    Exploiting cost-performance tradeoffs for modern cloud systems

    The trade-off between cost and performance is a fundamental challenge for modern cloud systems. This thesis explores cost-performance tradeoffs for three types of systems that permeate today's clouds: (1) storage, (2) virtualization, and (3) computation. A distributed key-value storage system must choose between the cost of keeping replicas synchronized (consistency) and the performance (latency) of read/write operations. A cloud-based disaster recovery system can reduce the cost of managing a group of VMs as a single unit for recovery by implementing this abstraction in software (instead of hardware), at the risk of impacting application availability. As another example, the run-time performance of graph analytics jobs sharing a multi-tenant cluster can be improved by trading off the cost of replicating the input graph dataset stored in the associated distributed file system. Today, cloud providers must manually tune their systems to meet desired trade-offs. This is challenging because the optimal trade-off between cost and performance may vary with network and workload conditions. Our hypothesis is thus that it is feasible to imbue a wide variety of cloud systems with adaptive and opportunistic mechanisms that efficiently navigate the cost-performance tradeoff space to meet desired tradeoffs. The types of cloud systems considered in this thesis are key-value stores, cloud-based disaster recovery systems, and multi-tenant graph computation engines. Our first contribution, PCAP, is an adaptive distributed storage system. The foundation of PCAP is a probabilistic variation of the classical CAP theorem, which quantifies the (un-)achievable envelope of probabilistic consistency and latency under different network conditions characterized by a probabilistic partition model. PCAP provides adaptive mechanisms for tuning control knobs to meet desired consistency-latency tradeoffs expressed as service-level agreements (SLAs). Our second system, GeoPCAP, is a geo-distributed extension of PCAP. In GeoPCAP, we propose generalized probabilistic composition rules for composing consistency-latency tradeoffs across geo-distributed instances of distributed key-value stores, each running in a separate data-center. GeoPCAP also includes a geo-distributed adaptive control system that tunes new control knobs to meet SLAs across geo-distributed data-centers. Our third system, GCVM, proposes a lightweight hypervisor-managed mechanism for taking crash-consistent snapshots across VMs distributed over servers. This mechanism moves the consistency-group abstraction from hardware to software, lowering reconfiguration cost while incurring modest VM pause times that affect application availability. Finally, our fourth contribution is OPTiC, a new opportunistic graph processing system for efficiently scheduling multiple graph analytics jobs sharing a multi-tenant cluster. By opportunistically creating at most one additional replica in the distributed file system (thus incurring cost), we show up to 50% reduction in median job completion time for graph processing jobs under realistic network and workload conditions. Thus, with a modest increase in storage and bandwidth cost on disk, we can reduce job completion time (improve performance). For the first two systems (PCAP and GeoPCAP), we exploit the cost-performance tradeoff space by efficiently navigating it to meet SLAs while performing close to the optimal tradeoff. For the third (GCVM) and fourth (OPTiC) systems, we move from one solution point to another in the tradeoff space. For the last two systems, explicitly mapping out the tradeoff space allows us to consider new design tradeoffs.
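
    To make the adaptive-knob idea concrete, the following is a minimal Python sketch of a PCAP-style feedback loop. It is not the thesis's actual controller: the exponential staleness model, the read-delay knob, the step size, and all names are illustrative assumptions. The point is only the feedback shape, paying read latency to push the probability of a stale read under a consistency SLA, and reclaiming latency when the SLA has slack.

        # Illustrative sketch only; the staleness model and knob are assumptions,
        # not PCAP's actual mechanism.
        import random

        def measure_staleness_prob(read_delay_ms, network_delay_ms=10.0, samples=1000):
            """Estimate P(stale read): a read is stale if it completes before
            the latest write has propagated (toy exponential-delay model)."""
            stale = 0
            for _ in range(samples):
                propagation = random.expovariate(1.0 / network_delay_ms)
                if read_delay_ms < propagation:
                    stale += 1
            return stale / samples

        def adapt_knob(sla_staleness=0.05, step_ms=1.0, rounds=50):
            """Nudge the read-delay knob toward the consistency SLA."""
            read_delay = 0.0
            for _ in range(rounds):
                p_stale = measure_staleness_prob(read_delay)
                if p_stale > sla_staleness:
                    read_delay += step_ms                        # buy consistency with latency
                else:
                    read_delay = max(0.0, read_delay - step_ms)  # reclaim latency
            return read_delay

        if __name__ == "__main__":
            print(f"converged read delay: {adapt_knob():.1f} ms")

    Under this toy model the knob settles near the delay at which the residual propagation probability matches the SLA; a real controller would also react to shifting network conditions rather than a fixed delay distribution.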

    Reliable causal delivery with probabilistic design

    Ensuring reliable and ordered communication between computers usually requires acknowledgment messages. In systems with a high rate of broadcast communication, the cost of such acknowledgment messages can be large. We propose to use the causal-ordering information already required by some applications to detect and request missing messages. To limit the number of unnecessary requests, we combine local awareness and probabilistic methods. Our model obtains reliable communication with latency equivalent to unordered communication and lower network usage than acknowledgment-based systems.
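
    As an illustration of the gap-detection idea, the Python sketch below (all names invented) infers lost broadcasts from the per-sender counters carried in each message's vector clock, so a receiver re-requests only messages that are provably missing instead of acknowledging every delivery. The probabilistic filtering and local awareness described in the abstract are omitted from this sketch.

        # Sketch under assumed semantics: vclock[p] = number of p's messages the
        # sender had delivered when it sent this message; vclock[sender] is this
        # message's own sequence number.
        from dataclasses import dataclass

        @dataclass
        class Message:
            sender: str
            vclock: dict    # peer id -> per-sender message count

        class CausalReceiver:
            def __init__(self):
                self.delivered = {}     # peer id -> messages delivered so far

            def on_receive(self, msg: Message):
                """Return the list of (peer, seqno) gaps to re-request;
                deliver the message only if there are none."""
                missing = []
                for peer, seen in msg.vclock.items():
                    have = self.delivered.get(peer, 0)
                    # The sender's own entry counts this message itself.
                    upto = seen - 1 if peer == msg.sender else seen
                    if upto > have:
                        missing += [(peer, n) for n in range(have + 1, upto + 1)]
                if missing:
                    return missing      # a real system would also buffer msg
                self.delivered[msg.sender] = msg.vclock[msg.sender]
                return []

        rx = CausalReceiver()
        print(rx.on_receive(Message("a", {"a": 2})))   # [('a', 1)]: a's first message was lost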

    Causal Consistency and Latency Optimality: Friend or Foe? [Extended Version]

    Causal consistency is an attractive consistency model for geo-replicated data stores. It is provably the strongest model that tolerates network partitions. It avoids the long latencies associated with strong consistency, and, especially when using read-only transactions (ROTs), it prevents many of the anomalies of weaker consistency models. Recent work has shown that causal consistency allows "latency-optimal" ROTs, which are nonblocking, single-round, and single-version in terms of communication. On the surface, this latency optimality is very appealing, as the vast majority of applications are assumed to have read-dominated workloads. In this paper, we show that such "latency-optimal" ROTs induce an extra overhead on writes that is so high that it actually jeopardizes performance even in read-dominated workloads. We show this result from both a practical and a theoretical angle. We present the Contrarian protocol, which implements "almost latency-optimal" ROTs but does not impose on writes any of the overheads present in latency-optimal protocols. In Contrarian, ROTs are nonblocking and single-version, but they require two rounds of client-server communication. We experimentally show that this protocol not only achieves higher throughput but, surprisingly, also provides better latencies for all but the lowest loads and the most read-heavy workloads. We furthermore prove that the extra overhead imposed on writes by latency-optimal ROTs is inherent, i.e., it is not an artifact of the design we consider and cannot be avoided by any implementation of latency-optimal ROTs. We show in particular that this overhead grows linearly with the number of clients.
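
    To clarify the protocol shape being compared, here is a minimal Python sketch of a two-round, single-version ROT in the spirit of Contrarian. The names, the wall-clock timestamps, and the key-routing scheme are our own simplifications, not the paper's protocol: round one picks a snapshot no server has to block on, and round two reads each key at that snapshot, returning exactly one version per key.

        # Toy multiversioned store; wall clocks stand in for the protocol's
        # stabilization timestamps.
        import time

        class Server:
            def __init__(self):
                self.versions = {}                       # key -> list of (ts, value)

            def put(self, key, value):
                self.versions.setdefault(key, []).append((time.time(), value))

            def stable_ts(self):
                return time.time()                       # assumed "safe snapshot" point

            def read_at(self, key, snapshot_ts):
                older = [(ts, v) for ts, v in self.versions.get(key, [])
                         if ts <= snapshot_ts]
                return max(older)[1] if older else None  # latest version <= snapshot

        def home(servers, key):
            return servers[hash(key) % len(servers)]     # toy key-to-partition routing

        def read_only_txn(servers, keys):
            snapshot = min(s.stable_ts() for s in servers)    # round 1: pick snapshot
            return {k: home(servers, k).read_at(k, snapshot)  # round 2: one version per key
                    for k in keys}

        servers = [Server(), Server()]
        home(servers, "x").put("x", 1)
        home(servers, "y").put("y", 2)
        print(read_only_txn(servers, ["x", "y"]))        # {'x': 1, 'y': 2}

    The extra round buys cheap writes: servers never have to attach or maintain the metadata that "latency-optimal" single-round ROTs need to stay nonblocking.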

    Designing Scalable Mechanisms for Geo-Distributed Platform Services in the Presence of Client Mobility

    Situation-awareness applications require low-latency response and high network bandwidth, and hence benefit from geo-distributed Edge infrastructures. The developers of these applications typically rely on several platform services, such as Kubernetes, Apache Cassandra, and Pulsar, for managing their compute and data components across the geo-distributed Edge infrastructure. Situation-awareness applications impose peculiar requirements on the compute and data placement policies of the platform services. First, the processing logic of these applications is closely tied to the physical environment it interacts with; hence, the access pattern to compute and data exhibits strong spatial affinity. Second, the network topology of Edge infrastructure is heterogeneous, and communication latency forms a significant portion of the end-to-end compute and data access latency. Therefore, the placement of compute and data components has to be cognizant of the spatial affinity and latency requirements of the applications. However, clients of situation-awareness applications, such as vehicles and drones, are typically mobile, making the compute and data access pattern dynamic and complicating the management of data and compute components. Constant changes in the network connectivity and spatial locality of clients due to client mobility render the current placement of compute and data components unsuitable for meeting the latency and spatial affinity requirements of the application. Constant client mobility necessitates that client location and the latency offered by the platform services be continuously monitored, to detect when application requirements are violated and to adapt the compute and data placement. The control and monitoring modules of off-the-shelf platform services lack the primitives needed to incorporate spatial affinity and network-topology awareness into their compute and data placement policies. The spatial location of clients is not considered as an input for decision-making in their control modules. Furthermore, they do not perform fine-grained end-to-end monitoring of observed latency to detect and adapt to performance degradations due to client mobility. This dissertation presents three mechanisms that inform the compute and data placement policies of platform services, so that application requirements can be met: M1, Dynamic Spatial Context Management for system entities (clients and data and compute components) to ensure spatial affinity requirements are satisfied; M2, Network Proximity Estimation to provide topology-awareness to the data and compute placement policies of platform services; and M3, End-to-End Latency Monitoring to enable collection, aggregation, and analysis of per-application metrics in a geo-distributed manner, providing end-to-end insight into application performance. The thesis of our work is that these mechanisms are fundamental building blocks for the compute and data management policies of platform services, and that by incorporating them, platform services can meet application requirements at the Edge. Furthermore, the proposed mechanisms can be implemented in a way that offers high scalability to handle high levels of client activity. We demonstrate by construction the efficacy and scalability of the proposed mechanisms for building dynamic compute and data orchestration policies by incorporating them in the control and monitoring modules of three different platform services. Specifically, we incorporate these mechanisms into a topic-based publish-subscribe system (ePulsar), an application orchestration platform (OneEdge), and a key-value store (FogStore). We conduct extensive performance evaluations of these enhanced platform services to show how the new mechanisms help dynamically adapt compute/data orchestration decisions to satisfy the performance requirements of applications.
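
    As a concrete hint of what M2 and M3 could look like at their simplest, the Python sketch below (invented names, not the dissertation's implementation) keeps a smoothed RTT estimate per edge node, ranks nodes by network proximity, and flags nodes whose observed latency violates an SLA, the signal that would trigger re-placement after a client moves.

        # Assumed simplification: proximity estimated from smoothed RTT samples
        # (EWMA), rather than any specific coordinate or probing scheme.
        class LatencyMonitor:
            def __init__(self, alpha=0.2, sla_ms=50.0):
                self.alpha = alpha          # smoothing factor for new samples
                self.sla_ms = sla_ms        # per-application latency target
                self.estimates = {}         # node -> smoothed RTT in ms

            def observe(self, node, rtt_ms):
                prev = self.estimates.get(node, rtt_ms)
                self.estimates[node] = (1 - self.alpha) * prev + self.alpha * rtt_ms

            def nearest(self):
                """M2: rank candidate nodes by estimated proximity."""
                return min(self.estimates, key=self.estimates.get)

            def violations(self):
                """M3: nodes whose observed latency breaks the SLA."""
                return [n for n, rtt in self.estimates.items() if rtt > self.sla_ms]

        monitor = LatencyMonitor()
        for node, rtt in [("edge-a", 12), ("edge-b", 80), ("edge-a", 18)]:
            monitor.observe(node, rtt)
        print(monitor.nearest(), monitor.violations())   # edge-a ['edge-b']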