
    Lifeguard: Local Health Awareness for More Accurate Failure Detection

    SWIM is a peer-to-peer group membership protocol with attractive scaling and robustness properties. However, slow message processing can cause SWIM to mark healthy members as failed (so-called false positive failure detection), despite the inclusion of a mechanism intended to avoid this. We identify the properties of SWIM that lead to the problem and propose Lifeguard, a set of extensions to SWIM that allow for the possibility that the local failure detector module is itself at fault, via the concept of local health. We evaluate this approach in a precisely controlled environment and validate it in a real-world scenario, showing that it drastically reduces the rate of false positives. The false positive rate and the detection time for true failures can be reduced simultaneously, compared to the baseline levels of SWIM.
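
    Lifeguard's key idea lends itself to a short sketch: a member keeps a "local health" score that worsens whenever there is evidence the member itself may be slow (e.g., a missed ack) and recovers as probes succeed, and that score scales the probe timeout before any peer is declared failed. The Python below is a minimal illustration of that mechanism under assumed names and parameter values; it is not the authors' implementation.

    class LocalHealthProbe:
        """Minimal sketch of Lifeguard-style local health awareness.

        base_timeout and max_multiplier are illustrative values, not
        parameters taken from the paper.
        """

        def __init__(self, base_timeout=0.5, max_multiplier=8):
            self.base_timeout = base_timeout
            self.max_multiplier = max_multiplier
            self.lhm = 1  # local health multiplier: 1 = fully healthy

        def current_timeout(self):
            # A locally unhealthy member waits longer before suspecting peers.
            return self.base_timeout * self.lhm

        def on_probe_result(self, ack_received):
            if ack_received:
                # Evidence we are processing messages in time: recover health.
                self.lhm = max(1, self.lhm - 1)
            else:
                # A missed ack may mean *we* are slow rather than the peer
                # being dead, so penalize our own health first.
                self.lhm = min(self.max_multiplier, self.lhm + 1)

    def probe(peer_response_time, detector):
        """Return True if the peer counts as alive in this probe round."""
        alive = peer_response_time <= detector.current_timeout()
        detector.on_probe_result(ack_received=alive)
        return alive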

    An Analysis of Distributed Systems Syllabi With a Focus on Performance-Related Topics

    We analyze a dataset of 51 current (2019-2020) Distributed Systems syllabi from top Computer Science programs, focusing on the prevalence and context in which performance-related topics are taught in these courses. We also study the scale of the infrastructure mentioned in DS courses, from small client-server systems to cloud-scale, peer-to-peer, global-scale systems. We make eight main findings, covering goals such as performance, and scalability and its variant elasticity; activities such as performance benchmarking and monitoring; eight selected performance-enhancing techniques (replication, caching, sharding, load balancing, scheduling, streaming, migrating, and offloading); and control issues such as trade-offs that include performance and performance variability.
    Comment: Accepted for publication at WEPPE 2021, to be held in conjunction with ACM/SPEC ICPE 2021: https://doi.org/10.1145/3447545.3451197. This article is a follow-up to our prior ACM SIGCSE publication, arXiv:2012.0055

    RoBuSt: A Crash-Failure-Resistant Distributed Storage System

    In this work we present the first distributed storage system that is provably robust against crash failures issued by an adaptive adversary, i.e., for each batch of requests the adversary can decide, based on the entire system state, which servers will be unavailable for that batch of requests. Despite up to $\gamma n^{1/\log\log n}$ crashed servers, with $\gamma > 0$ constant and $n$ denoting the number of servers, our system can correctly process any batch of lookup and write requests (with at most a polylogarithmic number of requests issued at each non-crashed server) in at most a polylogarithmic number of communication rounds, with at most polylogarithmic time and work at each server and only a logarithmic storage overhead. Our system is based on previous work by Eikel and Scheideler (SPAA 2013), who presented IRIS, a distributed information system that is provably robust against the same kind of crash failures. However, IRIS is only able to serve lookup requests; handling both lookup and write requests has turned out to require major changes in the design of IRIS.
    Comment: Revised full version

    Identifying Node Falls In Mobile Wireless Networks: A Probabilistic Method

    We adopt a probabilistic approach and propose two node failure detection schemes that systematically combine localized monitoring, location estimation, and node collaboration. Extensive simulation results, in both connected and disconnected networks, demonstrate that our schemes achieve high failure detection rates (close to an upper bound) and low false positive rates, while incurring low communication overhead. Compared to approaches that use centralized monitoring, our approach has up to 80% lower communication overhead, with only slightly lower detection rates and slightly higher false positive rates. Moreover, our approach has the advantage that it is applicable to both connected and disconnected networks, whereas centralized monitoring is only applicable to connected networks.
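
    The probabilistic flavor of such schemes can be made concrete with a small Bayesian calculation: given a per-heartbeat message loss probability and a prior probability of failure, compute the posterior probability that a node has failed after k consecutive missed heartbeats, and report a failure only once that posterior crosses a threshold. The sketch below is illustrative only; the loss rate, prior, and threshold are assumptions, not the paper's model.

    def posterior_failure_probability(k_missed, p_loss=0.1, p_fail_prior=0.01):
        """Posterior P(failed | k consecutive heartbeats missed).

        Assumes independent per-heartbeat loss with probability p_loss and
        prior failure probability p_fail_prior (both illustrative values).
        """
        # A failed node misses every heartbeat; a live node misses k in a
        # row only through k independent message losses.
        p_missed_given_alive = p_loss ** k_missed
        numerator = 1.0 * p_fail_prior
        denominator = numerator + p_missed_given_alive * (1 - p_fail_prior)
        return numerator / denominator

    # Declare a failure once the posterior clears a threshold, e.g. 0.99:
    for k in range(1, 6):
        print(k, round(posterior_failure_probability(k), 4))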

    The eventual leadership in dynamic mobile networking environments


    Node Failure Detection And Fault Management In Mobile Wireless Networks With Persistent Connectivity

    We adopt a probabilistic approach and propose two node failure detection schemes that systematically combine localized monitoring, location estimation, and node collaboration. Extensive simulation results, in both connected and disconnected networks, show that our schemes achieve high failure detection rates (close to an upper bound) and low false positive rates, and incur low communication overhead. Compared to approaches that use centralized monitoring, our approach has up to 80% lower communication overhead, with only slightly lower detection rates and slightly higher false positive rates. It also has the advantage of being applicable to both connected and disconnected networks, whereas centralized monitoring applies only to connected networks.

    Failure Detectors for Wireless Sensor-Actuator Systems

    Wireless sensor-actuator systems (WSAS) offer exciting opportunities for emerging applications by facilitating fine-grained monitoring and control and dense instrumentation. The large scale of such systems increases the need to tolerate and cope with failures in a localized and decentralized manner. We present abstractions for detecting node failures and link failures caused by topology changes in a WSAS. These abstractions were designed and implemented as a set of reusable components in nesC under TinyOS. We present results, based on experiments on an 80-node testbed, that demonstrate the performance and viability of the abstractions. In the future, these abstractions can be extended to detect and cope with larger classes of failures in WSAS.
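
    As a structural illustration of such an abstraction (the paper's components are written in nesC under TinyOS; the Python below is only a sketch with assumed names), a node can keep a last-heard timestamp per neighbor and periodically sweep the table, distinguishing a broken link (the silent neighbor is still heard by others) from a failed node (nobody hears it).

    import time

    class NeighborMonitor:
        """Sketch of a node/link failure detection abstraction.

        silence_tolerance is an assumed parameter; gossip_reports stands
        in for what a real WSAS node would learn from its neighbors.
        """

        def __init__(self, silence_tolerance=5.0):
            self.silence_tolerance = silence_tolerance
            self.last_heard = {}  # neighbor id -> time of last message

        def on_message(self, neighbor_id):
            self.last_heard[neighbor_id] = time.monotonic()

        def sweep(self, gossip_reports):
            """Classify neighbors that have been silent too long.

            gossip_reports: dict mapping neighbor id -> True if some other
            node still hears it (learned via gossip in a real system).
            """
            now = time.monotonic()
            events = []
            for nid, ts in self.last_heard.items():
                if now - ts <= self.silence_tolerance:
                    continue  # heard recently enough
                if gossip_reports.get(nid, False):
                    events.append((nid, "link-failure"))  # others still hear it
                else:
                    events.append((nid, "node-failure"))  # nobody hears it
            return events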

    Automated Application-level Checkpointing of MPI Programs

    Because of increasing hardware and software complexity, the running time of many computational science applications now exceeds the mean time to failure of high-performance computing platforms. Therefore, computational science applications need to tolerate hardware failures. In this paper, we focus on the stopping failure model, in which a faulty process hangs and stops responding to the rest of the system. We argue that tolerating such faults is best done by an approach called application-level coordinated non-blocking checkpointing, and that the existing fault-tolerance protocols in the literature are not suitable for implementing this approach. We present a suitable protocol and show how it can be used with a precompiler that instruments C/MPI programs to save application and MPI library state. An advantage of our approach is that it is independent of the MPI implementation. We present experimental results showing that the overhead of using our system can be small.
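
    To make the contrast concrete: application-level checkpointing means the program itself saves exactly the state it needs to restart, rather than a system-level tool dumping a whole process image. Below is a minimal, serial sketch of that idea; the file name, interval, and state layout are assumptions for illustration, and the paper's actual contribution (a coordinated non-blocking protocol plus a precompiler that also captures MPI library state) goes well beyond it.

    import os
    import pickle

    CHECKPOINT_FILE = "state.ckpt"  # assumed name
    CHECKPOINT_EVERY = 100          # assumed interval, in iterations

    def restore_or_init():
        if os.path.exists(CHECKPOINT_FILE):
            with open(CHECKPOINT_FILE, "rb") as f:
                return pickle.load(f)   # resume after a failure
        return {"step": 0, "acc": 0.0}  # fresh start

    def checkpoint(state):
        # Write then rename, so a crash mid-write cannot corrupt the
        # previous checkpoint.
        tmp = CHECKPOINT_FILE + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp, CHECKPOINT_FILE)

    state = restore_or_init()
    while state["step"] < 1000:
        state["acc"] += state["step"] * 0.5  # stand-in for real computation
        state["step"] += 1
        if state["step"] % CHECKPOINT_EVERY == 0:
            checkpoint(state)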

    Right On Time Distributed Shared Memory

    The demand for real-time data storage in distributed control systems (DCSs) is growing. Yet, providing real-time DCS guarantees is challenging, especially as more and more sensor and actuator devices are connected to industrial plants and message loss needs to be taken into account. In this paper, we investigate how to build a shared memory abstraction for DCSs as a first step towards implementing different shared storage systems in a DCS context. We first prove that, in the presence of host crashes and message losses, the necessary guarantees of such an abstraction are impossible to implement using a traditional approach that has no access to the internals of existing DCS services, e.g., a modular approach where algorithms are built on top of existing software blocks like failure detectors. We then propose a white-box approach that uses the messages of existing services in any DCS as the sole means of communication. More precisely, we present TapeWorm, an algorithm that attaches itself to the heartbeat messages of the failure detector component in DCSs. We prove that TapeWorm implements the desired shared memory guarantees for applications running on a DCS. We also analyze the performance of TapeWorm and showcase ways of adapting it to various application needs and workloads.
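
    TapeWorm's central trick, using the failure detector's existing heartbeats as the only communication channel, can be caricatured in a few lines: piggyback a (timestamp, value) pair on every outgoing heartbeat and let receivers keep the freshest pair, which yields a crude last-writer-wins register. The message shape and names below are assumptions for illustration; the real algorithm and its guarantees are substantially stronger.

    class HeartbeatRegister:
        """Loose sketch: a last-writer-wins register piggybacked on heartbeats.

        The message shape {"hb_seq", "ts", "value"} is assumed for
        illustration and is not TapeWorm's actual protocol.
        """

        def __init__(self, node_id):
            self.node_id = node_id
            self.ts = (0, node_id)  # (counter, node id) breaks ties
            self.value = None

        def write(self, value):
            counter, _ = self.ts
            self.ts = (counter + 1, self.node_id)
            self.value = value

        def make_heartbeat(self, hb_seq):
            # The failure detector was going to send this message anyway;
            # we merely attach our current (ts, value) pair to it.
            return {"hb_seq": hb_seq, "ts": self.ts, "value": self.value}

        def on_heartbeat(self, msg):
            # Adopt the freshest value seen on any incoming heartbeat.
            if msg["ts"] > self.ts:
                self.ts, self.value = msg["ts"], msg["value"]

        def read(self):
            return self.value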

    Toward Reliable and Efficient Message Passing Software for HPC Systems: Fault Tolerance and Vector Extension

    As the scale of High-Performance Computing (HPC) systems continues to grow, researchers devote themselves to achieving the best performance when running long computing jobs on these systems. My research focuses on the reliability and efficiency of HPC software. First, as systems become larger, the mean time to failure (MTTF) of these HPC systems is negatively impacted and tends to decrease, so handling system failures becomes a prime challenge. My research presents a general design and implementation of an efficient runtime-level failure detection and propagation strategy, targeting large-scale, dynamic systems, that is able to detect both node and process failures. It uses multiple overlapping topologies to optimize detection and propagation, minimizing the incurred overhead and guaranteeing the scalability of the entire framework. Results from different machines and benchmarks, compared to related work, show that my design and implementation outperform non-HPC solutions significantly and are competitive with specialized HPC solutions that can manage only MPI applications. Second, I explore instruction-level parallelism to achieve optimal performance. Novel processors support long vector extensions, which enable researchers to exploit the potential peak performance of target architectures. Intel introduced Advanced Vector Extension (AVX-512 and AVX2) instructions for the x86 Instruction Set Architecture (ISA), and Arm introduced the Scalable Vector Extension (SVE) with a new set of A64 instructions; both enable greater parallelism. My research utilizes long vector reduction instructions to improve the performance of MPI reduction operations, and uses gather and scatter features to speed up the packing and unpacking operations in MPI. The evaluation of the resulting software stack under different scenarios demonstrates that the approach is not only efficient but also generalizable to many vector architectures.
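
    The first part's use of multiple overlapping topologies can be sketched simply: place the n processes on a ring and have each one observe its successors at several strides, so that every process has several distinct observers and a detected failure can be propagated along the same links in a logarithmic number of hops. The sketch below only computes the observation sets; the strides and function names are assumptions, not the dissertation's design.

    def observation_targets(rank, n, strides=(1, 2, 4, 8)):
        """Ranks that `rank` observes across overlapping ring topologies.

        strides is an assumed choice; any set giving each rank several
        distinct observers illustrates the idea.
        """
        return sorted({(rank + s) % n for s in strides if s < n})

    def observers_of(rank, n, strides=(1, 2, 4, 8)):
        """Ranks that observe `rank` (the reverse edges)."""
        return sorted({(rank - s) % n for s in strides if s < n})

    n = 16
    print(observation_targets(0, n))  # [1, 2, 4, 8]
    print(observers_of(0, n))         # [8, 12, 14, 15]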