3 research outputs found

    Distributed Multi-writer Multi-reader Atomic Register with Optimistically Fast Read and Write

    Full text link
    A distributed multi-writer multi-reader (MWMR) atomic register is an important primitive that enables a wide range of distributed algorithms. Hence, improving its performance can have large-scale consequences. Since the seminal work of ABD emulation in the message-passing networks [JACM '95], many researchers study fast implementations of atomic registers under various conditions. "Fast" means that a read or a write can be completed with 1 round-trip time (RTT), by contacting a simple majority. In this work, we explore an atomic register with optimal resilience and "optimistically fast" read and write operations. That is, both operations can be fast if there is no concurrent write. This paper has three contributions: (i) We present Gus, the emulation of an MWMR atomic register with optimal resilience and optimistically fast reads and writes when there are up to 5 nodes; (ii) We show that when there are > 5 nodes, it is impossible to emulate an MWMR atomic register with both properties; and (iii) We implement Gus in the framework of EPaxos and Gryff, and show that Gus provides lower tail latency than state-of-the-art systems such as EPaxos, Gryff, Giza, and Tempo under various workloads in the context of geo-replicated object storage systems

    ENHANCING CLOUD SYSTEM RUNTIME TO ADDRESS COMPLEX FAILURES

    Get PDF
    As the reliance on cloud systems intensifies in our progressively digital world, understanding and reinforcing their reliability becomes more crucial than ever. Despite impressive advancements in augmenting the resilience of cloud systems, the growing incidence of complex failures now poses a substantial challenge to the availability of these systems. With cloud systems continuing to scale and increase in complexity, failures not only become more elusive to detect but can also lead to more catastrophic consequences. Such failures question the foundational premises of conventional fault-tolerance designs, necessitating the creation of novel system designs to counteract them. This dissertation aims to enhance distributed systems’ capabilities to detect, localize, and react to complex failures at runtime. To this end, this dissertation makes contributions to address three emerging categories of failures in cloud systems. The first part delves into the investigation of partial failures, introducing OmegaGen, a tool adept at generating tailored checkers for detecting and localizing such failures. The second part grapples with silent semantic failures prevalent in cloud systems, showcasing our study findings, and introducing Oathkeeper, a tool that leverages past failures to infer rules and expose these silent issues. The third part explores solutions to slow failures via RESIN, a framework specifically designed to detect, diagnose, and mitigate memory leaks in cloud-scale infrastructures, developed in collaboration with Microsoft Azure. The dissertation concludes by offering insights into future directions for the construction of reliable cloud systems

    Second Generation General System Theory: Perspectives in Philosophy and Approaches in Complex Systems

    Get PDF
    Following the classical work of Norbert Wiener, Ross Ashby, Ludwig von Bertalanffy and many others, the concept of System has been elaborated in different disciplinary fields, allowing interdisciplinary approaches in areas such as Physics, Biology, Chemistry, Cognitive Science, Economics, Engineering, Social Sciences, Mathematics, Medicine, Artificial Intelligence, and Philosophy. The new challenge of Complexity and Emergence has made the concept of System even more relevant to the study of problems with high contextuality. This Special Issue focuses on the nature of new problems arising from the study and modelling of complexity, their eventual common aspects, properties and approaches—already partially considered by different disciplines—as well as focusing on new, possibly unitary, theoretical frameworks. This Special Issue aims to introduce fresh impetus into systems research when the possible detection and correction of mistakes require the development of new knowledge. This book contains contributions presenting new approaches and results, problems and proposals. The context is an interdisciplinary framework dealing, in order, with electronic engineering problems; the problem of the observer; transdisciplinarity; problems of organised complexity; theoretical incompleteness; design of digital systems in a user-centred way; reaction networks as a framework for systems modelling; emergence of a stable system in reaction networks; emergence at the fundamental systems level; behavioural realization of memoryless functions
    corecore