Search CORE

13,226 research outputs found

Maintaining consistency in distributed systems

Author: Birman Kenneth P.
Publication venue
Publication date: 01/01/1991
Field of study

In systems designed as assemblies of independently developed components, concurrent access to data or data structures normally arises within individual programs, and is controlled using mutual exclusion constructs, such as semaphores and monitors. Where data is persistent and/or sets of operation are related to one another, transactions or linearizability may be more appropriate. Systems that incorporate cooperative styles of distributed execution often replicate or distribute data within groups of components. In these cases, group oriented consistency properties must be maintained, and tools based on the virtual synchrony execution model greatly simplify the task confronting an application developer. All three styles of distributed computing are likely to be seen in future systems - often, within the same application. This leads us to propose an integrated approach that permits applications that use virtual synchrony with concurrent objects that respect a linearizability constraint, and vice versa. Transactional subsystems are treated as a special case of linearizability

CiteSeerX

NASA Technical Reports Server

eCommons@Cornell

A Theory of Partitioned Global Address Spaces

Author: Calin Georgel
Derevenetc Egor
Majumdar Rupak
Meyer Roland
Publication venue
Publication date: 01/01/2013
Field of study

Partitioned global address space (PGAS) is a parallel programming model for the development of applications on clusters. It provides a global address space partitioned among the cluster nodes, and is supported in programming languages like C, C++, and Fortran by means of APIs. In this paper we provide a formal model for the semantics of single instruction, multiple data programs using PGAS APIs. Our model reflects the main features of popular real-world APIs such as SHMEM, ARMCI, GASNet, GPI, and GASPI. A key feature of PGAS is the support for one-sided communication: a node may directly read and write the memory located at a remote node, without explicit synchronization with the processes running on the remote side. One-sided communication increases performance by decoupling process synchronization from data transfer, but requires the programmer to reason about appropriate synchronizations between reads and writes. As a second contribution, we propose and investigate robustness, a criterion for correct synchronization of PGAS programs. Robustness corresponds to acyclicity of a suitable happens-before relation defined on PGAS computations. The requirement is finer than the classical data race freedom and rules out most false error reports. Our main result is an algorithm for checking robustness of PGAS programs. The algorithm makes use of two insights. Using combinatorial arguments we first show that, if a PGAS program is not robust, then there are computations in a certain normal form that violate happens-before acyclicity. Intuitively, normal-form computations delay remote accesses in an ordered way. We then devise an algorithm that checks for cyclic normal-form computations. Essentially, the algorithm is an emptiness check for a novel automaton model that accepts normal-form computations in streaming fashion. Altogether, we prove the robustness problem is PSpace-complete

arXiv.org e-Print Archive

CISPA – Helmholtz-Zentrum für Informationssicherheit

Fraunhofer-ePrints

Dagstuhl Research Online Publication Server

MPG.PuRe

Monitoring Partially Synchronous Distributed Systems using SMT Solvers

Author: A Bauer
B Charron-Bost
CM Chase
D Basin
DL Mills
K Marzullo
L Moura de
R Schwarz
S Yingchareonthawornchai
SD Stoller
SS Kulkarni
VK Garg
W Zhu
Y Falcone
Publication venue
Publication date: 24/07/2017
Field of study

In this paper, we discuss the feasibility of monitoring partially synchronous distributed systems to detect latent bugs, i.e., errors caused by concurrency and race conditions among concurrent processes. We present a monitoring framework where we model both system constraints and latent bugs as Satisfiability Modulo Theories (SMT) formulas, and we detect the presence of latent bugs using an SMT solver. We demonstrate the feasibility of our framework using both synthetic applications where latent bugs occur at any time with random probability and an application involving exclusive access to a shared resource with a subtle timing bug. We illustrate how the time required for verification is affected by parameters such as communication frequency, latency, and clock skew. Our results show that our framework can be used for real-life applications, and because our framework uses SMT solvers, the range of appropriate applications will increase as these solvers become more efficient over time.Comment: Technical Report corresponding to the paper accepted at Runtime Verification (RV) 201

arXiv.org e-Print Archive

Crossref

Recommended from our members

Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures

Author: Schmidt Douglas C.
Suda Tatsuya
Publication venue: eScholarship, University of California
Publication date: 01/01/1992
Field of study

The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Several communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context switching overhead, process architecture, and message buffering). Due to a several orders of magnitude increase in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors.This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks.A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems including System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu

eScholarship - University of California