    Formal verification of distributed deadlock detection algorithms

    The problem of distributed deadlock detection has undergone extensive study. Formal verification of deadlock detection algorithms in distributed systems, however, has largely been ignored. Instead, most proposed distributed deadlock detection algorithms have relied on informal or intuitive arguments or on simulation, or have neglected the verification of correctness altogether; as a consequence, many of these algorithms have later been shown to be incorrect. This research abstracts the notion of deadlock in terms of a temporal logic of actions and discusses the associated invariant and eventuality properties. The contributions of this research are the development of a distributed deadlock detection algorithm and the formal verification of that algorithm.
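
    For context (this is not taken from the dissertation itself), the correctness of a deadlock detector is typically stated as one invariant (safety) property and one eventuality (liveness) property in temporal logic; a hedged sketch of their usual shape, with "deadlocked" and "reported" read as predicates on the global state:

        % Illustrative only: the standard shape of the two correctness properties.
        \begin{align*}
          \text{Safety (no false deadlocks):}   &\quad \Box\,(\mathit{reported} \Rightarrow \mathit{deadlocked}) \\
          \text{Liveness (eventual detection):} &\quad \Box\,(\mathit{deadlocked} \Rightarrow \Diamond\,\mathit{reported})
        \end{align*}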

    Causal synchrony in the design of distributed programs

    The outcome of any computation is determined by the order of the events in the computation and the state of the component variables of the computation at those events. The level of knowledge that can be obtained about event order and process state influences protocol design and operation. In a centralized system, the presence of a physical clock makes it easy to determine event order. It is a more difficult task in a distributed system because there is normally no global time and hence no common time reference for ordering events. As a consequence, distributed protocols are often designed without explicit reference to event order. Instead they are based on some approximation of global state. Because global state is also difficult to identify in a distributed system, the resulting protocols are not as efficient or clear as they could be. We subscribe to Lamport's proposition that the relevant temporal ordering of any two events is determined by their causal relationship and that knowledge of the causal order can be a powerful tool in protocol design. Mattern's vector time can be used to identify the causal order, thereby providing the common frame of reference needed to order events in a distributed computation. In this dissertation we present a consistent methodology for the analysis and design of distributed protocols that is based on the causal order and vector time. Using it we can specify the conditions which must be met for a protocol to be correct, we can define axiomatic protocol specifications, and we can structure reasoning about the correctness of the specified protocol. Employing causality as a unifying concept clarifies protocol specifications and correctness arguments because it enables them to be defined purely in terms of local states and local events. We have successfully applied this methodology to the problems of distributed termination detection, distributed deadlock detection and resolution, and optimistic recovery. In all cases, the causally synchronous protocols we have presented are efficient and demonstrably correct.
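
    As an illustration of the vector time mentioned above (a generic sketch, not the dissertation's protocols), the following Python fragment shows how vector clocks recover the causal (happened-before) order between events:

        # A minimal vector-clock sketch (illustrative, not the dissertation's protocol).
        # Each process keeps one counter per process; comparing timestamps recovers
        # the causal (happened-before) order between events.

        class VectorClock:
            def __init__(self, n, pid):
                self.v = [0] * n      # one component per process
                self.pid = pid        # index of the owning process

            def send(self):
                """Tick the local component and return the timestamp for a message."""
                self.v[self.pid] += 1
                return list(self.v)

            def receive(self, msg_ts):
                """Merge the sender's timestamp into the local clock."""
                self.v = [max(a, b) for a, b in zip(self.v, msg_ts)]
                self.v[self.pid] += 1
                return list(self.v)

        def happened_before(ts_a, ts_b):
            """True iff the event stamped ts_a causally precedes the event stamped ts_b."""
            return all(a <= b for a, b in zip(ts_a, ts_b)) and ts_a != ts_b

        # Example: p0 sends a message to p1; the send causally precedes the receive.
        p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
        ts_send = p0.send()
        ts_recv = p1.receive(ts_send)
        assert happened_before(ts_send, ts_recv)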

    Design of deadlock detection and prevention algorithms in distributed systems

    A distributed system consists of a collection of processes which communicate with each other by exchanging messages to achieve a common goal. One of the key problems in distributed systems is the possibility of deadlock. Processes are said to be deadlocked when some processes are blocked on resource requests that can never be satisfied unless drastic system action is taken. Two distributed deadlock detection algorithms handling multiple outstanding requests are proposed and proven to be correct: they detect all cycles and do not report false deadlocks. The algorithms are based on the concept of chasing the edges of the wait-for graph (probe-based). Simulation results show that the proposed algorithms perform very well compared to some existing algorithms. A deadlock prevention algorithm based on the notion of coloring the nodes of the wait-for graph is also proposed; it requires considerably less rollback than some existing algorithms.
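
    For illustration, here is a minimal sketch of the general probe-based (edge-chasing) technique the abstract refers to, not the proposed algorithms themselves: the initiator's probe is forwarded along wait-for edges, and a probe that returns to the initiator indicates a cycle, i.e. a deadlock:

        # Illustrative sketch of probe-based (edge-chasing) deadlock detection over a
        # wait-for graph -- the general technique, not the dissertation's algorithms.

        def probe_detects_deadlock(wait_for, initiator):
            """wait_for maps each process to the set of processes it is waiting on."""
            visited = set()
            frontier = [initiator]                            # probes currently "in flight"
            while frontier:
                holder = frontier.pop()
                for blocked_on in wait_for.get(holder, ()):   # forward the probe
                    if blocked_on == initiator:
                        return True                           # probe returned: cycle found
                    if blocked_on not in visited:
                        visited.add(blocked_on)
                        frontier.append(blocked_on)
            return False

        # P1 waits on P2, P2 on P3, P3 on P1: a deadlock cycle through P1.
        wfg = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
        assert probe_detects_deadlock(wfg, "P1")
        assert not probe_detects_deadlock({"P1": {"P2"}, "P2": set()}, "P1")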

    UPC-CHECK: A scalable tool for detecting run-time errors in Unified Parallel C

    Unified Parallel C (UPC) is a language used to write parallel programs for shared and distributed memory parallel computers. UPC-CHECK is a scalable tool developed to automatically detect argument errors in UPC functions and deadlocks in UPC programs at run-time and to issue high-quality error messages that help programmers quickly fix those errors. The tool is easy to use and involves merely replacing the compiler command with upc-check. The tool uses a novel distributed algorithm for detecting argument and deadlock errors in collective operations. The run-time complexity of this algorithm has been proven to be O(1). The algorithm has been extended to detect deadlocks involving locks with a run-time complexity of O(T), where T is the number of threads waiting to acquire a lock. Error messages issued by UPC-CHECK were evaluated using the UPC RTED test suite for argument errors in UPC functions and deadlocks. Results of these tests show that the error messages issued by UPC-CHECK are excellent. The scalability of all the algorithms used was demonstrated using performance-evaluation test programs and the UPC NAS Parallel Benchmarks.
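
    As a hypothetical illustration of the kind of run-time check such a tool can perform at a collective operation (this is not UPC-CHECK's actual implementation), each thread records which collective it entered and with what single-valued arguments, and any mismatch signals a deadlock or argument error:

        # Hypothetical illustration only -- not UPC-CHECK's implementation.  Each
        # thread records the collective it is entering and its single-valued
        # arguments; any mismatch means the program will deadlock or misbehave.

        def check_collective(records):
            """records: one (collective_name, single_valued_args) tuple per thread,
            captured as each thread enters a collective call."""
            names = {name for name, _ in records}
            if len(names) > 1:
                return f"deadlock: threads entered different collectives {sorted(names)}"
            args = {a for _, a in records}
            if len(args) > 1:
                return f"argument error: single-valued arguments differ {sorted(args)}"
            return "ok"

        # Thread 2 called upc_all_broadcast while the other threads called upc_barrier.
        print(check_collective([("upc_barrier", ()), ("upc_barrier", ()),
                                ("upc_all_broadcast", (1024,))]))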

    Self-stabilizing deadlock algorithms in distributed systems

    A self-stabilizing system is a network of processors which, when started from an arbitrary (and possibly illegal) initial state, always returns to a legal state in a finite number of steps. Self-stabilization is an evolving paradigm in fault-tolerant computing. This research is the first to apply self-stabilization to deadlock detection and prevention. Traditional deadlock detection algorithms have a process initiate a probe. If that probe travels around the system and is received by the initiator, there is a cycle in the system, and deadlock is detected. To prevent deadlocks, algorithms usually rank nodes in order to determine whether an added edge will create a deadlock in the system. In a self-stabilizing system, perturbations are dealt with automatically. For the deadlock model, the perturbations in the system are requests and releases of resources. The self-stabilizing deadlock detection algorithm therefore automatically detects a deadlock when a request causes a cycle in the wait-for graph. The self-stabilizing prevention algorithm prevents deadlocks in a similar manner. The self-stabilizing algorithms do not have to be initiated by any process because the requests and releases create a perturbation which is dealt with automatically.
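
    A minimal sketch of the self-stabilizing view described above (illustrative only, not the algorithms of this research): requests and releases perturb the wait-for graph, and re-evaluating a local cycle check after every perturbation reports any deadlock that results, without any process initiating a probe:

        # Illustrative sketch: the wait-for graph is perturbed by requests and
        # releases, and a cycle check re-evaluated after every perturbation reports
        # any resulting deadlock -- no process has to initiate a probe.

        def has_cycle(wait_for):
            """Depth-first search for a cycle in the wait-for graph."""
            WHITE, GREY, BLACK = 0, 1, 2
            color = {p: WHITE for p in wait_for}

            def visit(p):
                color[p] = GREY
                for q in wait_for.get(p, ()):
                    if color.get(q, WHITE) == GREY:
                        return True                     # back edge: cycle
                    if color.get(q, WHITE) == WHITE and visit(q):
                        return True
                color[p] = BLACK
                return False

            return any(color[p] == WHITE and visit(p) for p in wait_for)

        wfg = {"P1": set(), "P2": set(), "P3": set()}

        def request(p, q):          # perturbation: p now waits on q
            wfg[p].add(q)
            if has_cycle(wfg):
                print(f"deadlock detected after {p} requested a resource held by {q}")

        def release(p, q):          # perturbation: p no longer waits on q
            wfg[p].discard(q)

        request("P1", "P2"); request("P2", "P3"); request("P3", "P1")   # closes a cycle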

    Global state predicates in rough real-time

    Distributed systems are characterized by the fact that the constituent processes have neither common memory nor a common system clock. These processes communicate solely via message passing. While providing a number of benefits such as increased reliability, increased computational power, and geographic dispersion, this architecture significantly complicates many of the tasks of software development and verification, including evaluation of the program state. In a distributed system, the program state comprises the local states of the constituent processes, as well as the state of the channels between processes, and is called the global state. With no common system clock, many distributed system protocols rely on the ordering of local process events imposed by the message passing that occurs between processes. This leads to a partial global ordering of local process events, which can then be used to determine which process states could (or could not) have occurred simultaneously. Traditional predicate evaluation protocols evaluate predicates on the global state of a distributed computation using consistent global states. This evaluation is complicated by the fact that the event ordering imposed by message passing is only partial. A complete history of the global states that occurred during an execution cannot always be constructed. This introduces inefficiency into predicate detection protocols and prohibits detection of certain predicates. This dissertation explores the use of a rough global time base for global state predicate evaluation within distributed systems. By structuring the evaluation on the assumption that a global time base exists, we can develop simple and efficient protocols for both stable and unstable predicate evaluation. Further, we can evaluate certain predicates which are not easily evaluated using consistent global states. We demonstrate these advantages by developing protocols for the detection of distributed termination, distributed deadlock detection, and detection of certain unstable predicates as they occur. Because the global time base is rough, we can only detect unstable predicates which remain true for a sufficient duration. We additionally develop several formalizations which assist the protocol developer in dealing with the fact that the global time base is not perfect, and we demonstrate the application of these formalizations within the protocols that we develop.
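
    As an illustrative sketch (not the dissertation's protocols), the following shows one way a conjunctive global predicate can be evaluated against a rough global time base: each process reports the interval during which its local predicate held, and the conjunction is declared to have held only if the common interval exceeds the assumed clock-skew bound:

        # Illustrative sketch, not the dissertation's protocols.  Each process
        # reports the interval (on its own, roughly synchronized clock) during
        # which its local predicate held; the conjunction is declared to have held
        # only if the common interval exceeds the assumed clock-skew bound.

        EPSILON = 0.01   # assumed bound on the skew between any two clocks (seconds)

        def conjunctive_predicate_held(intervals):
            """intervals: one (start, end) pair per process."""
            latest_start = max(start for start, _ in intervals)
            earliest_end = min(end for _, end in intervals)
            # Require the overlap to exceed the clock uncertainty, so the predicates
            # are guaranteed to have been simultaneously true at some real instant.
            return earliest_end - latest_start > EPSILON

        print(conjunctive_predicate_held([(2.00, 2.50), (1.90, 2.40)]))    # True
        print(conjunctive_predicate_held([(2.00, 2.005), (2.004, 2.40)]))  # False: too brief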

    Self-stabilizing routing protocols

    In systems made up of processors and links connecting the processors, the global state of the system is defined by the local variables of the individual processors. Each global state can be classified as either legal or illegal. A self-stabilizing system is one that forces the system from an illegal state to a legal global state without external interference, using a finite number of steps. This thesis concentrates on the application of self-stabilization to routing problems, in particular path identification, connectivity, and methods involved in destination routing. Traditional methods for creating rooted paths to multiple destinations in a computer network involve the construction of spanning trees and the broadcasting of information along the tree to be picked up by the individual nodes on the tree. The information for the creation of the tree is sourced entirely at the root, and the individual nodes update their information from this centralized source. The self-stabilization model for networks allows tree construction and message checking to occur automatically, locally, and, more importantly, in contrast to traditional networks, asynchronously. Tree and path construction and message passing take place between a node and its immediate neighbors, and the tree or path is built from this communicated data. In addition, the self-stabilization model eliminates the initialization required by traditional networks: given any arbitrary initial state, the system (a given network) is guaranteed to stabilize to a legal global state; in the case of a broadcast network, this is a minimal spanning tree rooted at a source.
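
    For illustration, here is a minimal sketch of one classic self-stabilizing spanning-tree rule in the spirit described above (not this thesis's specific protocols): the root pins its distance at zero, every other node repeatedly adopts the neighbor with the smallest distance as its parent, and from any arbitrary initial state the network converges to a tree of shortest paths rooted at the source:

        # Illustrative sketch of a classic self-stabilizing spanning-tree rule
        # (not this thesis's specific protocols).  The root pins its distance at 0;
        # every other node repeatedly adopts the neighbor with the smallest distance
        # as its parent, so the network converges from ANY initial dist/parent values.

        graph = {"r": ["a", "b"], "a": ["r", "b", "c"], "b": ["r", "a"], "c": ["a"]}
        ROOT = "r"

        # Arbitrary (corrupted) initial local state -- no initialization is assumed.
        dist = {"r": 7, "a": 0, "b": 3, "c": 9}
        parent = {"r": "c", "a": "c", "b": "c", "c": "r"}

        def stabilize_round():
            """Every node applies the local rule to its neighbors' current values;
            returns True if any node changed its state."""
            snapshot = dict(dist)
            changed = False
            for v in graph:
                if v == ROOT:
                    new_dist, new_parent = 0, None
                else:
                    best = min(graph[v], key=lambda u: snapshot[u])
                    new_dist, new_parent = snapshot[best] + 1, best
                if (new_dist, new_parent) != (dist[v], parent[v]):
                    dist[v], parent[v] = new_dist, new_parent
                    changed = True
            return changed

        while stabilize_round():        # repeat until a fixed point is reached
            pass
        print(parent, dist)             # a shortest-path tree rooted at "r"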