2,348 research outputs found
Tree-based Algorithm to Find the k-th Value in Distributed Systems
In this paper, we study distributed algorithms for finding the k-th value in the decentralized systems. First we consider the case of circular configuration of processors where no processor knows the total number of participants. Later a network of arbitrary configuration is examined and a tree-based algorithm is proposed. The proposed algorithm requires O(N) messages and O(log N) rounds of message passing, where N is the number of nodes in the network
How to Elect a Leader Faster than a Tournament
The problem of electing a leader from among contenders is one of the
fundamental questions in distributed computing. In its simplest formulation,
the task is as follows: given processors, all participants must eventually
return a win or lose indication, such that a single contender may win. Despite
a considerable amount of work on leader election, the following question is
still open: can we elect a leader in an asynchronous fault-prone system faster
than just running a -time tournament, against a strong adaptive
adversary?
In this paper, we answer this question in the affirmative, improving on a
decades-old upper bound. We introduce two new algorithmic ideas to reduce the
time complexity of electing a leader to , using
point-to-point messages. A non-trivial application of our algorithm is a new
upper bound for the tight renaming problem, assigning items to the
participants in expected time and messages. We
complement our results with lower bound of messages for solving
these two problems, closing the question of their message complexity
Fast and Compact Distributed Verification and Self-Stabilization of a DFS Tree
We present algorithms for distributed verification and silent-stabilization
of a DFS(Depth First Search) spanning tree of a connected network. Computing
and maintaining such a DFS tree is an important task, e.g., for constructing
efficient routing schemes. Our algorithm improves upon previous work in various
ways. Comparable previous work has space and time complexities of bits per node and respectively, where is the highest
degree of a node, is the number of nodes and is the diameter of the
network. In contrast, our algorithm has a space complexity of bits
per node, which is optimal for silent-stabilizing spanning trees and runs in
time. In addition, our solution is modular since it utilizes the
distributed verification algorithm as an independent subtask of the overall
solution. It is possible to use the verification algorithm as a stand alone
task or as a subtask in another algorithm. To demonstrate the simplicity of
constructing efficient DFS algorithms using the modular approach, We also
present a (non-sielnt) self-stabilizing DFS token circulation algorithm for
general networks based on our silent-stabilizing DFS tree. The complexities of
this token circulation algorithm are comparable to the known ones
Using Flow Specifications of Parameterized Cache Coherence Protocols for Verifying Deadlock Freedom
We consider the problem of verifying deadlock freedom for symmetric cache
coherence protocols. In particular, we focus on a specific form of deadlock
which is useful for the cache coherence protocol domain and consistent with the
internal definition of deadlock in the Murphi model checker: we refer to this
deadlock as a system- wide deadlock (s-deadlock). In s-deadlock, the entire
system gets blocked and is unable to make any transition. Cache coherence
protocols consist of N symmetric cache agents, where N is an unbounded
parameter; thus the verification of s-deadlock freedom is naturally a
parameterized verification problem. Parametrized verification techniques work
by using sound abstractions to reduce the unbounded model to a bounded model.
Efficient abstractions which work well for industrial scale protocols typically
bound the model by replacing the state of most of the agents by an abstract
environment, while keeping just one or two agents as is. However, leveraging
such efficient abstractions becomes a challenge for s-deadlock: a violation of
s-deadlock is a state in which the transitions of all of the unbounded number
of agents cannot occur and so a simple abstraction like the one above will not
preserve this violation. In this work we address this challenge by presenting a
technique which leverages high-level information about the protocols, in the
form of message sequence dia- grams referred to as flows, for constructing
invariants that are collectively stronger than s-deadlock. Efficient
abstractions can be constructed to verify these invariants. We successfully
verify the German and Flash protocols using our technique
Software Performance Engineering using Virtual Time Program Execution
In this thesis we introduce a novel approach to software performance engineering that is based
on the execution of code in virtual time. Virtual time execution models the timing-behaviour
of unmodified applications by scaling observed method times or replacing them with results
acquired from performance model simulation. This facilitates the investigation of "what-if" performance predictions of applications comprising an arbitrary combination of real code and
performance models. The ability to analyse code and models in a single framework enables
performance testing throughout the software lifecycle, without the need to to extract performance
models from code. This is accomplished by forcing thread scheduling decisions to take
into account the hypothetical time-scaling or model-based performance specifications of each
method. The virtual time execution of I/O operations or multicore targets is also investigated.
We explore these ideas using a Virtual EXecution (VEX) framework, which provides performance
predictions for multi-threaded applications. The language-independent VEX core is
driven by an instrumentation layer that notifies it of thread state changes and method profiling events; it is then up to VEX to control the progress of application threads in virtual time on top of the operating system scheduler. We also describe a Java Instrumentation Environment
(JINE), demonstrating the challenges involved in virtual time execution at the JVM level.
We evaluate the VEX/JINE tools by executing client-side Java benchmarks in virtual time
and identifying the causes of deviations from observed real times. Our results show that VEX
and JINE transparently provide predictions for the response time of unmodified applications
with typically good accuracy (within 5-10%) and low simulation overheads (25-50% additional
time). We conclude this thesis with a case study that shows how models and code can be
integrated, thus illustrating our vision on how virtual time execution can support performance
testing throughout the software lifecycle
The Problem of Mutual Exclusion: A New Distributed Solution
In both centralized and distributed systems, processes cooperate and compete with each other to access the system resources. Some of these resources must be used exclusively. It is then required that only one process access the shared resource at a given time. This is referred to as the problem of mutual exclusion. Several synchronization mechanisms have been proposed to solve this problem. In this thesis, an effort has been made to compile most of the existing mutual exclusion solutions for both shared memory and message-passing based systems. A new distributed algorithm, which uses a dynamic information structure, is presented to solve the problem of mutual exclusion. It is proved to be free from both deadlock and starvation. This solution is shown to be economical in terms of the number of message exchanges required per critical section execution. Procedures for recovery from both site and link failures are also given
Bounds for self-stabilization in unidirectional networks
A distributed algorithm is self-stabilizing if after faults and attacks hit
the system and place it in some arbitrary global state, the systems recovers
from this catastrophic situation without external intervention in finite time.
Unidirectional networks preclude many common techniques in self-stabilization
from being used, such as preserving local predicates. In this paper, we
investigate the intrinsic complexity of achieving self-stabilization in
unidirectional networks, and focus on the classical vertex coloring problem.
When deterministic solutions are considered, we prove a lower bound of
states per process (where is the network size) and a recovery time of at
least actions in total. We present a deterministic algorithm with
matching upper bounds that performs in arbitrary graphs. When probabilistic
solutions are considered, we observe that at least states per
process and a recovery time of actions in total are required (where
denotes the maximal degree of the underlying simple undirected graph).
We present a probabilistically self-stabilizing algorithm that uses
states per process, where is a parameter of the
algorithm. When , the algorithm recovers in expected
actions. When may grow arbitrarily, the algorithm
recovers in expected O(n) actions in total. Thus, our algorithm can be made
optimal with respect to space or time complexity
An occam Style Communications System for UNIX Networks
This document describes the design of a communications system which provides occam style communications primitives under a Unix environment, using TCP/IP protocols, and any number of other protocols deemed suitable as underlying transport layers. The system will integrate with a low overhead scheduler/kernel without incurring significant costs to the execution of processes within the run time environment. A survey of relevant occam and occam3 features and related research is followed by a look at the Unix and TCP/IP facilities which determine our working constraints, and a description of the T9000 transputer's Virtual Channel Processor, which was instrumental in our formulation. Drawing from the information presented here, a design for the communications system is subsequently proposed. Finally, a preliminary investigation of methods for lightweight access control to shared resources in an environment which does not provide support for critical sections, semaphores, or busy waiting, is made. This is presented with relevance to mutual exclusion problems which arise within the proposed design. Future directions for the evolution of this project are discussed in conclusion
Macroservers: An Execution Model for DRAM Processor-In-Memory Arrays
The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Newer developments provide powerful parallel processing capabilities on the chip, exploiting the facility to load wide words in single memory accesses and supporting complex address manipulations in the memory. Furthermore, large arrays of PIMs can be arranged into a massively parallel architecture. In this report, we describe an object-based programming model based on the notion of a macroserver. Macroservers encapsulate a set of variables and methods; threads, spawned by the activation of methods, operate asynchronously on the variables' state space. Data distributions provide a mechanism for mapping large data structures across the memory region of a macroserver, while work distributions allow explicit control of bindings between threads and data. Both data and work distributuions are first-class objects of the model, supporting the dynamic management of data and threads in memory. This offers the flexibility required for fully exploiting the processing power and memory bandwidth of a PIM array, in particular for irregular and adaptive applications. Thread synchronization is based on atomic methods, condition variables, and futures. A special type of lightweight macroserver allows the formulation of flexible scheduling strategies for the access to resources, using a monitor-like mechanism
- …