
    Generic design of Chinese remaindering schemes

    We propose a generic design for Chinese remainder algorithms. A Chinese remainder computation consists in reconstructing an integer value from its residues modulo non-coprime integers. We also propose an efficient linear data structure, a radix ladder, for the intermediate storage and computations. Our design is structured into three main modules: a black-box residue computation in charge of computing each residue; a Chinese remaindering controller in charge of launching the computation and of the termination decision; and an integer builder in charge of the reconstruction computation. We then show that this design enables many different forms of Chinese remaindering (e.g. deterministic, early terminated, distributed, etc.), easy comparisons between these forms, and, e.g., user-transparent parallelism at different parallel grains.
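
    As a rough illustration of the reconstruction step such a design builds on, here is a minimal sketch of incremental Chinese remaindering in Python; it assumes pairwise coprime moduli and plain pairwise combination, not the paper's radix ladder or modular design, and all names are illustrative.

```python
def crt_pair(r1, m1, r2, m2):
    """Combine x = r1 (mod m1) and x = r2 (mod m2) into x = r (mod m1*m2),
    assuming m1 and m2 are coprime (pow(m1, -1, m2) needs invertibility)."""
    inv = pow(m1, -1, m2)             # modular inverse of m1 modulo m2
    t = ((r2 - r1) * inv) % m2        # solve r1 + m1*t = r2 (mod m2)
    return r1 + m1 * t, m1 * m2

def crt(residues, moduli):
    """Incrementally rebuild x mod prod(moduli) from its residues."""
    r, m = residues[0], moduli[0]
    for ri, mi in zip(residues[1:], moduli[1:]):
        r, m = crt_pair(r, m, ri, mi)
    return r, m

# Example: reconstruct 103 from its residues modulo 5, 7 and 11.
print(crt([103 % 5, 103 % 7, 103 % 11], [5, 7, 11]))   # -> (103, 385)
```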

    Parallel Deferred Update Replication

    Deferred update replication (DUR) is an established approach to implementing highly efficient and available storage. While the throughput of read-only transactions scales linearly with the number of deployed replicas in DUR, the throughput of update transactions experiences limited improvements as replicas are added. This paper presents Parallel Deferred Update Replication (P-DUR), a variation of classical DUR that scales both read-only and update transactions with the number of cores available in a replica. In addition to introducing the new approach, we describe its full implementation and compare its performance to classical DUR and to Berkeley DB, a well-known standalone database.
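
    For readers unfamiliar with deferred update replication, the following is a hedged sketch of the certification step a replica applies when an update transaction is delivered; read-only transactions run locally, which is why they scale. It is illustrative only and does not reflect P-DUR's per-core partitioning or the actual implementation.

```python
class Replica:
    """Minimal sketch of certification in deferred update replication:
    update transactions are delivered in total order and committed only
    if certification succeeds (illustrative names and structure)."""

    def __init__(self):
        self.store = {}        # key -> (value, version)
        self.clock = 0         # id of the last committed update transaction

    def read(self, key):
        # Read-only accesses are served locally, without certification.
        return self.store.get(key, (None, 0))

    def certify_and_apply(self, readset, writeset, snapshot):
        # Abort if any key read by the transaction was overwritten by a
        # transaction that committed after the reader's snapshot.
        for key in readset:
            if self.store.get(key, (None, 0))[1] > snapshot:
                return False
        self.clock += 1
        for key, value in writeset.items():
            self.store[key] = (value, self.clock)
        return True

r = Replica()
r.certify_and_apply([], {"x": 1}, snapshot=0)             # commits, writes x
print(r.certify_and_apply(["x"], {"y": 2}, snapshot=0))   # False: stale read of x
```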

    Using consistent subcuts for detecting stable properties

    We present a general protocol for detecting whether a property holds in a distributed system, where the property is a member of a subclass of stable properties we call the locally stable properties. Our protocol is based on a decentralized method for constructing a maximal subset of the local states that are mutually consistent, which in turn is based on a weakened version of vectored time stamps. The structure of our protocol lends itself to refinement, and we demonstrate its utility by deriving some specialized property-detection protocols, including two protocols that were previously known to be effective.
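
    A minimal sketch of the mutual-consistency test that vector timestamps make possible (the paper relies on a weakened version of vectored time stamps; the function and example below are illustrative assumptions):

```python
def consistent(vt_a, i, vt_b, j):
    """Check whether local state a at process i and local state b at
    process j are mutually consistent: neither state reflects more of
    the other process's history than that process's own state does."""
    return vt_b[i] <= vt_a[i] and vt_a[j] <= vt_b[j]

# Example with three processes: the two local states are consistent.
a = [2, 0, 1]   # vector timestamp of a state at process 0
b = [1, 3, 1]   # vector timestamp of a state at process 1
print(consistent(a, 0, b, 1))   # True
```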

    A survey of parallel execution strategies for transitive closure and logic programs

    An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other on the optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of these methods.
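
    As a concrete reference point for the iterative methods such surveys review, below is a minimal semi-naive transitive closure sketch in Python; the per-round delta it maintains is also the unit that parallel and fragmented variants distribute across processors. This is an illustrative sketch, not a specific method from the survey.

```python
def transitive_closure(edges):
    """Semi-naive iterative evaluation: each round joins only the newly
    derived pairs (the delta) with the base relation."""
    closure = set(edges)
    delta = set(edges)
    succ = {}
    for x, y in edges:
        succ.setdefault(x, set()).add(y)
    while delta:
        new = {(x, z)
               for (x, y) in delta
               for z in succ.get(y, ())
               if (x, z) not in closure}
        closure |= new      # accumulate derived pairs
        delta = new         # next round only extends the new ones
    return closure

print(sorted(transitive_closure({(1, 2), (2, 3), (3, 4)})))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```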

    Parallelizing RRT on large-scale distributed-memory architectures

    This paper addresses the problem of parallelizing the Rapidly-exploring Random Tree (RRT) algorithm on large-scale distributed-memory architectures, using the Message Passing Interface. We compare three parallel versions of RRT based on classical parallelization schemes. We evaluate them on different motion planning problems and analyze the various factors influencing their performance.
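
    For context, a minimal sequential RRT loop might look as follows; this is a hedged sketch (parameter names, the 5% goal bias, and the free-space test are illustrative assumptions), and the paper's subject, how this loop is parallelized across distributed memory, is not shown here.

```python
import math
import random

def rrt(start, goal, is_free, bounds, step=0.5, iters=2000, goal_tol=0.5):
    """Minimal sequential RRT in 2D: repeatedly extend the tree's nearest
    node one bounded step toward a random sample."""
    nodes = [start]
    parent = {0: None}
    for _ in range(iters):
        # Sample a configuration, occasionally biased toward the goal.
        if random.random() < 0.05:
            q = goal
        else:
            q = (random.uniform(*bounds[0]), random.uniform(*bounds[1]))
        # Nearest neighbour in the tree, then one bounded step toward q.
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], q))
        nx, ny = nodes[i]
        d = math.dist((nx, ny), q)
        if d < step:
            new = q
        else:
            new = (nx + step * (q[0] - nx) / d, ny + step * (q[1] - ny) / d)
        if is_free(new):                      # collision check (black box)
            parent[len(nodes)] = i
            nodes.append(new)
            if math.dist(new, goal) < goal_tol:
                break                         # close enough to the goal
    return nodes, parent

# Obstacle-free toy example on the square [-1, 6] x [-1, 6].
nodes, parent = rrt((0.0, 0.0), (5.0, 5.0), lambda p: True, [(-1, 6), (-1, 6)])
```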

    A decomposition procedure based on approximate newton directions

    The efficient solution of large-scale linear and nonlinear optimization problems may require exploiting any special structure present in them. We describe and analyze some cases in which this special structure can be used with very little cost to obtain search directions from decomposed subproblems. We also study how to correct these directions using (decomposable) preconditioned conjugate gradient methods to ensure local convergence in all cases. The choice of appropriate preconditioners follows naturally from the structure of the problem. Finally, we conduct computational experiments to compare the resulting procedures with direct methods, as well as to study the impact of different preconditioner choices.
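
    As background, a plain preconditioned conjugate gradient iteration, the building block that such corrected search directions rely on, might look like the sketch below; the decomposed/block preconditioners studied in the paper are not reproduced, and the Jacobi preconditioner here is only an illustrative stand-in.

```python
import numpy as np

def pcg(A, b, M_inv, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for A x = b, with the
    preconditioner applied as a function M_inv(r)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # preconditioned search direction
        rz = rz_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
jacobi = lambda r: r / np.diag(A)     # diagonal (Jacobi) preconditioner
print(pcg(A, b, jacobi))              # approx [0.0909, 0.6364]
```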

    A System for Distributed Mechanisms: Design, Implementation and Applications

    We describe here a structured system for distributed mechanism design appropriate for both Intranet and Internet applications. In our approach the players dynamically form a network in which they know neither their neighbours nor the size of the network, and interact to jointly take decisions. The only assumption concerning the underlying communication layer is that for each pair of processes there is a path of neighbours connecting them. This allows us to deal with arbitrary network topologies. We also discuss the implementation of this system, which consists of a sequence of layers. The lower layers deal with the operations that implement the basic primitives of distributed computing, namely low-level communication and distributed termination, while the upper layers use these primitives to implement high-level communication among players, including broadcasting and multicasting, and distributed decision making. This yields a highly flexible distributed system whose specific applications are realized as instances of its top layer. This design is implemented in Java. The system supports fault-tolerance at various levels and includes a provision for distributed policing, the purpose of which is to exclude 'dishonest' players. It can also be used for the repeated creation of dynamically formed networks of players interested in joint decision making implemented by means of a tax-based mechanism. We illustrate its flexibility by discussing a number of implemented examples. (Comment: 36 pages; revised and expanded version.)
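
    As a rough illustration of one upper-layer primitive (broadcast over a dynamically formed network in which each player only knows its direct neighbours), here is a hedged, centralized simulation sketch in Python; it is not the Java implementation described, and all names are illustrative.

```python
from collections import deque

def flood_broadcast(neighbours, source, message):
    """Deliver a message to every player reachable from the source by
    forwarding it once over each neighbour link (simulated centrally)."""
    delivered = {source: message}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in neighbours[node]:
            if nb not in delivered:      # forward once to each neighbour
                delivered[nb] = message
                queue.append(nb)
    return delivered                     # every reachable player got it

net = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(flood_broadcast(net, "a", "vote?"))
```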

    A Reliable and Cost-Efficient Auto-Scaling System for Web Applications Using Heterogeneous Spot Instances

    Cloud providers sell their idle capacity on markets through an auction-like mechanism to increase their return on investment. The instances sold in this way are called spot instances. Although spot instances are usually 90% cheaper than on-demand instances, they can be terminated by the provider when their bidding prices are lower than market prices. Thus, they are largely used to provision fault-tolerant applications only. In this paper, we explore how to utilize spot instances to provision web applications, which are usually considered availability-critical. The idea is to take advantage of differences in price among various types of spot instances to reach both high availability and significant cost savings. We first propose a fault-tolerant model for web applications provisioned by spot instances. Based on that, we devise novel auto-scaling policies for hourly billed cloud markets. We implemented the proposed model and policies both on a simulation testbed, for repeatable validation, and on Amazon EC2. The experiments on the simulation testbed and the real platform against the benchmarks show that the proposed approach can greatly reduce resource cost while still achieving satisfactory Quality of Service (QoS) in terms of response time and availability.
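
    To convey the intuition of spreading a web application across heterogeneous spot markets, here is a toy sketch; the pool format, the over-provisioning rule, and all names are illustrative assumptions, not the paper's fault-tolerant model or auto-scaling policies.

```python
def plan_capacity(required, pools, redundancy=1):
    """Spread the required capacity over several independent spot markets,
    cheapest first, capping what is taken from any single market so that
    losing one market still leaves enough capacity."""
    pools = sorted(pools, key=lambda p: p["price"])   # cheapest markets first
    target = required * (1 + redundancy)              # over-provision for failures
    plan, covered = {}, 0
    for p in pools:
        if covered >= target:
            break
        take = min(p["available"], required, target - covered)
        if take > 0:
            plan[p["name"]] = take
            covered += take
    return plan

pools = [{"name": "m4.large-us-east-1a", "price": 0.03, "available": 6},
         {"name": "c4.large-us-east-1b", "price": 0.04, "available": 6},
         {"name": "r4.large-us-east-1c", "price": 0.05, "available": 6}]
print(plan_capacity(4, pools))
# {'m4.large-us-east-1a': 4, 'c4.large-us-east-1b': 4}
```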

    Randomized protocols for asynchronous consensus

    The famous Fischer, Lynch, and Paterson impossibility proof shows that it is impossible to solve the consensus problem in a natural model of an asynchronous distributed system if even a single process can fail. Since its publication, two decades of work on fault-tolerant asynchronous consensus algorithms have evaded this impossibility result by using extended models that provide (a) randomization, (b) additional timing assumptions, (c) failure detectors, or (d) stronger synchronization mechanisms than are available in the basic model. Concentrating on the first of these approaches, we illustrate the history and structure of randomized asynchronous consensus protocols by giving detailed descriptions of several such protocols. (Comment: 29 pages; survey paper written for the PODC 20th anniversary issue of Distributed Computing.)
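
    As one concrete example of the kind of protocol such a survey covers, here is a hedged sketch of the per-round logic of a Ben-Or-style randomized binary consensus protocol under crash faults, seen from a single process; collecting the n - f messages per phase and looping over rounds are assumed to happen elsewhere, and the code is illustrative rather than a description of any specific protocol in the paper.

```python
import random

def phase1_proposal(reports, n):
    """Phase 1 of a Ben-Or-style round: propose v if a strict majority of
    all n processes reported v, otherwise propose 'no preference' (None)."""
    for v in (0, 1):
        if reports.count(v) * 2 > n:
            return v
    return None

def phase2_outcome(proposals, f):
    """Phase 2: decide v if more than f processes proposed v (so every
    correct process, missing at most f messages, saw at least one such
    proposal); otherwise adopt a proposed value if any, else flip a coin."""
    for v in (0, 1):
        if proposals.count(v) > f:
            return v, True                    # decide v
    for v in (0, 1):
        if v in proposals:
            return v, False                   # adopt v, continue
    return random.randint(0, 1), False        # coin flip, continue

print(phase1_proposal([1, 1, 0, 1], n=5))     # 1
print(phase2_outcome([1, 1, None, 1], f=1))   # (1, True): decide 1
```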

    Shortest, Fastest, and Foremost Broadcast in Dynamic Networks

    Highly dynamic networks rarely offer end-to-end connectivity at a given time. Yet, connectivity in these networks can be established over time and space, based on temporal analogues of multi-hop paths (also called journeys). Attempting to optimize the selection of journeys in these networks naturally leads to the study of three cases: shortest (minimum hop), fastest (minimum duration), and foremost (earliest arrival) journeys. Efficient centralized algorithms exist to compute all three cases when full knowledge of the network evolution is given. In this paper, we study the distributed counterparts of these problems, i.e. shortest, fastest, and foremost broadcast with termination detection (TDB), with minimal knowledge of the topology. We show that the feasibility of each of these problems requires distinct features of the evolution, and we identify three classes of dynamic graphs wherein the problems become gradually feasible: graphs in which the re-appearance of edges is recurrent (class R), bounded-recurrent (class B), or periodic (class P), together with specific knowledge, respectively n (the number of nodes), Δ (a bound on the recurrence time), and p (the period). In these classes it is not required that all pairs of nodes get in contact; only that the overall footprint of the graph is connected over time. Our results, together with the strict inclusions between P, B, and R, imply a feasibility order among the three variants of the problem: TDB[foremost] requires weaker assumptions on the topology dynamics than TDB[shortest], which itself requires less than TDB[fastest]. Conversely, these differences in feasibility imply that the computational powers of R_n, B_Δ, and P_p also form a strict hierarchy.
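
    To make the notion of a foremost journey concrete, below is a hedged sketch of the centralized (full-knowledge) computation of earliest arrival times over a list of timed contacts; the distributed setting with minimal knowledge studied in the paper is not addressed, and the contact format is an illustrative assumption.

```python
def foremost_arrival(contacts, source):
    """Earliest arrival time at every node reachable from the source,
    given the full evolution as timed contacts (t, u, v), each usable
    during the interval [t, t+1)."""
    contacts = sorted(contacts)               # process contacts in time order
    arrival = {source: 0}
    for t, u, v in contacts:
        for a, b in ((u, v), (v, u)):         # contacts are bidirectional here
            if a in arrival and arrival[a] <= t and t + 1 < arrival.get(b, float("inf")):
                arrival[b] = t + 1            # found an earlier journey to b
    return arrival

# Edge (a,b) exists at time 0, (b,c) at time 2, (a,c) at time 5.
print(foremost_arrival([(0, "a", "b"), (2, "b", "c"), (5, "a", "c")], "a"))
# {'a': 0, 'b': 1, 'c': 3}
```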