19,649 research outputs found

    Optimistic Concurrency Control for Distributed Unsupervised Learning

    Get PDF
    Research on distributed machine learning algorithms has focused primarily on one of two extremes - algorithms that obey strict concurrency constraints or algorithms that obey few or no such constraints. We consider an intermediate alternative in which algorithms optimistically assume that conflicts are unlikely and if conflicts do arise a conflict-resolution protocol is invoked. We view this "optimistic concurrency control" paradigm as particularly appropriate for large-scale machine learning algorithms, particularly in the unsupervised setting. We demonstrate our approach in three problem areas: clustering, feature learning and online facility location. We evaluate our methods via large-scale experiments in a cluster computing environment.Comment: 25 pages, 5 figure

    Synchronous Counting and Computational Algorithm Design

    Full text link
    Consider a complete communication network on nn nodes, each of which is a state machine. In synchronous 2-counting, the nodes receive a common clock pulse and they have to agree on which pulses are "odd" and which are "even". We require that the solution is self-stabilising (reaching the correct operation from any initial state) and it tolerates ff Byzantine failures (nodes that send arbitrary misinformation). Prior algorithms are expensive to implement in hardware: they require a source of random bits or a large number of states. This work consists of two parts. In the first part, we use computational techniques (often known as synthesis) to construct very compact deterministic algorithms for the first non-trivial case of f=1f = 1. While no algorithm exists for n<4n < 4, we show that as few as 3 states per node are sufficient for all values n≥4n \ge 4. Moreover, the problem cannot be solved with only 2 states per node for n=4n = 4, but there is a 2-state solution for all values n≥6n \ge 6. In the second part, we develop and compare two different approaches for synthesising synchronous counting algorithms. Both approaches are based on casting the synthesis problem as a propositional satisfiability (SAT) problem and employing modern SAT-solvers. The difference lies in how to solve the SAT problem: either in a direct fashion, or incrementally within a counter-example guided abstraction refinement loop. Empirical results suggest that the former technique is more efficient if we want to synthesise time-optimal algorithms, while the latter technique discovers non-optimal algorithms more quickly.Comment: 35 pages, extended and revised versio

    Study of meta-analysis strategies for network inference using information-theoretic approaches

    Get PDF
    © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Reverse engineering of gene regulatory networks (GRNs) from gene expression data is a classical challenge in systems biology. Thanks to high-throughput technologies, a massive amount of gene-expression data has been accumulated in the public repositories. Modelling GRNs from multiple experiments (also called integrative analysis) has; therefore, naturally become a standard procedure in modern computational biology. Indeed, such analysis is usually more robust than the traditional approaches focused on individual datasets, which typically suffer from some experimental bias and a small number of samples. To date, there are mainly two strategies for the problem of interest: the first one (”data merging”) merges all datasets together and then infers a GRN whereas the other (”networks ensemble”) infers GRNs from every dataset separately and then aggregates them using some ensemble rules (such as ranksum or weightsum). Unfortunately, a thorough comparison of these two approaches is lacking. In this paper, we evaluate the performances of various metaanalysis approaches mentioned above with a systematic set of experiments based on in silico benchmarks. Furthermore, we present a new meta-analysis approach for inferring GRNs from multiple studies. Our proposed approach, adapted to methods based on pairwise measures such as correlation or mutual information, consists of two steps: aggregating matrices of the pairwise measures from every dataset followed by extracting the network from the meta-matrix.Peer ReviewedPostprint (author's final draft

    Lock-Free and Practical Deques using Single-Word Compare-And-Swap

    Full text link
    We present an efficient and practical lock-free implementation of a concurrent deque that is disjoint-parallel accessible and uses atomic primitives which are available in modern computer systems. Previously known lock-free algorithms of deques are either based on non-available atomic synchronization primitives, only implement a subset of the functionality, or are not designed for disjoint accesses. Our algorithm is based on a doubly linked list, and only requires single-word compare-and-swap atomic primitives, even for dynamic memory sizes. We have performed an empirical study using full implementations of the most efficient algorithms of lock-free deques known. For systems with low concurrency, the algorithm by Michael shows the best performance. However, as our algorithm is designed for disjoint accesses, it performs significantly better on systems with high concurrency and non-uniform memory architecture

    Parallel performance results for the OpenMOC neutron transport code on multicore platforms

    Get PDF
    The shift toward multicore architectures has ushered in a new era of shared memory parallelism for scientific applications. This transition has introduced challenges for the nuclear engineering community, as it seeks to design high-fidelity full-core reactor physics simulation tools. This article describes the parallel transport sweep algorithm in the OpenMOC method of characteristics (MOC) neutron transport code for multicore platforms using OpenMP. Strong and weak scaling studies are performed for both Intel Xeon and IBM Blue Gene/Q (BG/Q) multicore processors. The results demonstrate 100% parallel efficiency for 12 threads on 12 cores on Intel Xeon platforms and over 90% parallel efficiency with 64 threads on 16 cores on the IBM BG/Q. These results illustrate the potential for hardware acceleration for MOC neutron transport on modern multicore and future many-core architectures. In addition, this work highlights the pitfalls of programming for multicore architectures, with a focal point on false sharing.National Science Foundation (U.S.). Graduate Research Fellowship Program (Grant 1122374)United States. Department of Energy (Center for Exascale Simulation of Advanced Reactors. Contract DE-AC02-06CH11357
    • …
    corecore