Optimistic Concurrency Control for Distributed Unsupervised Learning
Research on distributed machine learning algorithms has focused primarily on
one of two extremes - algorithms that obey strict concurrency constraints or
algorithms that obey few or no such constraints. We consider an intermediate
alternative in which algorithms optimistically assume that conflicts are
unlikely and if conflicts do arise a conflict-resolution protocol is invoked.
We view this "optimistic concurrency control" paradigm as well suited to
large-scale machine learning algorithms, particularly in the
unsupervised setting. We demonstrate our approach in three problem areas:
clustering, feature learning and online facility location. We evaluate our
methods via large-scale experiments in a cluster computing environment.
Comment: 25 pages, 5 figures
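The optimistic pattern described in this abstract can be illustrated with a small clustering sketch. It is not the authors' implementation: the function name `occ_dpmeans_epoch`, the distance threshold `lam`, and the round-robin sharding are assumptions made for illustration, and the "workers" run one after another rather than in parallel. Each worker proposes new cluster centers against a stale snapshot, and a serial validation step resolves conflicts between proposals.

```python
import math


def occ_dpmeans_epoch(points, centers, lam, n_workers=2):
    """One epoch of optimistic center creation (illustrative sketch only).

    points    -- list of coordinate tuples
    centers   -- list of current centers; extended in place and returned
    lam       -- hypothetical threshold: a point farther than lam from every
                 center proposes itself as a new center
    n_workers -- number of simulated workers (run sequentially here)
    """
    # Optimistic phase: every worker checks its shard against the same
    # snapshot of the centers, assuming no other worker creates a conflict.
    snapshot = list(centers)
    proposals = []
    for w in range(n_workers):
        shard = points[w::n_workers]
        for p in shard:
            if all(math.dist(p, c) > lam for c in snapshot):
                proposals.append(p)

    # Validation phase: proposals are examined serially. A proposal that is
    # now covered by an already accepted center is a conflict and is dropped.
    for p in proposals:
        if all(math.dist(p, c) > lam for c in centers):
            centers.append(p)
    return centers


pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(occ_dpmeans_epoch(pts, [], lam=1.0))
```

In this toy run all four points propose themselves because the snapshot is empty, and validation keeps only one center per group. The serial step only touches the proposals, not every point, which is why the pattern can pay off when conflicts are rare.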
Synchronous Counting and Computational Algorithm Design
Consider a complete communication network on nodes, each of which is a
state machine. In synchronous 2-counting, the nodes receive a common clock
pulse and they have to agree on which pulses are "odd" and which are "even". We
require that the solution is self-stabilising (reaching the correct operation
from any initial state) and it tolerates Byzantine failures (nodes that
send arbitrary misinformation). Prior algorithms are expensive to implement in
hardware: they require a source of random bits or a large number of states.
This work consists of two parts. In the first part, we use computational
techniques (often known as synthesis) to construct very compact deterministic
algorithms for the first non-trivial case of . While no algorithm exists
for , we show that as few as 3 states per node are sufficient for all
values . Moreover, the problem cannot be solved with only 2 states per
node for , but there is a 2-state solution for all values .
In the second part, we develop and compare two different approaches for
synthesising synchronous counting algorithms. Both approaches are based on
casting the synthesis problem as a propositional satisfiability (SAT) problem
and employing modern SAT-solvers. The difference lies in how to solve the SAT
problem: either in a direct fashion, or incrementally within a counter-example
guided abstraction refinement loop. Empirical results suggest that the former
technique is more efficient if we want to synthesise time-optimal algorithms,
while the latter technique discovers non-optimal algorithms more quickly.
Comment: 35 pages, extended and revised version
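To make "self-stabilising" concrete, the sketch below exhaustively checks a candidate 2-state rule from every initial configuration. It is a simplification under stated assumptions: it covers only the fault-free case, the follow-node-0 rule in `step` is a hypothetical example rather than one of the synthesised algorithms, and it uses brute-force enumeration instead of a SAT solver. A single Byzantine node 0 would break this rule, which is what makes the fault-tolerant problem hard.

```python
from itertools import product


def step(config):
    """Hypothetical rule: every node adopts the complement of node 0's state."""
    lead = config[0]
    return tuple(1 - lead for _ in config)


def stabilises(n, settle_rounds, check_rounds=4):
    """Return True if, from every initial state of n two-state nodes, the
    system counts correctly after settle_rounds clock pulses (no faults)."""
    for init in product((0, 1), repeat=n):
        config = init
        for _ in range(settle_rounds):
            config = step(config)
        for _ in range(check_rounds):
            nxt = step(config)
            agree = len(set(config)) == 1          # all nodes output the same bit
            alternates = nxt[0] == 1 - config[0]   # the bit flips on each pulse
            if not (agree and alternates):
                return False
            config = nxt
    return True


print(stabilises(4, settle_rounds=1))
print(stabilises(4, settle_rounds=0))
```

The first call checks stabilisation within one pulse; the second demands correct counting immediately, which fails for initial states in which the nodes disagree.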
Study of meta-analysis strategies for network inference using information-theoretic approaches
© 2017 IEEE.
Reverse engineering of gene regulatory networks (GRNs) from gene expression data is a classical challenge in systems biology. Thanks to high-throughput technologies, a massive amount of gene-expression data has accumulated in public repositories. Modelling GRNs from multiple experiments (also called integrative analysis) has therefore naturally become a standard procedure in modern computational biology. Indeed, such analysis is usually more robust than traditional approaches focused on individual datasets, which typically suffer from experimental bias and a small number of samples.
To date, there are two main strategies for this problem: the first ("data merging") merges all datasets and then infers a single GRN, whereas the second ("networks ensemble") infers a GRN from each dataset separately and then aggregates them using ensemble rules (such as ranksum or weightsum). Unfortunately, a thorough comparison of these two approaches is lacking.
In this paper, we evaluate the performance of the meta-analysis approaches mentioned above with a systematic set of experiments based on in silico benchmarks. Furthermore, we present a new meta-analysis approach for inferring GRNs from multiple studies. Our proposed approach, adapted to methods based on pairwise measures such as correlation or mutual information, consists of two steps: aggregating the matrices of pairwise measures from every dataset, then extracting the network from the resulting meta-matrix.
Peer Reviewed. Postprint (author's final draft)
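The two-step idea in this abstract (aggregate the pairwise matrices, then extract a network from the meta-matrix) can be sketched as follows. The abstract does not specify the aggregation or extraction rules, so the choices here are assumptions: absolute Pearson correlation as the pairwise measure, the arithmetic mean as the aggregation rule, and a fixed `threshold` for edge extraction. The function names are hypothetical, and every dataset is assumed to measure the same genes.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def pairwise_matrix(dataset):
    """Map each unordered gene pair to |correlation| within one dataset.

    dataset -- dict mapping gene name to a list of expression values
    """
    genes = sorted(dataset)
    return {
        (g, h): abs(pearson(dataset[g], dataset[h]))
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


def meta_network(datasets, threshold):
    """Average the per-dataset matrices, then keep pairs above threshold."""
    mats = [pairwise_matrix(d) for d in datasets]
    edges = []
    for pair in mats[0]:
        score = sum(m[pair] for m in mats) / len(mats)  # meta-matrix entry
        if score >= threshold:
            edges.append(pair)
    return edges


# Toy expression values, invented purely for illustration.
d1 = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, 1, -1]}
d2 = {"A": [4, 3, 2, 1], "B": [1, 2, 3, 4.5], "C": [1, -1, -1, 1]}
print(meta_network([d1, d2], threshold=0.8))
```

Only the A-B pair is strongly associated in both toy datasets, so it is the only edge that survives the threshold on the averaged matrix.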
Lock-Free and Practical Deques using Single-Word Compare-And-Swap
We present an efficient and practical lock-free implementation of a
concurrent deque that is disjoint-parallel accessible and uses atomic
primitives which are available in modern computer systems. Previously known
lock-free algorithms of deques are either based on non-available atomic
synchronization primitives, only implement a subset of the functionality, or
are not designed for disjoint accesses. Our algorithm is based on a doubly
linked list, and only requires single-word compare-and-swap atomic primitives,
even for dynamic memory sizes. We have performed an empirical study using full
implementations of the most efficient algorithms of lock-free deques known. For
systems with low concurrency, the algorithm by Michael shows the best
performance. However, as our algorithm is designed for disjoint accesses, it
performs significantly better on systems with high concurrency and non-uniform
memory architecture.
Parallel performance results for the OpenMOC neutron transport code on multicore platforms
The shift toward multicore architectures has ushered in a new era of shared memory parallelism for scientific applications. This transition has introduced challenges for the nuclear engineering community, as it seeks to design high-fidelity full-core reactor physics simulation tools. This article describes the parallel transport sweep algorithm in the OpenMOC method of characteristics (MOC) neutron transport code for multicore platforms using OpenMP. Strong and weak scaling studies are performed for both Intel Xeon and IBM Blue Gene/Q (BG/Q) multicore processors. The results demonstrate 100% parallel efficiency for 12 threads on 12 cores on Intel Xeon platforms and over 90% parallel efficiency with 64 threads on 16 cores on the IBM BG/Q. These results illustrate the potential for hardware acceleration for MOC neutron transport on modern multicore and future many-core architectures. In addition, this work highlights the pitfalls of programming for multicore architectures, with a focal point on false sharing.
Funding: National Science Foundation (U.S.), Graduate Research Fellowship Program (Grant 1122374); United States Department of Energy (Center for Exascale Simulation of Advanced Reactors, Contract DE-AC02-06CH11357)