Quantitative Analysis of Consensus Algorithms
Consensus is one of the key problems in fault-tolerant distributed computing. Although the solvability of consensus is now well understood, comparing different algorithms in terms of efficiency is still an open problem. In this paper, we address this question for round-based consensus algorithms using communication predicates, on top of a partially synchronous system that alternates between good (synchronous) and bad (non-synchronous) periods. Communication predicates, together with the detailed timing information of the underlying partially synchronous system, provide a convenient and powerful framework for comparing different consensus algorithms and their implementations. This approach allows us to quantify the length of a good period required to solve a given number of consensus instances. Our results reveal several interesting issues, e.g., that the number of rounds of an algorithm is not necessarily a good metric for its performance.
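As a rough illustration of this kind of quantitative analysis, the sketch below uses an assumed back-of-the-envelope timing model (not the paper's actual framework): each round costs one message delay plus some processing time, and a good period must be long enough to fit all rounds of all instances.

```python
def good_period_ms(instances, rounds_per_instance, delta_ms, phi_ms):
    """Length of a synchronous ("good") period needed to complete
    `instances` consecutive consensus instances, assuming each round
    costs one message delay (delta_ms) plus processing time (phi_ms).
    This illustrative model ignores the extra round(s) some
    implementations need to stabilize after a bad period."""
    return instances * rounds_per_instance * (delta_ms + phi_ms)
```

Under this model, a three-round algorithm whose rounds cost 5 ms finishes before a two-round algorithm whose rounds cost 10 ms, which is exactly the paper's point that round count alone is not a reliable performance metric.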
Quantitative toxicity prediction using topology based multi-task deep neural networks
The understanding of toxicity is of paramount importance to human health and
environmental protection. Quantitative toxicity analysis has become a new
standard in the field. This work introduces element specific persistent
homology (ESPH), an algebraic topology approach, for quantitative toxicity
prediction. ESPH retains crucial chemical information during the topological
abstraction of geometric complexity and provides a representation of small
molecules that cannot be obtained by any other method. To investigate the
representability and predictive power of ESPH for small molecules, ancillary
descriptors have also been developed based on physical models. Topological and
physical descriptors are paired with advanced machine learning algorithms, such
as deep neural network (DNN), random forest (RF) and gradient boosting decision
tree (GBDT), to facilitate their applications to quantitative toxicity
predictions. A topology-based multi-task strategy is proposed to take
advantage of the availability of large data sets while dealing with small data
sets. Four benchmark toxicity data sets that involve quantitative measurements
are used to validate the proposed approaches. Extensive numerical studies
indicate that the proposed topological learning methods are able to outperform
the state-of-the-art methods in the literature for quantitative toxicity
analysis. Our online server for computing element-specific topological
descriptors (ESTDs) is available at http://weilab.math.msu.edu/TopTox/
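The multi-task idea can be illustrated apart from the chemistry. In the toy sketch below (not the paper's ESPH/DNN pipeline), two regression tasks with different amounts of data share one parameter while keeping task-specific biases, so gradients from the data-rich task help estimate what the data-poor task shares with it:

```python
def train_multitask(tasks, lr=0.05, epochs=5000):
    """Minimal multi-task learning sketch: all tasks share one slope w,
    each task keeps its own bias, and gradients from every task update
    the shared parameter. `tasks` maps a task name to (x, y) pairs."""
    w = 0.0
    biases = {name: 0.0 for name in tasks}
    for _ in range(epochs):
        grad_w = 0.0
        for name, data in tasks.items():
            grad_b = 0.0
            for x, y in data:
                err = w * x + biases[name] - y
                grad_w += err * x / len(data)   # accumulated across tasks
                grad_b += err / len(data)       # task-specific
            biases[name] -= lr * grad_b
        w -= lr * grad_w / len(tasks)
    return w, biases
```

With a large task sampled from y = 2x + 1 and a small task from y = 2x - 1, the shared slope is recovered from the pooled data even though the small task alone has very few points.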
An Experimental Evaluation of Foreground Detection Algorithms in Real Scenes
Foreground detection is an important preliminary step in many video analysis systems. Many algorithms have been proposed in recent years, but there is not yet a consensus on which approach is the most effective, even when the problem is limited to a single category of videos. This paper aims to constitute a first step towards a reliable assessment of the most commonly used approaches. In particular, four notable foreground detection algorithms have been evaluated using quantitative measures to assess their relative merits and demerits. The evaluation has been carried out on a large, publicly available dataset composed of videos representing different realistic application scenarios. The obtained performance is presented and discussed, highlighting the conditions under which each algorithm can represent the most effective solution.
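The quantitative measures typically used in such evaluations are pixel-wise precision, recall, and F-measure of the detected foreground mask against a hand-labelled ground-truth mask; a minimal sketch (exact metric definitions in the paper may differ):

```python
def mask_metrics(predicted, ground_truth):
    """Pixel-wise precision, recall, and F-measure for binary
    foreground masks (1 = foreground, 0 = background), given as
    flat sequences of equal length."""
    tp = sum(1 for p, g in zip(predicted, ground_truth) if p and g)
    fp = sum(1 for p, g in zip(predicted, ground_truth) if p and not g)
    fn = sum(1 for p, g in zip(predicted, ground_truth) if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure
```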
Round-Based Consensus Algorithms, Predicate Implementations and Quantitative Analysis
Fault-tolerant computing is the art and science of building computer systems that continue to operate normally in the presence of faults. The fault tolerance field covers a wide spectrum of research areas, ranging from computer hardware to computer software. A common approach to obtaining a fault-tolerant system is software replication. However, keeping the state of the replicas consistent is not an easy task, even though the understanding of the problems related to replication has evolved significantly over the past thirty years. Consensus is a fundamental building block for providing consistency in any fault-tolerant distributed system, and a large number of algorithms have been proposed to solve the consensus problem in different systems. The efficiency of several consensus algorithms has been studied theoretically and practically. A common metric for evaluating the performance of consensus algorithms is the number of communication steps, or the number of rounds (in round-based algorithms), needed for deciding. Many improvements to consensus algorithms have been proposed to reduce this number under different assumptions, e.g., nice runs. However, efficiency expressed as a number of rounds does not predict the time it takes to decide (including the time the system needs to stabilize). Following this idea, the thesis investigates the round model abstraction to represent consensus algorithms, with benign and Byzantine faults, in a concise and modular way. The goal of the thesis is first to decouple the consensus algorithm from irrelevant implementation details, such as synchronization, then to study different possible implementations for a given consensus algorithm, and finally to propose a more general analytical framework for comparing different consensus algorithms. The first part of the thesis considers round-based consensus algorithms with benign faults.
In this context, the round model allowed us to separate the consensus algorithms from the round implementation, to propose different round implementations, to improve existing round implementations by making them swift, and to provide a quantitative analysis of different algorithms. The second part of the thesis considers round-based consensus algorithms with Byzantine faults. In this context, there is a gap between theoretical consensus algorithms and practical Byzantine fault-tolerant protocols. The round model allowed us to fill this gap by better understanding existing protocols, and enabled us to express them in a simple and modular way, to obtain simplified proofs, to discover new protocols such as decentralized (non-leader-based) algorithms, and finally to perform precise timing analysis to compare different algorithms. The last part of the thesis shows, as an example, how a round-based consensus algorithm that tolerates benign faults can be extended to wireless mobile ad hoc networks using an adequate communication layer. We have validated our implementation by running simulations in single-hop and multi-hop wireless networks.
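The round-model style of algorithm studied here can be illustrated with the well-known OneThirdRule algorithm from the Heard-Of model. The sketch below simulates the simplest possible communication predicate, where every round is good and all messages arrive, so every process hears from the full system:

```python
from collections import Counter

def one_third_rule(values, max_rounds=10):
    """Round-based consensus sketch (OneThirdRule, Heard-Of model).
    Each round, every process broadcasts its estimate; a process that
    hears from more than 2n/3 processes adopts the smallest among the
    most frequent values received, and decides on any value heard
    more than 2n/3 times. Here all rounds are 'good' (full delivery)."""
    n = len(values)
    x = list(values)            # current estimate of each process
    decision = [None] * n
    for _ in range(max_rounds):
        received = list(x)      # all-to-all delivery in a good round
        for p in range(n):
            if len(received) > 2 * n / 3:
                counts = Counter(received)
                top = max(counts.values())
                x[p] = min(v for v, c in counts.items() if c == top)
                for v, c in counts.items():
                    if c > 2 * n / 3 and decision[p] is None:
                        decision[p] = v
        if all(d is not None for d in decision):
            break
    return decision
```

With full delivery, a value held by more than two thirds of the processes is decided in a single round; otherwise the estimates converge first and the decision follows one round later. Analyzing how many good rounds a given communication predicate guarantees is exactly the kind of quantitative comparison the thesis develops.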
Modeling with the Crowd: Optimizing the Human-Machine Partnership with Zooniverse
LSST and Euclid must address the daunting challenge of analyzing the
unprecedented volumes of imaging and spectroscopic data that these
next-generation instruments will generate. A promising approach to overcoming
this challenge involves rapid, automatic image processing using appropriately
trained Deep Learning (DL) algorithms. However, reliable application of DL
requires large, accurately labeled samples of training data. Galaxy Zoo Express
(GZX) is a recent experiment that simulated the use of Bayesian inference to
dynamically aggregate binary responses provided by citizen scientists via the
Zooniverse crowd-sourcing platform in real time. The GZX approach enables
collaboration between human and machine classifiers and provides rapidly
generated, reliably labeled datasets, thereby enabling online training of
accurate machine classifiers. We present selected results from GZX and show how
the Bayesian aggregation engine it uses can be extended to efficiently provide
object-localization and bounding-box annotations of two-dimensional data with
quantified reliability. DL algorithms that are trained using these annotations
will facilitate numerous panchromatic data modeling tasks including
morphological classification and substructure detection in direct imaging, as
well as decontamination and emission line identification for slitless
spectroscopy. Effectively combining the speed of modern computational analyses
with the human capacity to extrapolate from few examples will be critical if
the potential of forthcoming large-scale surveys is to be realized. (5 pages, 1 figure; to appear in Proceedings of the International Astronomical Union.)
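Bayesian aggregation of binary crowd votes can be sketched as follows. Volunteer skills are assumed known here, whereas a system like GZX estimates them online from gold-standard subjects; this is a generic illustration, not the GZX engine itself:

```python
from math import exp, log

def aggregate_votes(votes, skills, prior=0.5):
    """Combine binary votes into a posterior probability that the true
    label is 1. Each volunteer's skill is the assumed probability that
    their vote matches the true label (symmetric for both classes)."""
    log_odds = log(prior / (1 - prior))
    for vote, skill in zip(votes, skills):
        ratio = skill / (1 - skill)        # likelihood ratio of a correct vote
        log_odds += log(ratio) if vote == 1 else -log(ratio)
    return 1 / (1 + exp(-log_odds))
```

Because votes are weighted by skill, three agreeing votes from reliable volunteers push the posterior close to certainty, while votes from near-random volunteers barely move it; thresholding this posterior is what allows subjects to be retired early and labels to be produced in real time.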
Consensus Network Inference of Microarray Gene Expression Data
Genetic and protein interactions are essential to regulate cellular machinery. Their
identification has become an important aim of systems biology research. In recent years, a
variety of computational network inference algorithms have been employed to reconstruct
gene regulatory networks from post-genomic data. However, precisely predicting these
regulatory networks remains a challenge.
We began our study by assessing the ability of various network inference algorithms
to accurately predict gene regulatory interactions using benchmark simulated datasets. It was
observed from our analysis that different algorithms have strengths and weaknesses when
identifying regulatory networks, with a gene-pair interaction (edge) predicted by one
algorithm not always necessarily consistent with the other. An edge not predicted by most
inference algorithms may be an important one, and should not be missed. The naĂŻve
consensus (intersection) method is perhaps the most conservative approach and can be used
to address this concern by extracting the edges consistently predicted across all inference
algorithms; however, it lacks credibility as it does not provide a quantifiable measure for
edge weights. Existing quantitative consensus approaches, such as the inverse-variance
weighted method (IVWM) and the Borda count election method (BCEM), have been
previously implemented to derive consensus networks from diverse datasets. However, the
former method was biased towards finding local solutions in the whole network, and the
latter considered species diversity to build the consensus network.
In this thesis we proposed a novel consensus approach, in which we used Fisher's
Combined Probability Test (FCPT) to combine the statistical significance values assigned to
each network edge by a number of different network inference algorithms to produce a consensus
network. We tested our method by applying it to a variety of in silico benchmark expression datasets of different dimensions and evaluated its performance against individual inference
methods, Bayesian models and also existing qualitative and quantitative consensus
techniques. We also applied our approach to real experimental data from the yeast (S.
cerevisiae) network as this network has been comprehensively elucidated previously. Our
results demonstrated that the FCPT-based consensus method outperforms single algorithms in
terms of robustness and accuracy. In developing the consensus approach, we also proposed a
scoring technique that quantifies biologically meaningful hierarchical modular networks.
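Fisher's method itself is compact: for k independent p-values, the statistic X = -2 Σ ln p_i follows a chi-square distribution with 2k degrees of freedom under the null. A sketch (using the closed-form survival function for even degrees of freedom, so no external stats library is needed; applying it per edge, as the thesis does, is left out for brevity):

```python
from math import exp, log

def fisher_combined_pvalue(pvalues):
    """Fisher's Combined Probability Test: combine k independent
    p-values into one via X = -2 * sum(ln p_i), which is chi-square
    with 2k degrees of freedom under the null hypothesis."""
    k = len(pvalues)
    x = -2.0 * sum(log(p) for p in pvalues)
    # P(Chi2_{2k} > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= (x / 2) / i
        total += term
    return exp(-x / 2) * total
```

Combining each edge's p-values this way yields a consensus network whose edge weights are themselves p-values, which is what gives the method the quantifiable measure that the naïve intersection approach lacks.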