18,350 research outputs found
The Capacity of Adaptive Group Testing
We define capacity for group testing problems and deduce bounds for the
capacity of a variety of noisy models, based on the capacity of equivalent
noisy communication channels. For noiseless adaptive group testing we prove an
information-theoretic lower bound which tightens a bound of Chan et al. This
can be combined with a performance analysis of a version of Hwang's adaptive
group testing algorithm, in order to deduce the capacity of noiseless and
erasure group testing models.Comment: 5 page
The capacity of non-identical adaptive group testing
We consider the group testing problem, in the case where the items are
defective independently but with non-constant probability. We introduce and
analyse an algorithm to solve this problem by grouping items together
appropriately. We give conditions under which the algorithm performs
essentially optimally in the sense of information-theoretic capacity. We use
concentration of measure results to bound the probability that this algorithm
requires many more tests than the expected number. This has applications to the
allocation of spectrum to cognitive radios, in the case where a database gives
prior information that a particular band will be occupied.Comment: To be presented at Allerton 201
On PAC-Bayesian Bounds for Random Forests
Existing guarantees in terms of rigorous upper bounds on the generalization
error for the original random forest algorithm, one of the most frequently used
machine learning methods, are unsatisfying. We discuss and evaluate various
PAC-Bayesian approaches to derive such bounds. The bounds do not require
additional hold-out data, because the out-of-bag samples from the bagging in
the training process can be exploited. A random forest predicts by taking a
majority vote of an ensemble of decision trees. The first approach is to bound
the error of the vote by twice the error of the corresponding Gibbs classifier
(classifying with a single member of the ensemble selected at random). However,
this approach does not take into account the effect of averaging out of errors
of individual classifiers when taking the majority vote. This effect provides a
significant boost in performance when the errors are independent or negatively
correlated, but when the correlations are strong the advantage from taking the
majority vote is small. The second approach based on PAC-Bayesian C-bounds
takes dependencies between ensemble members into account, but it requires
estimating correlations between the errors of the individual classifiers. When
the correlations are high or the estimation is poor, the bounds degrade. In our
experiments, we compute generalization bounds for random forests on various
benchmark data sets. Because the individual decision trees already perform
well, their predictions are highly correlated and the C-bounds do not lead to
satisfactory results. For the same reason, the bounds based on the analysis of
Gibbs classifiers are typically superior and often reasonably tight. Bounds
based on a validation set coming at the cost of a smaller training set gave
better performance guarantees, but worse performance in most experiments
A Unified View of Piecewise Linear Neural Network Verification
The success of Deep Learning and its potential use in many safety-critical
applications has motivated research on formal verification of Neural Network
(NN) models. Despite the reputation of learned NN models to behave as black
boxes and the theoretical hardness of proving their properties, researchers
have been successful in verifying some classes of models by exploiting their
piecewise linear structure and taking insights from formal methods such as
Satisifiability Modulo Theory. These methods are however still far from scaling
to realistic neural networks. To facilitate progress on this crucial area, we
make two key contributions. First, we present a unified framework that
encompasses previous methods. This analysis results in the identification of
new methods that combine the strengths of multiple existing approaches,
accomplishing a speedup of two orders of magnitude compared to the previous
state of the art. Second, we propose a new data set of benchmarks which
includes a collection of previously released testcases. We use the benchmark to
provide the first experimental comparison of existing algorithms and identify
the factors impacting the hardness of verification problems.Comment: Updated version of "Piecewise Linear Neural Network verification: A
comparative study
Improved Accuracy and Parallelism for MRRR-based Eigensolvers -- A Mixed Precision Approach
The real symmetric tridiagonal eigenproblem is of outstanding importance in
numerical computations; it arises frequently as part of eigensolvers for
standard and generalized dense Hermitian eigenproblems that are based on a
reduction to tridiagonal form. For its solution, the algorithm of Multiple
Relatively Robust Representations (MRRR) is among the fastest methods. Although
fast, the solvers based on MRRR do not deliver the same accuracy as competing
methods like Divide & Conquer or the QR algorithm. In this paper, we
demonstrate that the use of mixed precisions leads to improved accuracy of
MRRR-based eigensolvers with limited or no performance penalty. As a result, we
obtain eigensolvers that are not only equally or more accurate than the best
available methods, but also -in most circumstances- faster and more scalable
than the competition
Optimal Nested Test Plan for Combinatorial Quantitative Group Testing
We consider the quantitative group testing problem where the objective is to
identify defective items in a given population based on results of tests
performed on subsets of the population. Under the quantitative group testing
model, the result of each test reveals the number of defective items in the
tested group. The minimum number of tests achievable by nested test plans was
established by Aigner and Schughart in 1985 within a minimax framework. The
optimal nested test plan offering this performance, however, was not obtained.
In this work, we establish the optimal nested test plan in closed form. This
optimal nested test plan is also order optimal among all test plans as the
population size approaches infinity. Using heavy-hitter detection as a case
study, we show via simulation examples orders of magnitude improvement of the
group testing approach over two prevailing sampling-based approaches in
detection accuracy and counter consumption. Other applications include anomaly
detection and wideband spectrum sensing in cognitive radio systems
- …