Validating module network learning algorithms using simulated data
In recent years, several authors have used probabilistic graphical models to
learn expression modules and their regulatory programs from gene expression
data. Here, we demonstrate the use of the synthetic data generator SynTReN for
the purpose of testing and comparing module network learning algorithms. We
introduce a software package for learning module networks, called LeMoNe, which
incorporates a novel strategy for learning regulatory programs. Novelties
include the use of a bottom-up Bayesian hierarchical clustering to construct
the regulatory programs, and the use of a conditional entropy measure to assign
regulators to the regulation program nodes. Using SynTReN data, we test the
performance of LeMoNe in a completely controlled situation and assess the
effect of the methodological changes we made with respect to an existing
software package, namely Genomica. Additionally, we assess the effect of
various parameters, such as the size of the data set and the amount of noise,
on the inference performance. Overall, application of Genomica and LeMoNe to
simulated data sets gave comparable results. However, LeMoNe offers some
advantages, one of them being that the learning process is considerably faster
for larger data sets. Additionally, we show that the location of the regulators
in the LeMoNe regulation programs and their conditional entropy may be used to
prioritize regulators for functional validation, and that the combination of
the bottom-up clustering strategy with the conditional entropy-based assignment
of regulators improves the handling of missing or hidden regulators.
Comment: 13 pages, 6 figures + 2 pages, 2 figures supplementary information
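The abstract above mentions a conditional entropy measure for assigning regulators. As a rough illustration only, the sketch below computes the generic conditional entropy H(Y | X) of two discrete sequences and uses it to rank candidate regulators against a split of conditions. The function names, the discretized inputs, and the ranking rule are assumptions made for this example; they are not taken from LeMoNe.

```python
import math
from collections import Counter


def conditional_entropy(xs, ys):
    """Return H(Y | X) in bits for two equal-length sequences of discrete labels."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    marg_x = Counter(xs)           # counts of x alone
    h = 0.0
    for (x, _y), count in joint.items():
        # p(x, y) * log2 p(y | x), with p(y | x) = count / marg_x[x]
        h -= (count / n) * math.log2(count / marg_x[x])
    return h


def rank_regulators(split, candidates):
    """Hypothetical helper: order candidate names by H(split | regulator state).

    `split` labels each condition with the side of a program node it falls on;
    `candidates` maps a regulator name to its discretized state per condition.
    Lower conditional entropy means the regulator explains the split better.
    """
    return sorted(candidates, key=lambda name: conditional_entropy(candidates[name], split))
```

For instance, a candidate whose state matches the split exactly has conditional entropy 0 and is ranked first, while a candidate unrelated to the split scores close to the entropy of the split itself.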
A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition
We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for improving speaker independence in the absence of supervision, and evaluate the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations. Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.
5 page(s)
Input Prioritization for Testing Neural Networks
Deep neural networks (DNNs) are increasingly being adopted for sensing and
control functions in a variety of safety and mission-critical systems such as
self-driving cars, autonomous air vehicles, medical diagnostics, and industrial
robotics. Failures of such systems can lead to loss of life or property, which
necessitates stringent verification and validation for providing high
assurance. Though formal verification approaches are being investigated,
testing remains the primary technique for assessing the dependability of such
systems. Due to the nature of the tasks handled by DNNs, the cost of obtaining
test oracle data---the expected output, a.k.a. label, for a given input---is
high, which significantly impacts the amount and quality of testing that can be
performed. Thus, prioritizing input data for testing DNNs in meaningful ways to
reduce the cost of labeling can go a long way in increasing testing efficacy.
This paper proposes using gauges of the DNN's sentiment derived from the
computation performed by the model, as a means to identify inputs that are
likely to reveal weaknesses. We empirically assess the efficacy of three such
sentiment measures for prioritization---confidence, uncertainty, and
surprise---and compare their effectiveness in terms of their fault-revealing
capability and retraining effectiveness. The results indicate that sentiment
measures can effectively flag inputs that expose unacceptable DNN behavior. For
MNIST models, the average percentage of inputs correctly flagged ranged from
88% to 94.8%.
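To make the idea of prioritizing inputs by a model-derived score concrete, here is a minimal sketch that ranks inputs by softmax confidence and by predictive entropy. It assumes each model output is a list of class probabilities; the exact confidence, uncertainty, and surprise measures used in the paper are not given in the abstract, so these two scores are generic stand-ins.

```python
import math


def confidence(probs):
    """Highest class probability; low values suggest the model is unsure."""
    return max(probs)


def predictive_entropy(probs):
    """Shannon entropy (nats) of the class distribution; high values suggest uncertainty."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def prioritize_by_confidence(softmax_outputs):
    """Return input indices ordered from least to most confident."""
    return sorted(range(len(softmax_outputs)),
                  key=lambda i: confidence(softmax_outputs[i]))


def prioritize_by_entropy(softmax_outputs):
    """Return input indices ordered from most to least uncertain."""
    return sorted(range(len(softmax_outputs)),
                  key=lambda i: predictive_entropy(softmax_outputs[i]),
                  reverse=True)
```

Labeling effort would then go first to the inputs at the front of either ordering, on the assumption that those are the ones most likely to expose incorrect behavior.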
Dropout Sampling for Robust Object Detection in Open-Set Conditions
Dropout Variational Inference, or Dropout Sampling, has been recently
proposed as an approximation technique for Bayesian Deep Learning and evaluated
for image classification and regression tasks. This paper investigates the
utility of Dropout Sampling for object detection for the first time. We
demonstrate how label uncertainty can be extracted from a state-of-the-art
object detection system via Dropout Sampling. We evaluate this approach on a
large synthetic dataset of 30,000 images, and a real-world dataset captured by
a mobile robot in a versatile campus environment. We show that this uncertainty
can be utilized to increase object detection performance under the open-set
conditions that are typically encountered in robotic vision. A Dropout Sampling
network is shown to achieve a 12.3% increase in recall (for the same precision
score as a standard network) and a 15.1% increase in precision (for the same
recall score as the standard network).
Comment: to appear in IEEE International Conference on Robotics and Automation
2018 (ICRA 2018)
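The general mechanism behind Dropout Sampling can be sketched on a toy model: keep dropout active at test time, run several stochastic forward passes, average the class probabilities, and read uncertainty off the averaged distribution. The linear classifier, the parameter names, and the use of entropy as the uncertainty score below are all assumptions for illustration; the object detector used in the paper is not reproduced here.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(x, weights, rng, drop_prob):
    """One stochastic pass of a toy linear classifier with dropout on the inputs."""
    keep = 1.0 - drop_prob
    # Inverted dropout: kept features are rescaled so the expected input is unchanged.
    dropped = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    logits = [sum(w * xi for w, xi in zip(row, dropped)) for row in weights]
    return softmax(logits)


def dropout_sampling(x, weights, passes=50, drop_prob=0.5, seed=0):
    """Average class probabilities over several dropout passes.

    Returns (mean_probs, entropy), where entropy (nats) of the averaged
    distribution serves as a simple label-uncertainty score.
    """
    rng = random.Random(seed)
    samples = [forward_with_dropout(x, weights, rng, drop_prob) for _ in range(passes)]
    mean = [sum(s[k] for s in samples) / passes for k in range(len(weights))]
    entropy = -sum(p * math.log(p) for p in mean if p > 0)
    return mean, entropy
```

In an open-set setting, a detection whose averaged distribution has high entropy could be rejected rather than assigned a known label; the threshold for doing so would have to be tuned on held-out data.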
Bayesian Network Structure Learning with Permutation Tests
In the literature there are several studies on the performance of Bayesian
network structure learning algorithms. The focus of these studies is almost
always the heuristics the learning algorithms are based on, i.e. the
maximisation algorithms (in score-based algorithms) or the techniques for
learning the dependencies of each variable (in constraint-based algorithms). In
this paper we investigate how the use of permutation tests instead of
parametric ones affects the performance of Bayesian network structure learning
from discrete data. Shrinkage tests are also covered to provide a broad
overview of the techniques developed in current literature.
Comment: 13 pages, 4 figures. Presented at the Conference 'Statistics for
Complex Problems', Padova, June 15, 201
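As a small illustration of how a permutation test differs from a parametric one, the sketch below tests marginal independence of two discrete variables by shuffling one of them and comparing mutual information against the observed value. The choice of statistic, the function names, and the restriction to marginal (rather than conditional) independence are simplifying assumptions for this example, not the procedure evaluated in the paper.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def permutation_p_value(xs, ys, permutations=1000, seed=0):
    """Estimate the p-value of independence by permuting ys.

    Shuffling ys breaks any association with xs while preserving both
    marginals, so the shuffled statistics approximate the null distribution
    without relying on an asymptotic chi-square approximation.
    """
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if mutual_information(xs, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (permutations + 1)
```

A constraint-based structure learner would call a test like this, conditioned on subsets of other variables, to decide whether to keep or remove an edge; small samples are where the permutation null is expected to differ most from the parametric one.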