Learning Bayesian Nets that Perform Well
A Bayesian net (BN) is more than a succinct way to encode a probabilistic
distribution; it also corresponds to a function used to answer queries. A BN
can therefore be evaluated by the accuracy of the answers it returns. Many
algorithms for learning BNs, however, attempt to optimize another criterion
(usually likelihood, possibly augmented with a regularizing term), which is
independent of the distribution of queries that are posed. This paper takes the
"performance criteria" seriously, and considers the challenge of computing the
BN whose performance - read "accuracy over the distribution of queries" - is
optimal. We show that many aspects of this learning task are more difficult
than the corresponding subtasks in the standard model.
Comment: Appears in Proceedings of the Thirteenth Conference on Uncertainty in
Artificial Intelligence (UAI 1997).
Multi-task Neural Networks for QSAR Predictions
Although artificial neural networks have occasionally been used for
Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) studies in
the past, the literature has of late been dominated by other machine learning
techniques such as random forests. However, a variety of new neural net
techniques along with successful applications in other domains have renewed
interest in network approaches. In this work, inspired by the winning team's
use of neural networks in a recent QSAR competition, we used an artificial
neural network to learn a function that predicts activities of compounds for
multiple assays at the same time. We conducted experiments leveraging recent
methods for dealing with overfitting in neural networks as well as other tricks
from the neural networks literature. We compared our methods to alternative
methods reported to perform well on these tasks and found that our neural net
methods provided superior performance.
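The multi-task setup this abstract describes — one network predicting activities for several assays at once — can be sketched with a shared hidden layer feeding task-specific output heads. This is a minimal illustrative forward pass, not the authors' architecture; the descriptor dimensions and assay names are invented.

```python
import math
import random

random.seed(0)

def multitask_forward(x, W_shared, heads):
    """Shared hidden layer feeds several task-specific output heads.

    x: input feature vector (list of floats)
    W_shared: weight matrix (list of rows) for the shared layer
    heads: dict mapping task name -> weight vector for that task's head
    """
    # Shared representation: tanh(W_shared @ x), learned jointly for all tasks
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_shared]
    # Each assay (task) gets its own linear head with a sigmoid activity score
    return {task: 1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(wv, h))))
            for task, wv in heads.items()}

# Toy setup: 4 molecular descriptors, 3 hidden units, 2 hypothetical assays
W_shared = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
heads = {"assay_A": [random.gauss(0, 1) for _ in range(3)],
         "assay_B": [random.gauss(0, 1) for _ in range(3)]}
preds = multitask_forward([0.2, -1.0, 0.5, 0.3], W_shared, heads)
```

Because the hidden layer is shared, gradient updates from any one assay would also shape the representation used by the others, which is where multi-task transfer comes from.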
Do Deep Convolutional Nets Really Need to be Deep and Convolutional?
Yes, they do. This paper provides the first empirical demonstration that deep
convolutional models really need to be both deep and convolutional, even when
trained with methods such as distillation that allow small or shallow models of
high accuracy to be trained. Although previous research showed that shallow
feed-forward nets sometimes can learn the complex functions previously learned
by deep nets while using the same number of parameters as the deep models they
mimic, in this paper we demonstrate that the same methods cannot be used to
train accurate models on CIFAR-10 unless the student models contain multiple
layers of convolution. Although the student models do not have to be as deep as
the teacher model they mimic, the students need multiple convolutional layers
to learn functions of accuracy comparable to the deep convolutional teacher.
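The student-mimics-teacher training mentioned here can be illustrated with one common distillation objective: cross-entropy between temperature-softened teacher and student distributions. This is a generic formulation for illustration, not necessarily the paper's exact objective (mimic training is often done by regressing on logits instead).

```python
import math

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target loss used when a student mimics a teacher model.

    The student is penalized for diverging from the teacher's
    temperature-softened output distribution.
    """
    def softmax(z):
        m = max(z)  # subtract max for numerical stability
        e = [math.exp((zi - m) / T) for zi in z]
        s = sum(e)
        return [ei / s for ei in e]
    p = softmax(teacher_logits)   # soft targets from the teacher
    q = softmax(student_logits)   # student's softened predictions
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

loss_match = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_off   = distillation_loss([-1.0, 0.5, 2.0], [2.0, 0.5, -1.0])
# matching the teacher's logits gives a strictly lower loss than disagreeing
```

The temperature T > 1 flattens both distributions so the student also learns from the teacher's relative rankings of wrong classes, not just the argmax.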
Bayesian Conditional Generative Adversarial Networks
Traditional GANs use a deterministic generator function (typically a neural
network) to transform a random noise input into a sample that the
discriminator seeks to distinguish from real data. We propose a new GAN called
Bayesian Conditional Generative Adversarial Networks (BC-GANs) that use a
random generator function to transform a deterministic input into a sample.
Our BC-GANs extend traditional GANs to a Bayesian framework, and naturally
handle unsupervised learning, supervised learning, and semi-supervised
learning problems. Experiments show that the proposed BC-GANs outperform the
state of the art.
Trading-off Accuracy and Energy of Deep Inference on Embedded Systems: A Co-Design Approach
Deep neural networks have seen tremendous success for different modalities of
data including images, videos, and speech. This success has led to their
deployment in mobile and embedded systems for real-time applications. However,
making repeated inferences using deep networks on embedded systems poses
significant challenges due to constrained resources (e.g., energy and computing
power). To address these challenges, we develop a principled co-design
approach. Building on prior work, we develop a formalism referred to as
Coarse-to-Fine Networks (C2F Nets) that allows us to employ classifiers of
varying complexity to make predictions. We propose a principled optimization
algorithm to automatically configure C2F Nets for a specified trade-off between
accuracy and energy consumption for inference. The key idea is to select a
classifier on-the-fly whose complexity is proportional to the hardness of the
input example: simple classifiers for easy inputs and complex classifiers for
hard inputs. We perform comprehensive experimental evaluation using four
different C2F Net architectures on multiple real-world image classification
tasks. Our results show that optimized C2F Net can reduce the Energy Delay
Product (EDP) by 27 to 60 percent with no loss in accuracy when compared to the
baseline solution, where all predictions are made using the most complex
classifier in C2F Net.
Comment: Published in IEEE Trans. on CAD of Integrated Circuits and Systems.
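The key idea of selecting a classifier on-the-fly by input hardness can be sketched as an early-exit cascade: run cheap classifiers first and fall through to costlier ones only when confidence is low. This is a schematic of the general coarse-to-fine pattern, with made-up stand-in classifiers, not the paper's optimization algorithm.

```python
def coarse_to_fine_predict(x, classifiers, threshold=0.9):
    """Run classifiers from cheapest to most complex; exit early when confident.

    classifiers: list of (name, fn) ordered by increasing cost,
                 where fn(x) -> (label, confidence)
    Returns (label, name_of_classifier_used).
    """
    for name, clf in classifiers[:-1]:
        label, conf = clf(x)
        if conf >= threshold:          # easy input: cheap classifier suffices
            return label, name
    name, clf = classifiers[-1]        # hard input: fall back to the full model
    label, _ = clf(x)
    return label, name

# Toy stand-ins for classifiers of increasing cost and accuracy
coarse = lambda x: ("cat", 0.95 if abs(x) > 1 else 0.6)
fine   = lambda x: ("dog", 0.99)

label_easy, used_easy = coarse_to_fine_predict(2.0, [("coarse", coarse), ("fine", fine)])
label_hard, used_hard = coarse_to_fine_predict(0.5, [("coarse", coarse), ("fine", fine)])
```

Energy savings come from the easy inputs that never reach the expensive classifier; the threshold is the knob that trades accuracy against energy.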
Mix-nets: Factored Mixtures of Gaussians in Bayesian Networks With Mixed Continuous And Discrete Variables
Recently developed techniques have made it possible to quickly learn accurate
probability density functions from data in low-dimensional continuous space. In
particular, mixtures of Gaussians can be fitted to data very quickly using an
accelerated EM algorithm that employs multiresolution kd-trees (Moore, 1999).
In this paper, we propose a kind of Bayesian network in which low-dimensional
mixtures of Gaussians over different subsets of the domain's variables are
combined into a coherent joint probability model over the entire domain. The
network is also capable of modeling complex dependencies between discrete
variables and continuous variables without requiring discretization of the
continuous variables. We present efficient heuristic algorithms for
automatically learning these networks from data, and perform comparative
experiments illustrating how well these networks model real scientific data and
synthetic data. We also briefly discuss some possible improvements to the
networks, as well as possible applications.
Comment: Appears in Proceedings of the Sixteenth Conference on Uncertainty in
Artificial Intelligence (UAI 2000).
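The building block this abstract relies on — quickly fitting a mixture of Gaussians with EM — can be shown in one dimension. This is plain EM on a two-component mixture, without the kd-tree acceleration the paper cites; the data and initialisation are illustrative.

```python
import math
import random

random.seed(1)

def norm_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, iters=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    mu = [min(data), max(data)]        # crude but effective initialisation
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            w = [pi[k] * norm_pdf(x, mu[k], var[k]) for k in range(2)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M-step: re-estimate mixing weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse
    return pi, mu, var

# Two well-separated clusters at 0 and 5
data = [random.gauss(0, 1) for _ in range(200)] + [random.gauss(5, 1) for _ in range(200)]
pi, mu, var = em_gmm_1d(data)
```

In the paper's mix-nets, such low-dimensional mixtures over subsets of variables are then stitched into one coherent joint model over the whole domain.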
Bayesian Hypernetworks
We study Bayesian hypernetworks: a framework for approximate Bayesian
inference in neural networks. A Bayesian hypernetwork h is a neural network
which learns to transform a simple noise distribution, p(ε) = N(0, I), to a
distribution q(θ) := q(h(ε)) over the parameters θ of another neural network
(the "primary network"). We train with variational inference, using an
invertible h to enable efficient estimation of the variational lower bound on
the posterior p(θ | D) via sampling. In contrast to most methods for Bayesian
deep learning, Bayesian hypernets can represent a complex multimodal
approximate posterior with correlations between parameters, while enabling
cheap iid sampling of q(θ).
In practice, Bayesian hypernets can provide a better defense against
adversarial examples than dropout, and also exhibit competitive performance on
a suite of tasks which evaluate model uncertainty, including regularization,
active learning, and anomaly detection.
Comment: David Krueger and Chin-Wei Huang contributed equally.
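The "cheap iid sampling" claim can be made concrete with a toy hypernetwork: map fresh noise ε through an invertible function h to get a new parameter vector θ for the primary network. Here h is just a fixed affine map (real Bayesian hypernets use learned normalizing flows); all dimensions and values are invented for illustration.

```python
import random

random.seed(0)

def hypernet_sample(scale, shift, dim):
    """Draw one parameter vector theta for the primary network.

    An invertible affine map theta = scale * eps + shift stands in for the
    hypernetwork h; the noise eps is drawn from the simple base distribution.
    """
    eps = [random.gauss(0.0, 1.0) for _ in range(dim)]
    return [s * e + b for s, e, b in zip(scale, eps, shift)]

def primary_forward(theta, x):
    """Primary network: a single linear unit whose weights come from h."""
    return sum(w * xi for w, xi in zip(theta, x))

# iid posterior samples are cheap: just redraw eps and map it through h
scale, shift = [0.1, 0.2, 0.3], [1.0, -1.0, 0.5]
samples = [hypernet_sample(scale, shift, 3) for _ in range(1000)]
mean_w0 = sum(s[0] for s in samples) / len(samples)   # concentrates near shift[0]
y = primary_forward(samples[0], [1.0, 2.0, 3.0])      # one posterior prediction
```

Each draw is a different primary network, so averaging `primary_forward` over many draws gives the posterior-predictive behaviour used for uncertainty estimates.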
Deep Nets: What have they ever done for Vision?
This is an opinion paper about the strengths and weaknesses of Deep Nets for
vision. They are at the center of recent progress on artificial intelligence
and are of growing importance in cognitive science and neuroscience. They have
enormous successes but also clear limitations. There is also only partial
understanding of their inner workings. It seems unlikely that Deep Nets in
their current form will be the best long-term solution either for building
general purpose intelligent machines or for understanding the mind/brain, but
it is likely that many aspects of them will remain. At present Deep Nets do
very well on specific types of visual tasks and on specific benchmarked
datasets. But Deep Nets are much less general purpose, flexible, and adaptive
than the human visual system. Moreover, methods like Deep Nets may run into
fundamental difficulties when faced with the enormous complexity of natural
images which can lead to a combinatorial explosion. To illustrate our main
points, while keeping the references small, this paper is slightly biased
towards work from our group.
Comment: Major update and reorganization. 16 pages, 10 figures.
Towards Interrogating Discriminative Machine Learning Models
It is oftentimes impossible to understand how machine learning models reach a
decision. While recent research has proposed various technical approaches to
provide some clues as to how a learning model makes individual decisions, they
cannot provide users with the ability to inspect a learning model as a complete
entity. In this work, we propose a new technical approach that augments a
Bayesian regression mixture model with multiple elastic nets. Using the
enhanced mixture model, we extract explanations for a target model through
global approximation. To demonstrate the utility of our approach, we evaluate
it on different learning models covering the tasks of text mining and image
recognition. Our results indicate that the proposed approach not only
outperforms the state-of-the-art technique in explaining individual decisions
but also provides users with an ability to discover the vulnerabilities of a
learning model.
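The global-approximation idea — fit an interpretable surrogate to a black-box model's outputs and read the explanation off the surrogate — can be shown in miniature. This is a drastically simplified stand-in for the paper's Bayesian regression mixture with elastic nets: a single ridge-penalized linear model with one feature; the black-box model is invented.

```python
def global_surrogate(black_box, xs, lam=0.1):
    """Fit a regularized linear surrogate to a black-box model's predictions.

    Returns the surrogate weight w for the model y ~ w * x, fitted by
    closed-form ridge regression on the black box's outputs.
    """
    ys = [black_box(x) for x in xs]
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + lam   # lam is the ridge penalty
    return num / den

# Hypothetical black box; the surrogate weight exposes its global trend
model = lambda x: 3.0 * x + 0.1 * (x ** 3)
w = global_surrogate(model, [i / 10 for i in range(-10, 11)])
# w lands close to 3: the dominant linear behaviour of the black box
```

Inspecting w explains the model as a whole, which is the sense in which a global surrogate goes beyond per-decision explanation techniques.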
An Experiment on Using Bayesian Networks for Process Mining
Process mining is a technique that performs an automatic analysis of business
processes from a log of events with the promise of understanding how processes
are executed in an organisation.
Several models have been proposed to address this problem; however, here we
propose a different approach to deal with uncertainty. By uncertainty, we mean
estimating the probability of some sequence of tasks occurring in a business
process, given that only a subset of tasks may be observable.
In this sense, this work proposes a new approach to perform process mining
using Bayesian Networks. These structures can take into account the probability
of a task being present or absent in the business process. Moreover, Bayesian
Networks are able to automatically learn these probabilities through mechanisms
such as the maximum likelihood estimate and EM clustering.
Experiments made over a Loan Application Case study suggest that Bayesian
Networks are adequate structures for process mining and enable a deep analysis
of the business process model that can be used to answer queries about that
process.
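The maximum-likelihood learning step mentioned above can be sketched directly: count task-to-task transitions in the event log and normalize to get conditional probabilities. The trace data below is a made-up miniature of a loan-application log, not the case study's actual events.

```python
from collections import Counter, defaultdict

def learn_transition_cpt(traces):
    """Maximum-likelihood table P(next task | current task) from an event log.

    traces: list of task-name sequences, one per process instance.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        for cur, nxt in zip(trace, trace[1:]):
            counts[cur][nxt] += 1
    # Normalize counts into conditional probabilities
    return {cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
            for cur, nxts in counts.items()}

# Hypothetical loan-application traces (task names are illustrative)
log = [["submit", "check", "approve"],
       ["submit", "check", "reject"],
       ["submit", "check", "approve"]]
cpt = learn_transition_cpt(log)
```

Such tables are the conditional probability distributions a Bayesian network needs, and they support queries like "how likely is approval given that a check was observed?" even when other tasks in the trace are unobserved.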