Learning Bayesian Nets that Perform Well
A Bayesian net (BN) is more than a succinct way to encode a probabilistic
distribution; it also corresponds to a function used to answer queries. A BN
can therefore be evaluated by the accuracy of the answers it returns. Many
algorithms for learning BNs, however, attempt to optimize another criterion
(usually likelihood, possibly augmented with a regularizing term), which is
independent of the distribution of queries that are posed. This paper takes the
"performance criteria" seriously, and considers the challenge of computing the
BN whose performance - read "accuracy over the distribution of queries" - is
optimal. We show that many aspects of this learning task are more difficult
than the corresponding subtasks in the standard model.
Comment: Appears in Proceedings of the Thirteenth Conference on Uncertainty in
Artificial Intelligence (UAI 1997).
Multi-task Neural Networks for QSAR Predictions
Although artificial neural networks have occasionally been used for
Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) studies in
the past, the literature has of late been dominated by other machine learning
techniques such as random forests. However, a variety of new neural net
techniques along with successful applications in other domains have renewed
interest in network approaches. In this work, inspired by the winning team's
use of neural networks in a recent QSAR competition, we used an artificial
neural network to learn a function that predicts activities of compounds for
multiple assays at the same time. We conducted experiments leveraging recent
methods for dealing with overfitting in neural networks as well as other tricks
from the neural networks literature. We compared our methods to alternative
methods reported to perform well on these tasks and found that our neural net
methods provided superior performance.
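The multi-task setup this abstract describes — one network predicting activities for several assays at once — can be sketched with a shared hidden layer feeding task-specific output heads. This is a minimal illustrative forward pass, not the authors' architecture; the descriptor dimensions and assay names are invented.

```python
import math
import random

random.seed(0)

def multitask_forward(x, W_shared, heads):
    """Shared hidden layer feeds several task-specific output heads.

    x: input feature vector (list of floats)
    W_shared: weight matrix (list of rows) for the shared layer
    heads: dict mapping task name -> weight vector for that task's head
    """
    # Shared representation: tanh(W_shared @ x), learned jointly for all tasks
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_shared]
    # Each assay (task) gets its own linear head with a sigmoid activity score
    return {task: 1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(wv, h))))
            for task, wv in heads.items()}

# Toy setup: 4 molecular descriptors, 3 hidden units, 2 hypothetical assays
W_shared = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
heads = {"assay_A": [random.gauss(0, 1) for _ in range(3)],
         "assay_B": [random.gauss(0, 1) for _ in range(3)]}
preds = multitask_forward([0.2, -1.0, 0.5, 0.3], W_shared, heads)
```

Because the hidden layer is shared, gradient updates from any one assay would also shape the representation used by the others, which is where multi-task transfer comes from.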
Do Deep Convolutional Nets Really Need to be Deep and Convolutional?
Yes, they do. This paper provides the first empirical demonstration that deep
convolutional models really need to be both deep and convolutional, even when
trained with methods such as distillation that allow small or shallow models of
high accuracy to be trained. Although previous research showed that shallow
feed-forward nets sometimes can learn the complex functions previously learned
by deep nets while using the same number of parameters as the deep models they
mimic, in this paper we demonstrate that the same methods cannot be used to
train accurate models on CIFAR-10 unless the student models contain multiple
layers of convolution. Although the student models do not have to be as deep as
the teacher model they mimic, the students need multiple convolutional layers
to learn functions of accuracy comparable to the deep convolutional teacher.
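The student-mimics-teacher training mentioned here can be illustrated with one common distillation objective: cross-entropy between temperature-softened teacher and student distributions. This is a generic formulation for illustration, not necessarily the paper's exact objective (mimic training is often done by regressing on logits instead).

```python
import math

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target loss used when a student mimics a teacher model.

    The student is penalized for diverging from the teacher's
    temperature-softened output distribution.
    """
    def softmax(z):
        m = max(z)  # subtract max for numerical stability
        e = [math.exp((zi - m) / T) for zi in z]
        s = sum(e)
        return [ei / s for ei in e]
    p = softmax(teacher_logits)   # soft targets from the teacher
    q = softmax(student_logits)   # student's softened predictions
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

loss_match = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_off   = distillation_loss([-1.0, 0.5, 2.0], [2.0, 0.5, -1.0])
# matching the teacher's logits gives a strictly lower loss than disagreeing
```

The temperature T > 1 flattens both distributions so the student also learns from the teacher's relative rankings of wrong classes, not just the argmax.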
Bayesian Conditional Generative Adversarial Networks
Traditional GANs use a deterministic generator function (typically a neural
network) to transform a random noise input into a sample that the
discriminator seeks to distinguish from real data. We propose a new GAN called
Bayesian Conditional Generative Adversarial Networks (BC-GANs) that use a
random generator function to transform a deterministic input into a sample.
Our BC-GANs extend traditional GANs to a Bayesian framework, and naturally
handle unsupervised learning, supervised learning, and semi-supervised
learning problems. Experiments show that the proposed BC-GANs outperform the
state of the art.
Trading-off Accuracy and Energy of Deep Inference on Embedded Systems: A Co-Design Approach
Deep neural networks have seen tremendous success for different modalities of
data including images, videos, and speech. This success has led to their
deployment in mobile and embedded systems for real-time applications. However,
making repeated inferences using deep networks on embedded systems poses
significant challenges due to constrained resources (e.g., energy and computing
power). To address these challenges, we develop a principled co-design
approach. Building on prior work, we develop a formalism referred to as
Coarse-to-Fine Networks (C2F Nets) that allows us to employ classifiers of
varying complexity to make predictions. We propose a principled optimization
algorithm to automatically configure C2F Nets for a specified trade-off between
accuracy and energy consumption for inference. The key idea is to select a
classifier on-the-fly whose complexity is proportional to the hardness of the
input example: simple classifiers for easy inputs and complex classifiers for
hard inputs. We perform comprehensive experimental evaluation using four
different C2F Net architectures on multiple real-world image classification
tasks. Our results show that optimized C2F Net can reduce the Energy Delay
Product (EDP) by 27 to 60 percent with no loss in accuracy when compared to the
baseline solution, where all predictions are made using the most complex
classifier in C2F Net.
Comment: Published in IEEE Trans. on CAD of Integrated Circuits and Systems.
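The key idea of selecting a classifier on-the-fly by input hardness can be sketched as an early-exit cascade: run cheap classifiers first and fall through to costlier ones only when confidence is low. This is a schematic of the general coarse-to-fine pattern, with made-up stand-in classifiers, not the paper's optimization algorithm.

```python
def coarse_to_fine_predict(x, classifiers, threshold=0.9):
    """Run classifiers from cheapest to most complex; exit early when confident.

    classifiers: list of (name, fn) ordered by increasing cost,
                 where fn(x) -> (label, confidence)
    Returns (label, name_of_classifier_used).
    """
    for name, clf in classifiers[:-1]:
        label, conf = clf(x)
        if conf >= threshold:          # easy input: cheap classifier suffices
            return label, name
    name, clf = classifiers[-1]        # hard input: fall back to the full model
    label, _ = clf(x)
    return label, name

# Toy stand-ins for classifiers of increasing cost and accuracy
coarse = lambda x: ("cat", 0.95 if abs(x) > 1 else 0.6)
fine   = lambda x: ("dog", 0.99)

label_easy, used_easy = coarse_to_fine_predict(2.0, [("coarse", coarse), ("fine", fine)])
label_hard, used_hard = coarse_to_fine_predict(0.5, [("coarse", coarse), ("fine", fine)])
```

Energy savings come from the easy inputs that never reach the expensive classifier; the threshold is the knob that trades accuracy against energy.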
Mix-nets: Factored Mixtures of Gaussians in Bayesian Networks With Mixed Continuous And Discrete Variables
Recently developed techniques have made it possible to quickly learn accurate
probability density functions from data in low-dimensional continuous space. In
particular, mixtures of Gaussians can be fitted to data very quickly using an
accelerated EM algorithm that employs multiresolution kd-trees (Moore, 1999).
In this paper, we propose a kind of Bayesian network in which low-dimensional
mixtures of Gaussians over different subsets of the domain's variables are
combined into a coherent joint probability model over the entire domain. The
network is also capable of modeling complex dependencies between discrete
variables and continuous variables without requiring discretization of the
continuous variables. We present efficient heuristic algorithms for
automatically learning these networks from data, and perform comparative
experiments illustrating how well these networks model real scientific data and
synthetic data. We also briefly discuss some possible improvements to the
networks, as well as possible applications.
Comment: Appears in Proceedings of the Sixteenth Conference on Uncertainty in
Artificial Intelligence (UAI 2000).
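The building block this abstract relies on — quickly fitting a mixture of Gaussians with EM — can be shown in one dimension. This is plain EM on a two-component mixture, without the kd-tree acceleration the paper cites; the data and initialisation are illustrative.

```python
import math
import random

random.seed(1)

def norm_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, iters=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    mu = [min(data), max(data)]        # crude but effective initialisation
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            w = [pi[k] * norm_pdf(x, mu[k], var[k]) for k in range(2)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M-step: re-estimate mixing weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse
    return pi, mu, var

# Two well-separated clusters at 0 and 5
data = [random.gauss(0, 1) for _ in range(200)] + [random.gauss(5, 1) for _ in range(200)]
pi, mu, var = em_gmm_1d(data)
```

In the paper's mix-nets, such low-dimensional mixtures over subsets of variables are then stitched into one coherent joint model over the whole domain.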
Bayesian Hypernetworks
We study Bayesian hypernetworks: a framework for approximate Bayesian
inference in neural networks. A Bayesian hypernetwork h is a neural network
which learns to transform a simple noise distribution, p(ε) = N(0, I), to a
distribution q(θ) := q(h(ε)) over the parameters θ of another neural network
(the "primary network"). We train with variational inference, using an
invertible h to enable efficient estimation of the variational lower bound on
the posterior p(θ | D) via sampling. In contrast to most methods for Bayesian
deep learning, Bayesian hypernets can represent a complex multimodal
approximate posterior with correlations between parameters, while enabling
cheap iid sampling of q(θ).
In practice, Bayesian hypernets can provide a better defense against
adversarial examples than dropout, and also exhibit competitive performance on
a suite of tasks which evaluate model uncertainty, including regularization,
active learning, and anomaly detection.
Comment: David Krueger and Chin-Wei Huang contributed equally.
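The "cheap iid sampling" claim can be made concrete with a toy hypernetwork: map fresh noise ε through an invertible function h to get a new parameter vector θ for the primary network. Here h is just a fixed affine map (real Bayesian hypernets use learned normalizing flows); all dimensions and values are invented for illustration.

```python
import random

random.seed(0)

def hypernet_sample(scale, shift, dim):
    """Draw one parameter vector theta for the primary network.

    An invertible affine map theta = scale * eps + shift stands in for the
    hypernetwork h; the noise eps is drawn from the simple base distribution.
    """
    eps = [random.gauss(0.0, 1.0) for _ in range(dim)]
    return [s * e + b for s, e, b in zip(scale, eps, shift)]

def primary_forward(theta, x):
    """Primary network: a single linear unit whose weights come from h."""
    return sum(w * xi for w, xi in zip(theta, x))

# iid posterior samples are cheap: just redraw eps and map it through h
scale, shift = [0.1, 0.2, 0.3], [1.0, -1.0, 0.5]
samples = [hypernet_sample(scale, shift, 3) for _ in range(1000)]
mean_w0 = sum(s[0] for s in samples) / len(samples)   # concentrates near shift[0]
y = primary_forward(samples[0], [1.0, 2.0, 3.0])      # one posterior prediction
```

Each draw is a different primary network, so averaging `primary_forward` over many draws gives the posterior-predictive behaviour used for uncertainty estimates.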
Deep Nets: What have they ever done for Vision?
This is an opinion paper about the strengths and weaknesses of Deep Nets for
vision. They are at the center of recent progress on artificial intelligence
and are of growing importance in cognitive science and neuroscience. They have
enormous successes but also clear limitations. There is also only partial
understanding of their inner workings. It seems unlikely that Deep Nets in
their current form will be the best long-term solution either for building
general purpose intelligent machines or for understanding the mind/brain, but
it is likely that many aspects of them will remain. At present Deep Nets do
very well on specific types of visual tasks and on specific benchmarked
datasets. But Deep Nets are much less general purpose, flexible, and adaptive
than the human visual system. Moreover, methods like Deep Nets may run into
fundamental difficulties when faced with the enormous complexity of natural
images which can lead to a combinatorial explosion. To illustrate our main
points, while keeping the references small, this paper is slightly biased
towards work from our group.
Comment: Major update and reorganization. 16 pages, 10 figures.
Towards Interrogating Discriminative Machine Learning Models
It is oftentimes impossible to understand how machine learning models reach a
decision. While recent research has proposed various technical approaches to
provide some clues as to how a learning model makes individual decisions, they
cannot provide users with the ability to inspect a learning model as a complete
entity. In this work, we propose a new technical approach that augments a
Bayesian regression mixture model with multiple elastic nets. Using the
enhanced mixture model, we extract explanations for a target model through
global approximation. To demonstrate the utility of our approach, we evaluate
it on different learning models covering the tasks of text mining and image
recognition. Our results indicate that the proposed approach not only
outperforms the state-of-the-art technique in explaining individual decisions
but also provides users with an ability to discover the vulnerabilities of a
learning model.
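The global-approximation idea — fit an interpretable surrogate to a black-box model's outputs and read the explanation off the surrogate — can be shown in miniature. This is a drastically simplified stand-in for the paper's Bayesian regression mixture with elastic nets: a single ridge-penalized linear model with one feature; the black-box model is invented.

```python
def global_surrogate(black_box, xs, lam=0.1):
    """Fit a regularized linear surrogate to a black-box model's predictions.

    Returns the surrogate weight w for the model y ~ w * x, fitted by
    closed-form ridge regression on the black box's outputs.
    """
    ys = [black_box(x) for x in xs]
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + lam   # lam is the ridge penalty
    return num / den

# Hypothetical black box; the surrogate weight exposes its global trend
model = lambda x: 3.0 * x + 0.1 * (x ** 3)
w = global_surrogate(model, [i / 10 for i in range(-10, 11)])
# w lands close to 3: the dominant linear behaviour of the black box
```

Inspecting w explains the model as a whole, which is the sense in which a global surrogate goes beyond per-decision explanation techniques.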
An Experiment on Using Bayesian Networks for Process Mining
Process mining is a technique that performs an automatic analysis of business
processes from a log of events with the promise of understanding how processes
are executed in an organisation.
Several models have been proposed to address this problem; however, here we
propose a different approach to deal with uncertainty. By uncertainty, we mean
estimating the probability of some sequence of tasks occurring in a business
process, given that only a subset of tasks may be observable.
In this sense, this work proposes a new approach to perform process mining
using Bayesian Networks. These structures can take into account the probability
of a task being present or absent in the business process. Moreover, Bayesian
Networks are able to automatically learn these probabilities through mechanisms
such as the maximum likelihood estimate and EM clustering.
Experiments made over a Loan Application Case study suggest that Bayesian
Networks are adequate structures for process mining and enable a deep analysis
of the business process model that can be used to answer queries about that
process.
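The maximum-likelihood learning step mentioned above can be sketched directly: count task-to-task transitions in the event log and normalize to get conditional probabilities. The trace data below is a made-up miniature of a loan-application log, not the case study's actual events.

```python
from collections import Counter, defaultdict

def learn_transition_cpt(traces):
    """Maximum-likelihood table P(next task | current task) from an event log.

    traces: list of task-name sequences, one per process instance.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        for cur, nxt in zip(trace, trace[1:]):
            counts[cur][nxt] += 1
    # Normalize counts into conditional probabilities
    return {cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
            for cur, nxts in counts.items()}

# Hypothetical loan-application traces (task names are illustrative)
log = [["submit", "check", "approve"],
       ["submit", "check", "reject"],
       ["submit", "check", "approve"]]
cpt = learn_transition_cpt(log)
```

Such tables are the conditional probability distributions a Bayesian network needs, and they support queries like "how likely is approval given that a check was observed?" even when other tasks in the trace are unobserved.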