State-space Abstraction for Anytime Evaluation of Probabilistic Networks
One important factor determining the computational complexity of evaluating a
probabilistic network is the cardinality of the state spaces of the nodes. By
varying the granularity of the state spaces, one can trade off accuracy in the
result for computational efficiency. We present an anytime procedure for
approximate evaluation of probabilistic networks based on this idea. On
application to some simple networks, the procedure exhibits a smooth
improvement in approximation quality as computation time increases. This
suggests that state-space abstraction is one more useful control parameter for
designing real-time probabilistic reasoners.

Comment: Appears in Proceedings of the Tenth Conference on Uncertainty in
Artificial Intelligence (UAI1994).
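The accuracy/efficiency trade-off described above can be sketched numerically. The toy model below is an assumption of this illustration (a single 64-state parent X and a hypothetical query P(Y=1)), not the paper's procedure: coarsening X into k super-states makes the sum over states cheaper, and finer granularity recovers the exact answer.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                                   # number of states of parent X
p_x = rng.dirichlet(np.ones(n))          # P(X)
p_y_given_x = rng.uniform(size=n)        # P(Y=1 | X=x), a toy CPT
exact = float(p_x @ p_y_given_x)         # P(Y=1) at full granularity

def abstract_query(k):
    """Evaluate P(Y=1) with X coarsened into k equal-width super-states:
    each block contributes its total mass times its block-average CPT entry."""
    bounds = np.linspace(0, n, k + 1).astype(int)
    total = 0.0
    for a, b in zip(bounds[:-1], bounds[1:]):
        mass = p_x[a:b].sum()
        total += mass * p_y_given_x[a:b].mean()
    return total

# Anytime loop: spend more time (finer state spaces), get smaller error.
for k in [1, 2, 4, 8, 16, 32, 64]:
    print(k, abs(abstract_query(k) - exact))
```

At k = 64 every super-state holds a single original state, so the abstraction is exact; smaller k trades accuracy for fewer terms in the sum.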
ZhuSuan: A Library for Bayesian Deep Learning
In this paper we introduce ZhuSuan, a Python probabilistic programming
library for Bayesian deep learning, which conjoins the complementary advantages
of Bayesian methods and deep learning. ZhuSuan is built upon TensorFlow. Unlike
existing deep learning libraries, which are mainly designed for deterministic
neural networks and supervised tasks, ZhuSuan is distinguished by its deep
roots in Bayesian inference, and thus supports various kinds of probabilistic
models, including both traditional hierarchical Bayesian models and recent deep
generative models. We use running examples to illustrate probabilistic
programming in ZhuSuan, including Bayesian logistic regression, variational
auto-encoders, deep sigmoid belief networks, and Bayesian recurrent neural
networks.

Comment: The GitHub page is at https://github.com/thu-ml/zhusua
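The first running example mentioned, Bayesian logistic regression, can be illustrated without the library itself. The sketch below uses plain NumPy with a random-walk Metropolis sampler rather than ZhuSuan's API, and the data are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 2-feature logistic regression with known weights.
true_w = np.array([1.5, -2.0])
X = rng.normal(size=(200, 2))
y = rng.uniform(size=200) < 1 / (1 + np.exp(-X @ true_w))

def log_post(w):
    """Log posterior: standard-normal prior plus Bernoulli log-likelihood."""
    logits = X @ w
    ll = np.sum(y * logits - np.logaddexp(0.0, logits))
    return ll - 0.5 * w @ w

# Random-walk Metropolis over the weight vector.
w = np.zeros(2)
samples = []
for _ in range(5000):
    prop = w + 0.1 * rng.normal(size=2)
    if np.log(rng.uniform()) < log_post(prop) - log_post(w):
        w = prop
    samples.append(w)

post_mean = np.mean(samples, axis=0)
print(post_mean)   # roughly recovers true_w given 200 observations
```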
Max-Entropy Feed-Forward Clustering Neural Network
The outputs of a non-linear feed-forward neural network are positive and
can be treated as probabilities once they are normalized to sum to one. If we
apply the entropy-based principle, the outputs for each sample can be
interpreted as that sample's distribution over the different clusters. The
entropy-based principle lets us estimate an unknown distribution subject to a
limited set of conditions. This paper defines two processes in a feed-forward
neural network: an abstraction process, which extracts the features of the
samples that serve as our limiting conditions, and a clustering process, whose
final outputs are the probability distribution over the clusters.
Incorporating the entropy-based principle into the feed-forward neural network
in this way yields a clustering method. We conducted experiments on six open
UCI datasets, comparing against several of the most popular clustering methods
as baselines, with purity as the evaluation measure. The results show that our
method outperforms all of these baselines.

Comment: This paper has been published in ICANN 201
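The normalization step and the purity measure can be made concrete. In this sketch the positive outputs and true labels are invented for illustration: rows are normalized into cluster distributions, per-sample entropies are computed, and a hard assignment is scored by purity.

```python
import numpy as np

def normalize(outputs):
    """Turn positive network outputs into per-sample cluster distributions
    by dividing each row by its sum."""
    return outputs / outputs.sum(axis=1, keepdims=True)

def entropy(p):
    """Shannon entropy of each row's cluster distribution."""
    return -(p * np.log(p)).sum(axis=1)

def purity(pred, truth):
    """Purity: credit each predicted cluster with its majority true label,
    then divide by the total number of samples."""
    total = 0
    for c in np.unique(pred):
        _, counts = np.unique(truth[pred == c], return_counts=True)
        total += counts.max()
    return total / len(truth)

# Invented positive outputs for three samples over three clusters.
outputs = np.array([[2.0, 0.1, 0.1], [0.2, 3.0, 0.2], [0.1, 0.1, 2.5]])
probs = normalize(outputs)
pred = probs.argmax(axis=1)
print(entropy(probs))                      # low entropy = confident assignment
print(purity(pred, np.array([0, 1, 1])))   # → 1.0
```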
The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies
We present the nested Chinese restaurant process (nCRP), a stochastic process
which assigns probability distributions to infinitely-deep,
infinitely-branching trees. We show how this stochastic process can be used as
a prior distribution in a Bayesian nonparametric model of document collections.
Specifically, we present an application to information retrieval in which
documents are modeled as paths down a random tree, and the preferential
attachment dynamics of the nCRP leads to clustering of documents according to
sharing of topics at multiple levels of abstraction. Given a corpus of
documents, a posterior inference algorithm finds an approximation to a
posterior distribution over trees, topics and allocations of words to levels of
the tree. We demonstrate this algorithm on collections of scientific abstracts
from several journals. This model exemplifies a recent trend in statistical
machine learning--the use of Bayesian nonparametric methods to infer
distributions on flexible data structures.
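The preferential-attachment dynamics of the nCRP can be sketched as a Chinese restaurant step applied recursively down the tree: at each level, a document follows an existing branch with probability proportional to how many earlier documents took it, or opens a new branch with probability proportional to a concentration parameter gamma. The dictionary-of-prefixes tree representation below is an assumption of this sketch.

```python
import random

def crp_choice(counts, gamma, rng):
    """Chinese restaurant step: return branch k with probability
    counts[k] / (n + gamma), or len(counts) (a new branch) with
    probability gamma / (n + gamma)."""
    n = sum(counts)
    r = rng.uniform(0, n + gamma)
    for k, c in enumerate(counts):
        if r < c:
            return k
        r -= c
    return len(counts)

def ncrp_path(tree, depth, gamma, rng):
    """Sample a root-to-leaf path; `tree` maps each path prefix (a tuple)
    to the visit counts of its children."""
    path = []
    for _ in range(depth):
        counts = tree.setdefault(tuple(path), [])
        k = crp_choice(counts, gamma, rng)
        if k == len(counts):
            counts.append(0)       # open a new branch
        counts[k] += 1
        path.append(k)
    return path

rng = random.Random(0)
tree = {}
paths = [ncrp_path(tree, depth=3, gamma=1.0, rng=rng) for _ in range(100)]
print(tree[()])   # root-level branch counts: popular branches attract more paths
```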
Simple, Distributed, and Accelerated Probabilistic Programming
We describe a simple, low-level approach for embedding probabilistic
programming in a deep learning ecosystem. In particular, we distill
probabilistic programming down to a single abstraction---the random variable.
Our lightweight implementation in TensorFlow enables numerous applications: a
model-parallel variational auto-encoder (VAE) with 2nd-generation tensor
processing units (TPUv2s); a data-parallel autoregressive model (Image
Transformer) with TPUv2s; and multi-GPU No-U-Turn Sampler (NUTS). For both a
state-of-the-art VAE on 64x64 ImageNet and Image Transformer on 256x256
CelebA-HQ, our approach achieves an optimal linear speedup from 1 to 256 TPUv2
chips. With NUTS, we see a 100x speedup on GPUs over Stan and 37x over PyMC3.

Comment: Appears in Neural Information Processing Systems, 2018. Code
available at http://bit.ly/2JpFip
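The "single abstraction" idea can be caricatured in a few lines of plain Python. This is a deliberately minimal sketch, not the actual TensorFlow implementation: a random variable is a value drawn at construction time, bundled with its distribution's log-density, so model code becomes ordinary function composition.

```python
import math
import random

rng = random.Random(0)

class RandomVariable:
    """Minimal sketch of the single abstraction: a value sampled at
    construction, carrying its distribution's log-density function."""
    def __init__(self, sample_fn, log_prob_fn):
        self.value = sample_fn()
        self._log_prob_fn = log_prob_fn

    def log_prob(self, x=None):
        """Log-density at x, defaulting to the variable's own value."""
        return self._log_prob_fn(self.value if x is None else x)

def Normal(loc, scale):
    """A Normal random variable built from the abstraction above."""
    return RandomVariable(
        lambda: rng.gauss(loc, scale),
        lambda x: (-0.5 * ((x - loc) / scale) ** 2
                   - math.log(scale) - 0.5 * math.log(2 * math.pi)))

# Model code is plain composition: a prior and a likelihood that uses it.
z = Normal(0.0, 1.0)
x = Normal(z.value, 0.5)
print(z.value, x.log_prob())
```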
From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge
In this paper we propose the construction of linguistic descriptions of
images. This is achieved through the extraction of scene description graphs
(SDGs) from visual scenes using an automatically constructed knowledge base.
SDGs are constructed using both vision and reasoning. Specifically, commonsense
reasoning is applied on (a) detections obtained from existing perception
methods on given images, (b) a "commonsense" knowledge base constructed using
natural language processing of image annotations and (c) lexical ontological
knowledge from resources such as WordNet. Amazon Mechanical Turk (AMT)-based
evaluations on the Flickr8k, Flickr30k and MS-COCO datasets show that in most
cases, sentences auto-constructed from SDGs obtained by our method give a more
relevant and thorough description of an image than a recent state-of-the-art
image-captioning approach. Our Image-Sentence Alignment Evaluation results
are also comparable to those of recent state-of-the-art approaches.
Big Learning with Bayesian Methods
Explosive growth in data and availability of cheap computing resources have
sparked increasing interest in Big learning, an emerging subfield that studies
scalable machine learning algorithms, systems, and applications with Big Data.
Bayesian methods represent one important class of statistical methods for machine
learning, with substantial recent developments on adaptive, flexible and
scalable Bayesian learning. This article provides a survey of the recent
advances in Big learning with Bayesian methods, termed Big Bayesian Learning,
including nonparametric Bayesian methods for adaptively inferring model
complexity, regularized Bayesian inference for improving the flexibility via
posterior regularization, and scalable algorithms and systems based on
stochastic subsampling and distributed computing for dealing with large-scale
applications.

Comment: 21 pages, 6 figures
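Of the scalability techniques surveyed, stochastic subsampling is the easiest to sketch: rescaling a minibatch log-likelihood gradient by N/m keeps it unbiased, as in stochastic-gradient Langevin dynamics. The toy model below (a Gaussian mean with unit-variance likelihood and standard-normal prior, synthetic data) is an illustration of that idea, not an algorithm taken from the article.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(loc=3.0, scale=1.0, size=10_000)
N = len(data)

def grad_log_post(theta, batch):
    """Minibatch gradient of the log posterior of a Gaussian mean:
    the likelihood term is rescaled by N / len(batch) so the
    subsampled gradient stays unbiased."""
    return -theta + (N / len(batch)) * np.sum(batch - theta)

# Stochastic-gradient Langevin dynamics: noisy gradient steps whose
# injected Gaussian noise variance matches the step size.
theta, eps = 0.0, 1e-5
for _ in range(2000):
    batch = rng.choice(data, size=100, replace=False)
    theta += 0.5 * eps * grad_log_post(theta, batch) + np.sqrt(eps) * rng.normal()

print(theta)   # drifts toward the posterior mean, near the data mean of 3.0
```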
AliGraph: A Comprehensive Graph Neural Network Platform
An increasing number of machine learning tasks require dealing with large
graph datasets, which capture rich and complex relationships among potentially
billions of elements. Graph Neural Networks (GNNs) have become an effective way
to address the graph learning problem: they convert the graph data into a
low-dimensional space while preserving both the structural and property
information to the maximum extent, and construct a neural network for training
and inference. However, it is challenging to provide efficient graph storage
and computation capabilities to facilitate GNN training and enable development
of new GNN algorithms. In this paper, we present a comprehensive graph neural
network system, namely AliGraph, which consists of distributed graph storage,
optimized sampling operators and runtime to efficiently support not only
existing popular GNNs but also a series of in-house developed ones for
different scenarios. The system is currently deployed at Alibaba to support a
variety of business scenarios, including product recommendation and
personalized search at Alibaba's E-Commerce platform. By conducting extensive
experiments on a real-world dataset with 492.90 million vertices, 6.82 billion
edges and rich attributes, AliGraph performs an order of magnitude faster in
terms of graph building (5 minutes vs hours reported from the state-of-the-art
PowerGraph platform). At training, AliGraph runs 40%-50% faster with its novel
caching strategy and achieves a roughly 12x speedup with the improved runtime.
In addition, our in-house developed GNN models all demonstrate statistically
significant advantages in both effectiveness and efficiency (e.g., a
4.12%-17.19% lift in F1 score).
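AliGraph's actual operators are part of a proprietary distributed system, so the sketch below only illustrates the general shape of a neighbor-sampling operator backed by a store that caches hot vertices. All names, the in-memory adjacency map, and the eviction-free cache policy are inventions of this sketch.

```python
import random
from collections import defaultdict

class GraphStore:
    """Toy neighbor-sampling operator over a vertex -> neighbor-list map,
    with a small cache for frequently requested neighbor lists."""
    def __init__(self, adj, cache_size=2):
        self.adj = adj
        self.cache = {}
        self.hits = defaultdict(int)
        self.cache_size = cache_size

    def sample_neighbors(self, v, k, rng):
        """Sample k neighbors of v with replacement, caching hot vertices."""
        self.hits[v] += 1
        if v in self.cache:
            nbrs = self.cache[v]
        else:
            nbrs = self.adj[v]
            if len(self.cache) < self.cache_size:
                self.cache[v] = nbrs   # no eviction in this sketch
        return [rng.choice(nbrs) for _ in range(k)]

adj = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
store = GraphStore(adj)
rng = random.Random(0)
batch = [store.sample_neighbors(0, k=2, rng=rng) for _ in range(5)]
print(batch)
```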
Applications of Probabilistic Programming (Master's thesis, 2015)
This thesis describes work on two applications of probabilistic programming:
the learning of probabilistic program code given specifications, in particular
program code of one-dimensional samplers; and the facilitation of sequential
Monte Carlo inference with the help of data-driven proposals. The latter is
presented with experimental results on a linear Gaussian model and a
non-parametric dependent Dirichlet process mixture of objects model for object
recognition and tracking.
In Chapter 1 we provide a brief introduction to probabilistic programming.
In Chapter 2 we present an approach to automatic discovery of samplers in the
form of probabilistic programs. We formulate a Bayesian approach to this
problem by specifying a grammar-based prior over probabilistic program code. We
use an approximate Bayesian computation method to learn the programs, whose
executions generate samples that statistically match observed data or
analytical characteristics of distributions of interest. In our experiments we
leverage different probabilistic programming systems to perform Markov chain
Monte Carlo sampling over the space of programs. Experimental results have
demonstrated that, using the proposed methodology, we can learn approximate and
even some exact samplers. Finally, we show that our results are competitive
with regard to genetic programming methods.
In Chapter 3, we describe a way to facilitate sequential Monte Carlo
inference in probabilistic programming using data-driven proposals. In
particular, we develop a distance-based proposal for the non-parametric
dependent Dirichlet process mixture of objects model. We implement this
approach in the probabilistic programming system Anglican, and show that for
that model data-driven proposals provide significant performance improvements.
We also explore the possibility of using neural networks to improve data-driven
proposals.

Comment: Supervisor: Frank Wood. The thesis was prepared in the Department of
Engineering Science at the University of Oxford.
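The approximate Bayesian computation idea in Chapter 2 can be caricatured in a few lines: propose candidate sampler "programs", run them, and keep those whose sample statistics best match the observed data. The two-kind grammar and random parameters below are a stand-in invented for this sketch, not the thesis's grammar-based prior or its MCMC over program space.

```python
import random
import statistics

rng = random.Random(3)
observed = [rng.gauss(2.0, 1.0) for _ in range(500)]
obs_stats = (statistics.mean(observed), statistics.stdev(observed))

def make_program():
    """Draw a candidate sampler from a tiny, invented grammar:
    a Gaussian or a uniform with random parameters."""
    kind = rng.choice(["gauss", "uniform"])
    a, b = rng.uniform(-5, 5), rng.uniform(0.1, 3)
    if kind == "gauss":
        return kind, (a, b), lambda: rng.gauss(a, b)
    return kind, (a, b), lambda: rng.uniform(a - b, a + b)

def distance(prog):
    """ABC discrepancy: compare summary statistics of the program's
    samples against those of the observed data."""
    xs = [prog() for _ in range(500)]
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return abs(m - obs_stats[0]) + abs(s - obs_stats[1])

# Rejection-style ABC: keep the best-matching candidate program.
best = min((make_program() for _ in range(300)), key=lambda p: distance(p[2]))
print(best[0], best[1])   # kind and parameters of the best-matching program
```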
Some Experiments with Real-Time Decision Algorithms
Real-time decision algorithms are a class of incremental resource-bounded
[Horvitz, 89] or anytime [Dean, 93] algorithms for evaluating influence
diagrams. We present a test domain for real-time decision algorithms, and the
results of experiments with several such algorithms in this domain. The
results demonstrate high performance for two algorithms: a decision-evaluation
variant of Incremental Probabilistic Inference [D'Ambrosio, 93] and
PK-reduced, a variant of an algorithm suggested by Goldszmidt [Goldszmidt,
95]. We discuss the implications of these experimental results and explore the
broader applicability of these algorithms.

Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in
Artificial Intelligence (UAI1996).