Generative Mixture of Networks
A generative model based on training deep architectures is proposed. The
model consists of K networks that are trained together to learn the underlying
distribution of a given data set. The process starts by dividing the input
data into K clusters and feeding each cluster into a separate network. After
a few iterations of training the networks separately, we use an EM-like
algorithm to train the networks jointly and to update the clusters of the
data. We call this model Mixture of Networks. The proposed model is a
platform that can be used with any deep structure and trained with any
conventional objective function for distribution modeling. As the components
of the model are neural networks, it is well suited to characterizing
complicated data distributions as well as to clustering data. We apply the
algorithm to the MNIST hand-written digits and Yale face datasets. We also
demonstrate the clustering ability of the model using some real-world and toy
examples.
Comment: 9 pages
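
The training loop described above is concrete enough to sketch. Below is a
minimal, self-contained illustration in PyTorch (not the authors' code): the
toy autoencoders standing in for the distribution-modeling objective, the
synthetic data, and all hyperparameters are hypothetical.

import torch
import torch.nn as nn

K = 3                       # number of component networks
data = torch.randn(600, 2)  # toy data standing in for MNIST/Yale

# One small autoencoder per component; reconstruction error is a stand-in
# for the conventional objective used for distribution modeling.
nets = [nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 2))
        for _ in range(K)]
opts = [torch.optim.Adam(n.parameters(), lr=1e-2) for n in nets]

# Initial hard clustering (random here; the paper starts from a data split).
assign = torch.randint(0, K, (data.size(0),))

for em_round in range(10):
    # M-step: train each network on the points currently assigned to it.
    for k in range(K):
        x = data[assign == k]
        if x.numel() == 0:
            continue
        for _ in range(5):
            opts[k].zero_grad()
            loss = ((nets[k](x) - x) ** 2).mean()
            loss.backward()
            opts[k].step()
    # E-step: reassign every point to the network that models it best.
    with torch.no_grad():
        errs = torch.stack([((n(data) - data) ** 2).mean(dim=1)
                            for n in nets])   # K x N error matrix
        assign = errs.argmin(dim=0)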
On the Relationship between Sum-Product Networks and Bayesian Networks
In this paper, we establish some theoretical connections between Sum-Product
Networks (SPNs) and Bayesian Networks (BNs). We prove that every SPN can be
converted into a BN in linear time and space in terms of the network size. The
key insight is to use Algebraic Decision Diagrams (ADDs) to compactly represent
the local conditional probability distributions at each node in the resulting
BN by exploiting context-specific independence (CSI). The generated BN has a
simple directed bipartite graphical structure. We show that by applying the
Variable Elimination algorithm (VE) to the generated BN with ADD
representations, we can recover the original SPN, which can thus be viewed as
a history record, or cache, of the VE inference process. To help state the
proof clearly, we introduce the notion of {\em normal} SPN and present a
theoretical analysis of the consistency and decomposability properties. We
conclude the paper with some discussion of the implications of the proof and
establish a connection between the depth of an SPN and a lower bound of the
tree-width of its corresponding BN.
Comment: Full version of the same paper, to appear at ICML-2015
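
As a companion to the abstract, here is a hedged sketch of a tiny normal SPN
in plain Python; the class names and the two-variable mixture are
illustrative, not the paper's construction. Its bottom-up evaluation computes
the quantity that, per the paper, Variable Elimination on the generated
bipartite BN with ADD representations recovers.

class Leaf:
    def __init__(self, var, probs):          # distribution over one variable
        self.var, self.p = var, probs
    def eval(self, assignment):
        return self.p[assignment[self.var]]

class Sum:
    def __init__(self, children, weights):   # children share scope (completeness)
        self.children, self.w = children, weights
    def eval(self, assignment):
        return sum(w * c.eval(assignment)
                   for w, c in zip(self.w, self.children))

class Product:
    def __init__(self, children):            # disjoint scopes (decomposability)
        self.children = children
    def eval(self, assignment):
        out = 1.0
        for c in self.children:
            out *= c.eval(assignment)
        return out

# P(X1, X2) as a mixture of two fully factorized distributions.
spn = Sum([Product([Leaf('X1', [0.8, 0.2]), Leaf('X2', [0.3, 0.7])]),
           Product([Leaf('X1', [0.1, 0.9]), Leaf('X2', [0.6, 0.4])])],
          [0.4, 0.6])
print(spn.eval({'X1': 0, 'X2': 1}))          # probability of one joint state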
Isomorph-Free Branch and Bound Search for Finite State Controllers
The recent proliferation of smart-phones and other wearable devices has led
to a surge of new mobile applications. Partially observable Markov decision
processes provide a natural framework to design applications that
continuously make decisions based on noisy sensor measurements. However,
given the limited battery life, there is a need to minimize the amount of
online computation. This can be achieved by compiling a policy into a
finite state controller since there is no need for belief monitoring or
online search. In this paper, we propose a new branch and bound technique
to search for a good controller. In contrast to many existing algorithms
for controllers, our search technique is not subject to local optima. We
also show how to reduce the amount of search by avoiding the enumeration of
isomorphic controllers and by taking advantage of suitable upper and lower
bounds. The approach is demonstrated on several benchmark problems as well
as on a smart-phone application that assists persons with Alzheimer's disease
with wayfinding.
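
The two ideas in the abstract, bounding and isomorph avoidance, can be shown
in a self-contained toy. The sketch below is a generic branch-and-bound
skeleton, not the paper's algorithm: a "controller" is reduced to a tuple of
per-node actions whose toy value ignores node ordering, so a sorted tuple
serves as the canonical form that lets the search skip isomorphic candidates.

ACTIONS = {'listen': 1.0, 'prompt': 2.5, 'guide': 4.0}  # toy per-node rewards
N_NODES = 3

def value(controller):        # toy stand-in for exact policy evaluation
    return sum(ACTIONS[a] for a in controller)

def upper_bound(partial):     # optimistic completion of a partial controller
    return value(partial) + (N_NODES - len(partial)) * max(ACTIONS.values())

best_value, best, seen = float('-inf'), None, set()
stack = [()]
while stack:
    partial = stack.pop()
    key = tuple(sorted(partial))   # identical for order-isomorphic tuples
    if key in seen:
        continue                   # skip isomorphic candidates
    seen.add(key)
    if upper_bound(partial) <= best_value:
        continue                   # bound: branch cannot beat the incumbent
    if len(partial) == N_NODES:
        best_value, best = value(partial), partial
    else:
        stack.extend(partial + (a,) for a in ACTIONS)
print(best, best_value)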
Self-Adaptive Hierarchical Sentence Model
The ability to accurately model a sentence at varying stages (e.g.,
word-phrase-sentence) plays a central role in natural language processing. As
a step towards this goal, we propose a self-adaptive hierarchical sentence
model (AdaSent). AdaSent effectively forms a hierarchy of representations from
words to phrases and then to sentences through recursive gated local
composition of adjacent segments. We design a competitive mechanism (through
gating networks) to allow the representations of the same sentence to be
engaged in a particular learning task (e.g., classification), thereby
effectively mitigating the vanishing-gradient problem persistent in other
recursive models. Both qualitative and quantitative analyses show that AdaSent
can automatically form and select the representations suitable for the task at
hand during training, yielding superior classification performance over
competitor models on 5 benchmark data sets.
Comment: 8 pages, 7 figures, accepted as a full paper at IJCAI 2015
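
A minimal numpy sketch of the recursive gated local composition described
above; the weights, dimensions, and mean-pooled level summaries are
hypothetical stand-ins for the trained model. Each pyramid level combines
adjacent segments, with a gating network arbitrating among the left child,
the right child, and a fresh composition; a task-specific gate over the
resulting per-level summaries would then select the granularity suited to the
classification task.

import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 5                        # embedding size, sentence length
W = rng.normal(size=(d, 2 * d))    # composition weights
Wg = rng.normal(size=(3, 2 * d))   # gating-network weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

level = [rng.normal(size=d) for _ in range(T)]   # word embeddings
pyramid = [level]
while len(level) > 1:
    nxt = []
    for left, right in zip(level[:-1], level[1:]):
        pair = np.concatenate([left, right])
        composed = np.tanh(W @ pair)
        g = softmax(Wg @ pair)     # competition: left vs. right vs. composed
        nxt.append(g[0] * left + g[1] * right + g[2] * composed)
    level = nxt
    pyramid.append(level)

# One summary vector per hierarchy level (word -> phrase -> sentence).
summaries = np.stack([np.mean(np.stack(lv), axis=0) for lv in pyramid])
print(summaries.shape)             # (T, d): T levels in the pyramid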