Collapsed Variational Bayes Inference of Infinite Relational Model
The Infinite Relational Model (IRM) is a probabilistic model for relational
data clustering that partitions objects into clusters based on observed
relationships. This paper presents Averaged CVB (ACVB) solutions for IRM,
convergence-guaranteed and practically useful fast Collapsed Variational Bayes
(CVB) inferences. We first derive ordinary CVB and CVB0 for IRM based on the
lower bound maximization. CVB solutions yield deterministic iterative
procedures for inferring IRM given the truncated number of clusters. Our
proposal includes CVB0 updates of hyperparameters including the concentration
parameter of the Dirichlet Process, which has not been studied in the
literature. To make the CVB more practically useful, we further study the CVB
inference in two aspects. First, we study the convergence issues and develop a
convergence-guaranteed algorithm for any CVB-based inferences called ACVB,
which enables automatic convergence detection and frees non-expert
practitioners from difficult and costly manual monitoring of inference
processes. Second, we present a few techniques for speeding up IRM inferences.
In particular, we describe the linear-time inference of CVB0, which allows the
IRM to be applied to larger relational data. In experiments, the ACVB solutions
of IRM showed comparable or better performance than existing inference methods,
and provide deterministic, faster, and easier convergence detection.
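The abstract above refers to a truncated number of clusters and to the concentration parameter of the Dirichlet Process. As a rough illustration only, and not the paper's CVB or ACVB procedure, the following sketch draws cluster weights from a truncated stick-breaking construction. The function name `truncated_stick_breaking` and its parameters are hypothetical and chosen for this example.

```python
import random


def truncated_stick_breaking(alpha, truncation, rng):
    """Draw cluster weights from a Dirichlet Process truncated at `truncation` clusters.

    `alpha` is the concentration parameter; larger values spread the mass
    over more clusters. Illustrative helper, not taken from the paper.
    """
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)  # share broken off the remaining stick
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # the last cluster absorbs the leftover mass
    return weights


rng = random.Random(0)
weights = truncated_stick_breaking(alpha=1.0, truncation=10, rng=rng)
```

The weights are non-negative and sum to one, so they can serve as a prior over a fixed number of candidate clusters.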
Variational Particle Approximations
Approximate inference in high-dimensional, discrete probabilistic models is a
central problem in computational statistics and machine learning. This paper
describes discrete particle variational inference (DPVI), a new approach that
combines key strengths of Monte Carlo, variational and search-based techniques.
DPVI is based on a novel family of particle-based variational approximations
that can be fit using simple, fast, deterministic search techniques. Like Monte
Carlo, DPVI can handle multiple modes, and yields exact results in a
well-defined limit. Like unstructured mean-field, DPVI is based on optimizing a
lower bound on the partition function; when this quantity is not of intrinsic
interest, it facilitates convergence assessment and debugging. Like both Monte
Carlo and combinatorial search, DPVI can take advantage of factorization,
sequential structure, and custom search operators. This paper defines the DPVI
particle-based approximation family and partition function lower bounds, along
with the sequential DPVI and local DPVI algorithm templates for optimizing
them. DPVI is illustrated and evaluated via experiments on lattice Markov
Random Fields, nonparametric Bayesian mixtures and block-models, and parametric
as well as non-parametric hidden Markov models. Results include applications to
real-world spike-sorting and relational modeling problems, and show that DPVI
can offer appealing time/accuracy trade-offs as compared to multiple
alternatives. Comment: First two authors contributed equally to this work.
Nonparametric Relational Topic Models through Dependent Gamma Processes
Traditional Relational Topic Models provide a way to discover the hidden
topics from a document network. Many theoretical and practical tasks, such as
dimensionality reduction, document clustering, and link prediction, benefit from this
revealed knowledge. However, existing relational topic models are based on an
assumption that the number of hidden topics is known in advance, and this is
impractical in many real-world applications. Therefore, in order to relax this
assumption, we propose a nonparametric relational topic model in this paper.
Instead of using fixed-dimensional probability distributions in its generative
model, we use stochastic processes. Specifically, a gamma process is assigned
to each document, which represents the topic interest of this document.
Although this method provides an elegant solution, it brings additional
challenges when mathematically modeling the inherent network structure of
a typical document network, i.e., two spatially closer documents tend to have
more similar topics. Furthermore, we require that the topics are shared by all
the documents. In order to resolve these challenges, we use a subsampling
strategy to assign each document a different gamma process from the global
gamma process, and the subsampling probabilities of documents are assigned with
a Markov Random Field constraint that inherits the document network structure.
Through the designed posterior inference algorithm, we can discover the hidden
topics and their number simultaneously. Experimental results on both synthetic
and real-world network datasets demonstrate the capabilities of learning the
hidden topics and, more importantly, the number of topics.
Learning Hidden Structures with Relational Models by Adequately Involving Rich Information in A Network
Effectively modelling hidden structures in a network is very practical but
theoretically challenging. Existing relational models only involve very limited
information, namely the binary directional link data, embedded in a network to
learn hidden networking structures. Other rich and meaningful
information (e.g., various attributes of entities and more granular information
than binary elements such as "like" or "dislike") is missed, although it plays a critical
role in forming and understanding relations in a network. In this work, we
propose an informative relational model (InfRM) framework to adequately involve
rich information and its granularity in a network, including metadata
information about each entity and various forms of link data. Firstly, an
effective metadata information incorporation method is employed on the prior
information from relational models MMSB and LFRM. This is to encourage the
entities with similar metadata information to have similar hidden structures.
Secondly, we propose various solutions to cater for alternative forms of link
data. Substantial efforts have been made towards modelling appropriateness and
efficiency, for example, using conjugate priors. We evaluate our framework and
its inference algorithms on different datasets; the results show the generality and
effectiveness of our models in capturing implicit structures in networks.
Machine learning for activity recognition
This paper surveys the activity recognition task from a machine learning perspective. I give a definition of this problem, and I classify different activity recognition problems into two categories. I show that activities can be hierarchical, and based on such hierarchies I synthesize a language to describe activities. I give a general set of criteria to evaluate activity recognition methods. I summarize some off-the-shelf machine learning methods for activity recognition and evaluate them based on this set of criteria. Finally, I discuss some methods that I believe can improve activity recognition performance.
Max-Margin Nonparametric Latent Feature Models for Link Prediction
Link prediction is a fundamental task in statistical network analysis. Recent
advances have been made on learning flexible nonparametric Bayesian latent
feature models for link prediction. In this paper, we present a max-margin
learning method for such nonparametric latent feature relational models. Our
approach attempts to unite the ideas of max-margin learning and Bayesian
nonparametrics to discover discriminative latent features for link prediction.
It inherits the advances of nonparametric Bayesian methods to infer the unknown
latent social dimension, while for discriminative link prediction, it adopts
the max-margin learning principle by minimizing a hinge-loss using the linear
expectation operator, without dealing with a highly nonlinear link likelihood
function. For posterior inference, we develop an efficient stochastic
variational inference algorithm under a truncated mean-field assumption. Our
methods can scale up to large-scale real networks with millions of entities and
tens of millions of positive links. We also provide a full Bayesian
formulation, which can avoid tuning regularization hyper-parameters.
Experimental results on a diverse range of real datasets demonstrate the
benefits inherited from max-margin learning and Bayesian nonparametric
inference. Comment: 14 pages, 8 figures
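The abstract above mentions minimizing a hinge loss for discriminative link prediction. As a minimal sketch of the hinge loss itself, and not the paper's model or its inference algorithm, the example below scores a few candidate links. The function name `hinge_loss`, the `margin` parameter, and the sample labels and scores are assumptions made for illustration.

```python
def hinge_loss(labels, scores, margin=1.0):
    """Average hinge loss over candidate links.

    `labels` are +1 (link present) or -1 (link absent); `scores` are
    real-valued predictions. Illustrative helper, not from the paper.
    """
    losses = [max(0.0, margin - y * s) for y, s in zip(labels, scores)]
    return sum(losses) / len(losses)


# Hypothetical data: two observed links and two absent links.
labels = [1, -1, 1, -1]
scores = [2.0, -0.5, 0.3, 1.0]
loss = hinge_loss(labels, scores)
```

A pair contributes zero loss only when its score lies on the correct side of the decision boundary by at least the margin; here only the first pair does.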
Big Learning with Bayesian Methods
Explosive growth in data and availability of cheap computing resources have
sparked increasing interest in Big learning, an emerging subfield that studies
scalable machine learning algorithms, systems, and applications with Big Data.
Bayesian methods represent one important class of statistical methods for machine
learning, with substantial recent developments on adaptive, flexible and
scalable Bayesian learning. This article provides a survey of the recent
advances in Big learning with Bayesian methods, termed Big Bayesian Learning,
including nonparametric Bayesian methods for adaptively inferring model
complexity, regularized Bayesian inference for improving the flexibility via
posterior regularization, and scalable algorithms and systems based on
stochastic subsampling and distributed computing for dealing with large-scale
applications. Comment: 21 pages, 6 figures
Completely random measures for modeling power laws in sparse graphs
Network data appear in a number of applications, such as online social
networks and biological networks, and there is growing interest in both
developing models for networks as well as studying the properties of such data.
Since individual network datasets continue to grow in size, it is necessary to
develop models that accurately represent the real-life scaling properties of
networks. One behavior of interest is having a power law in the degree
distribution. However, other types of power laws that have been observed
empirically and considered for applications such as clustering and feature
allocation models have not been studied as frequently in models for graph data.
In this paper, we enumerate desirable asymptotic behavior that may be of
interest for modeling graph data, including sparsity and several types of power
laws. We outline a general framework for graph generative models using
completely random measures; by contrast to the pioneering work of Caron and Fox
(2015), we consider instantiating more of the existing atoms of the random
measure as the dataset size increases rather than adding new atoms to the
measure. We see that these two models can be complementary; they respectively
yield interpretations as (1) time passing among existing members of a network
and (2) new individuals joining a network. We detail a particular instance of
this framework and show simulated results that suggest this model exhibits some
desirable asymptotic power-law behavior. Comment: This paper appeared in the NIPS 2015 Workshop on Networks in the
Social and Information Sciences,
http://stanford.edu/~jugander/NetworksNIPS2015
RelNN: A Deep Neural Model for Relational Learning
Statistical relational AI (StarAI) aims at reasoning and learning in noisy
domains described in terms of objects and relationships by combining
probability with first-order logic. With huge advances in deep learning in
recent years, combining deep networks with first-order logic has been the
focus of several recent studies. Many of the existing attempts, however, only
focus on relations and ignore object properties. The attempts that do consider
object properties are limited in terms of modelling power or scalability. In
this paper, we develop relational neural networks (RelNNs) by adding hidden
layers to relational logistic regression (the relational counterpart of
logistic regression). We learn latent properties for objects both directly and
through general rules. Back-propagation is used for training these models. A
modular, layer-wise architecture facilitates applying techniques developed
within the deep learning community to our architecture. Initial experiments on
eight tasks over three real-world datasets show that RelNNs are promising
models for relational learning. Comment: 9 pages, 8 figures, accepted at AAAI-201
Deep Generative Models for Relational Data with Side Information
We present a probabilistic framework for overlapping community discovery and
link prediction for relational data, given as a graph. The proposed framework
has: (1) a deep architecture which enables us to infer multiple layers of
latent features/communities for each node, providing superior link prediction
performance on more complex networks and better interpretability of the latent
features; and (2) a regression model which allows directly conditioning the
node latent features on the side information available in the form of node
attributes. Our framework handles both (1) and (2) via a clean, unified model,
which enjoys full local conjugacy via data augmentation, and facilitates
efficient inference via closed-form Gibbs sampling. Moreover, inference cost
scales with the number of edges, which is attractive for massive but sparse
networks. Our framework is also easily extendable to model weighted networks
with count-valued edges. We compare with various state-of-the-art methods and
report results, both quantitative and qualitative, on several benchmark
datasets.