Discriminative Nonparametric Latent Feature Relational Models with Data Augmentation
We present a discriminative nonparametric latent feature relational model
(LFRM) for link prediction to automatically infer the dimensionality of latent
features. Under the generic RegBayes (regularized Bayesian inference)
framework, we handily incorporate the prediction loss with probabilistic
inference of a Bayesian model; set distinct regularization parameters for
different types of links to handle the imbalance issue in real networks; and
unify the analysis of both the smooth logistic log-loss and the piecewise
linear hinge loss. For the nonconjugate posterior inference, we present a
simple Gibbs sampler via data augmentation, without making the restrictive
assumptions required by variational methods. We further develop an approximate
sampler using stochastic gradient Langevin dynamics to handle large networks
with hundreds of thousands of entities and millions of links, orders of
magnitude larger than what existing LFRM models can process. Extensive studies
on various real networks show promising performance.
Comment: Accepted by AAAI 201
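The abstract above mentions an approximate sampler based on stochastic gradient Langevin dynamics (SGLD). As a rough illustration of the generic SGLD update only, and not of the paper's LFRM sampler, the sketch below applies it to a toy problem: the posterior over the mean of a unit-variance Gaussian with a zero-mean Gaussian prior. The function name `sgld_gaussian_mean`, the step size, the batch size and the prior variance are all assumptions chosen for this example.

```python
import random


def sgld_gaussian_mean(data, n_steps=3000, batch_size=50, step=1e-3,
                       prior_var=100.0, seed=0):
    """Toy SGLD chain for the mean of a unit-variance Gaussian.

    Hypothetical example model: x_i ~ N(theta, 1), theta ~ N(0, prior_var).
    Returns the list of visited theta values.
    """
    rng = random.Random(seed)
    n_total = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        # Minibatch drawn without replacement from the full data set.
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior.
        grad_prior = -theta / prior_var
        # Minibatch gradient of the log likelihood, rescaled to the full data.
        grad_lik = (n_total / batch_size) * sum(x - theta for x in batch)
        # Langevin step: half a gradient step plus N(0, step) noise.
        noise = rng.gauss(0.0, step ** 0.5)
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples
```

With a constant step size, as assumed here, the chain only approximates the posterior; decreasing step-size schedules are the usual refinement. Each update touches only `batch_size` observations, which is what makes this family of samplers attractive for large data sets.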
Detecting Strong Ties Using Network Motifs
Detecting strong ties among users in social and information networks is a
fundamental operation that can improve performance on a multitude of
personalization and ranking tasks. Strong-tie edges are often readily available
from the social network, since users commonly participate in multiple
overlapping networks via features such as following and messaging. These networks may vary
greatly in size, density and the information they carry. This setting leads to
a natural strong tie detection task: given a small set of labeled strong tie
edges, how well can one detect unlabeled strong ties in the remainder of the
network?
This task becomes particularly daunting for the Twitter network due to the scant
availability of pairwise relationship attribute data and the sparsity of
strong-tie networks such as phone contacts. Given these challenges, a natural
approach is to instead use structural network features for the task, produced by
combining the strong and "weak" edges. In this work, we demonstrate via
experiments on Twitter data that using only such structural network features is
sufficient for detecting strong ties with high precision. These structural
network features are obtained from the presence and frequency of small network
motifs on combined strong and weak ties. We observe that using motifs larger
than triads alleviates the sparsity problems that arise for smaller motifs, both
because of the increased combinatorial possibilities and because such motifs
benefit strongly from searching beyond the ego network. Empirically, we observe
that not all motifs are equally useful, and that they need to be carefully
constructed from the combined edges in order to be effective for strong tie
detection. Finally, we reinforce our experimental findings by providing a
theoretical justification that suggests why incorporating these larger motifs as
features could lead to increased performance in planted graph models.
Comment: To appear in Proceedings of WWW 2017 (Web-science track
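To make the idea of motif features on combined strong and weak edges concrete, here is a minimal sketch that counts, for a candidate node pair, the triangles it would close, broken down by the types of the two other edges. It covers triads only for brevity, whereas the abstract reports that larger motifs work better. The helper names `build_adjacency` and `typed_triangle_counts`, and the rule that a strong label overrides a weak one, are assumptions made for this example rather than details taken from the paper.

```python
from collections import defaultdict


def build_adjacency(strong_edges, weak_edges):
    """Undirected adjacency map: adj[u][v] is "strong" or "weak"."""
    adj = defaultdict(dict)
    for u, v in weak_edges:
        adj[u][v] = "weak"
        adj[v][u] = "weak"
    # Assumption: if an edge appears in both lists, the strong label wins.
    for u, v in strong_edges:
        adj[u][v] = "strong"
        adj[v][u] = "strong"
    return adj


def typed_triangle_counts(adj, u, v):
    """Count common neighbours of u and v, keyed by the two edge types."""
    counts = {
        ("strong", "strong"): 0,
        ("strong", "weak"): 0,
        ("weak", "weak"): 0,
    }
    common = set(adj[u]) & set(adj[v])
    for w in common:
        # Sort the pair so (weak, strong) and (strong, weak) share a key.
        key = tuple(sorted((adj[u][w], adj[v][w])))
        counts[key] += 1
    return counts
```

The resulting three counts could serve as a small feature vector for a classifier trained on the labeled strong-tie edges; extending the same pattern to four-node motifs would mean enumerating pairs of neighbours instead of single common neighbours.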
Kernel discriminant analysis and clustering with parsimonious Gaussian process models
This work presents a family of parsimonious Gaussian process models that
make it possible to build, from a finite sample, a model-based classifier in an
infinite-dimensional space. The proposed parsimonious models are obtained by
constraining the eigen-decomposition of the Gaussian processes modeling each
class. In particular, this allows the use of non-linear mapping functions that
project the observations into infinite-dimensional spaces. It is also
demonstrated that the classifier can be built directly from the
observation space through a kernel function. The proposed classification method
is thus able to classify data of various types such as categorical data,
functional data or networks. Furthermore, it is possible to classify mixed data
by combining different kernels. The methodology is also extended to the
unsupervised classification case. Experimental results on various data sets
demonstrate the effectiveness of the proposed method.
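The abstract above notes that mixed data can be handled by combining different kernels. One common way to do this, sketched here under assumptions of our own, is a non-negative weighted sum of a kernel on the numeric part and a kernel on the categorical part, which is again a valid kernel. The choice of an RBF kernel, an overlap (fraction-of-matches) kernel, equal weights, and all function names are illustrative and not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel on two equal-length numeric tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def overlap_kernel(x, y):
    """Fraction of positions where two categorical tuples agree."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def mixed_kernel(p, q, gamma=1.0, w_num=0.5, w_cat=0.5):
    """Weighted sum of a numeric and a categorical kernel.

    Each observation is a pair (numeric_tuple, categorical_tuple).
    The weights must be non-negative for the sum to remain a kernel.
    """
    (p_num, p_cat), (q_num, q_cat) = p, q
    return (w_num * rbf_kernel(p_num, q_num, gamma)
            + w_cat * overlap_kernel(p_cat, q_cat))
```

For example, two observations with identical numeric parts that agree on one of two categorical attributes get a similarity of 0.5 * 1 + 0.5 * 0.5 = 0.75 under the default weights.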