7,315 research outputs found

    Discriminative Nonparametric Latent Feature Relational Models with Data Augmentation

    Full text link
    We present a discriminative nonparametric latent feature relational model (LFRM) for link prediction to automatically infer the dimensionality of latent features. Under the generic RegBayes (regularized Bayesian inference) framework, we handily incorporate the prediction loss with probabilistic inference of a Bayesian model; set distinct regularization parameters for different types of links to handle the imbalance issue in real networks; and unify the analysis of both the smooth logistic log-loss and the piecewise linear hinge loss. For the nonconjugate posterior inference, we present a simple Gibbs sampler via data augmentation, without making restricting assumptions as done in variational methods. We further develop an approximate sampler using stochastic gradient Langevin dynamics to handle large networks with hundreds of thousands of entities and millions of links, orders of magnitude larger than what existing LFRM models can process. Extensive studies on various real networks show promising performance.Comment: Accepted by AAAI 201

    Detecting Strong Ties Using Network Motifs

    Full text link
    Detecting strong ties among users in social and information networks is a fundamental operation that can improve performance on a multitude of personalization and ranking tasks. Strong-tie edges are often readily obtained from the social network as users often participate in multiple overlapping networks via features such as following and messaging. These networks may vary greatly in size, density and the information they carry. This setting leads to a natural strong tie detection task: given a small set of labeled strong tie edges, how well can one detect unlabeled strong ties in the remainder of the network? This task becomes particularly daunting for the Twitter network due to scant availability of pairwise relationship attribute data, and sparsity of strong tie networks such as phone contacts. Given these challenges, a natural approach is to instead use structural network features for the task, produced by {\em combining} the strong and "weak" edges. In this work, we demonstrate via experiments on Twitter data that using only such structural network features is sufficient for detecting strong ties with high precision. These structural network features are obtained from the presence and frequency of small network motifs on combined strong and weak ties. We observe that using motifs larger than triads alleviate sparsity problems that arise for smaller motifs, both due to increased combinatorial possibilities as well as benefiting strongly from searching beyond the ego network. Empirically, we observe that not all motifs are equally useful, and need to be carefully constructed from the combined edges in order to be effective for strong tie detection. Finally, we reinforce our experimental findings with providing theoretical justification that suggests why incorporating these larger sized motifs as features could lead to increased performance in planted graph models.Comment: To appear in Proceedings of WWW 2017 (Web-science track

    Kernel discriminant analysis and clustering with parsimonious Gaussian process models

    Full text link
    This work presents a family of parsimonious Gaussian process models which allow to build, from a finite sample, a model-based classifier in an infinite dimensional space. The proposed parsimonious models are obtained by constraining the eigen-decomposition of the Gaussian processes modeling each class. This allows in particular to use non-linear mapping functions which project the observations into infinite dimensional spaces. It is also demonstrated that the building of the classifier can be directly done from the observation space through a kernel function. The proposed classification method is thus able to classify data of various types such as categorical data, functional data or networks. Furthermore, it is possible to classify mixed data by combining different kernels. The methodology is as well extended to the unsupervised classification case. Experimental results on various data sets demonstrate the effectiveness of the proposed method
    • …
    corecore