Infinite Author Topic Model based on Mixed Gamma-Negative Binomial Process.

Author: Lu J
Luo X
Xu RYD
Xuan J
Zhang G
Publication date: 01/01/2015

Incorporating the side information of a text corpus, i.e., authors, time stamps, and emotional tags, into traditional text mining models has gained significant interest in information retrieval, statistical natural language processing, and machine learning. One branch of this work is the so-called Author Topic Model (ATM), which incorporates authors' interests as side information into the classical topic model. However, the existing ATM requires the number of topics to be predefined, which is difficult and inappropriate in many real-world settings. In this paper, we propose an Infinite Author Topic (IAT) model to resolve this issue. Instead of assigning a discrete probability distribution over a fixed number of topics, we use a stochastic process to determine the number of topics from the data itself. Specifically, we extend a gamma-negative binomial process to three levels in order to capture the author-document-keyword hierarchical structure. Furthermore, each document is assigned a mixed gamma process that accounts for the contributions of its multiple authors. An efficient Gibbs sampling inference algorithm, in which each conditional distribution has a closed form, is developed for the IAT model. Experiments on several real-world datasets show that the IAT model can simultaneously learn the hidden topics, the authors' interests in these topics, and the number of topics.

Comment: 10 pages, 5 figures, submitted to KDD conference
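As background for the gamma-negative binomial process named in the abstract, the following is a minimal sketch of the well-known gamma-Poisson mixture that produces negative binomial counts. It is not the IAT model or its Gibbs sampler. The function names and the values of `shape` and `prob` are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_gamma_poisson(shape, prob, rng):
    """Draw a count whose Poisson rate is Gamma(shape, scale=prob / (1 - prob))."""
    rate = rng.gammavariate(shape, prob / (1.0 - prob))
    return sample_poisson(rate, rng)


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, variance


rng = random.Random(0)   # fixed seed so the run is repeatable
shape, prob = 2.0, 0.6   # illustrative values, not taken from the paper
counts = [sample_gamma_poisson(shape, prob, rng) for _ in range(50_000)]
mean, variance = mean_and_variance(counts)

# Negative binomial moments implied by the mixture:
# mean = r p / (1 - p), variance = r p / (1 - p)^2
expected_mean = shape * prob / (1.0 - prob)
expected_variance = shape * prob / (1.0 - prob) ** 2

print(f"sample mean {mean:.3f} vs. theoretical {expected_mean:.3f}")
print(f"sample variance {variance:.3f} vs. theoretical {expected_variance:.3f}")
```

The sample variance exceeds the sample mean, which is the over-dispersion that makes negative binomial counts a common choice over plain Poisson counts for word frequencies.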

arXiv.org e-Print Archive

Crossref

OPUS - University of Technology Sydney

A Bayesian nonparametric model for multi-label learning

Author: Lu J
Luo X
Xu RYD
Xuan J
Zhang G
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2017

© 2017, The Author(s). Multi-label learning has become a significant learning paradigm in the past few years due to its broad application scenarios and the ever-increasing number of techniques developed by researchers in this area. Among existing state-of-the-art works, generative statistical models are characterized by their good generalization ability and their robustness to a large number of labels, which they achieve by learning a low-dimensional label embedding. However, one issue with this branch of models is that the number of dimensions needs to be fixed in advance, which is difficult and inappropriate in many real-world settings. In this paper, we propose a Bayesian nonparametric model to resolve this issue. More specifically, we extend a Gamma-negative binomial process to three levels in order to capture the label-instance-feature structure. Furthermore, a mixing strategy for Gamma processes is designed to account for the multiple labels of an instance. Because the mixed process makes model inference difficult, an efficient Gibbs sampling inference algorithm is developed. Experiments on several real-world datasets show the performance of the proposed model on multi-label learning tasks, compared with three state-of-the-art models from the literature.

OPUS - University of Technology Sydney

Leveraging Node Attributes for Incomplete Relational Data

Author: Buntine Wray
Du Lan
Zhao He
Publication date: 01/01/2017

Relational data are usually highly incomplete in practice, which inspires us to leverage side information to improve the performance of community detection and link prediction. This paper presents a Bayesian probabilistic approach that incorporates various kinds of node attributes, encoded in binary form, into relational models with a Poisson likelihood. Our method works flexibly with both directed and undirected relational networks. Inference can be done by efficient Gibbs sampling, which leverages the sparsity of both the networks and the node attributes. Extensive experiments show that our models achieve state-of-the-art link prediction results, especially with highly incomplete relational data.

Comment: Appearing in ICML 201

arXiv.org e-Print Archive

Monash University Research Portal

Compound Markov counting processes and their applications to modeling infinitesimally over-dispersed systems

Author: Carles Bretó
Edward L. Ionides
Publication venue: 'Elsevier BV'
Publication date: 28/02/2010

We propose an infinitesimal dispersion index for Markov counting processes. We show that, under standard moment existence conditions, a process is infinitesimally (over-) equi-dispersed if, and only if, it is simple (compound), i.e. it increases in jumps of one (or more) unit(s), even though infinitesimally equi-dispersed processes might be under-, equi- or over-dispersed using previously studied indices. Compound processes arise, for example, when introducing continuous-time white noise to the rates of simple processes, resulting in Lévy-driven SDEs. We construct multivariate infinitesimally over-dispersed compartment models and queuing networks, suitable for applications where moment constraints inherent to simple processes do not hold.

Comment: 26 pages
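The link between jump size and dispersion can be illustrated with a short worked calculation. It assumes the special case of a compound Poisson process, chosen here only for illustration; the abstract concerns general Markov counting processes, and the variance-to-mean ratio below is not claimed to be the paper's exact index. The symbols $M(t)$, $J_i$, $\lambda$ and $h$ are introduced for this sketch.

```latex
\begin{aligned}
N(t) &= \sum_{i=1}^{M(t)} J_i, \qquad
  M(t) \text{ a Poisson process with rate } \lambda, \quad
  J_i \in \{1, 2, \dots\} \text{ i.i.d.} \\
\mathbb{E}\bigl[N(t+h) - N(t)\bigr] &= \lambda h \, \mathbb{E}[J], \\
\operatorname{Var}\bigl[N(t+h) - N(t)\bigr] &= \lambda h \, \mathbb{E}[J^2], \\
\frac{\operatorname{Var}\bigl[N(t+h) - N(t)\bigr]}{\mathbb{E}\bigl[N(t+h) - N(t)\bigr]}
  &= \frac{\mathbb{E}[J^2]}{\mathbb{E}[J]} \;\ge\; 1 .
\end{aligned}
```

Since $J \ge 1$ is an integer, $J^2 \ge J$, with equality exactly when $J = 1$ almost surely. In this special case the ratio is the same for every $h > 0$, so it equals one for unit jumps (a simple process) and exceeds one as soon as jumps of more than one unit can occur (a compound process).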

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

Universidad Carlos III de Madrid e-Archivo