Generalised Bayesian matrix factorisation models
Factor analysis and related models for probabilistic matrix factorisation are of central importance to the unsupervised analysis of data, with a colourful history more than a century long. Probabilistic models for matrix factorisation allow us to explore the underlying structure in data, and have relevance in a vast number of application areas including collaborative filtering, source separation, missing data imputation, gene expression analysis, information retrieval, computational finance and computer vision, amongst others. This thesis develops generalisations of matrix factorisation models that advance our understanding and enhance the applicability of this important class of models.
The generalisation of models for matrix factorisation focuses on three concerns: widening the applicability of latent variable models to the diverse types of data that are currently available; considering alternative structural forms in the underlying representations that are inferred; and including higher-order data structures in the matrix factorisation framework. These three issues reflect the reality of modern data analysis, and we develop new models that allow for a principled exploration and use of data in these settings. We place emphasis on Bayesian approaches to learning and the advantages that come with the Bayesian methodology. Our point of departure is a generalisation of latent variable models to members of the exponential family of distributions. This generalisation allows for the analysis of data that may be real-valued, binary, counts, non-negative, or a heterogeneous set of these data types. The model unifies various existing models and constructs, for unsupervised settings, the framework complementary to generalised linear models in regression.
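As a purely illustrative sketch of this exponential-family view (the function names and settings below are ours, not the thesis's), the toy code fits a binary matrix X under the model P(X_ij = 1) = sigmoid(u_i · v_j) by gradient ascent on the Bernoulli log-likelihood; swapping the likelihood and link function recovers the other members of the family, in direct analogy with generalised linear models.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_binary_factorisation(X, k=2, steps=500, lr=0.1, seed=0):
    """Matrix factorisation with a Bernoulli likelihood: model
    P(X_ij = 1) = sigmoid(U_i . V_j) and ascend the log-likelihood.
    With a Gaussian likelihood the same gradient structure gives the
    classical real-valued factorisation; changing the likelihood/link
    yields the other exponential-family members."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(m)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                theta = sum(U[i][d] * V[j][d] for d in range(k))
                g = X[i][j] - sigmoid(theta)  # d(loglik)/d(theta) for Bernoulli
                for d in range(k):
                    u, v = U[i][d], V[j][d]
                    U[i][d] += lr * g * v
                    V[j][d] += lr * g * u
    return U, V
```

After fitting, sigmoid(U_i · V_j) approximates the observed binary entries, which is the sense in which the latent factors explain the data.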
Moving to structural considerations, we develop Bayesian methods for learning sparse latent representations. We define notions of weakly and strongly sparse vectors and investigate the classes of prior distributions that give rise to these forms of sparsity, namely the scale-mixture of Gaussians and the spike-and-slab distribution. Based on these sparsity-favouring priors, we develop and compare methods for sparse matrix factorisation and present the first comparison of these sparse learning approaches. As a second structural consideration, we develop models with the ability to generate correlated binary vectors. Moment-matching is used to allow binary data with specified correlation to be generated, based on dichotomisation of the Gaussian distribution. We then develop a novel and simple method for binary PCA based on Gaussian dichotomisation. The third generalisation considers the extension of matrix factorisation models to the multi-dimensional arrays of data that are increasingly prevalent. We develop the first Bayesian model for non-negative tensor factorisation and explore the relationship between this model and the previously described models for matrix factorisation.
Supported by a Commonwealth Scholarship awarded by the Commonwealth Scholarship and Fellowship Programme (CSFP) [Award number ZACS-2207-363]
Supported by an award from the National Research Foundation, South Africa (NRF) [Award number SFH2007072200001]
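The Gaussian dichotomisation idea above admits a short sketch. The code below is an illustrative assumption (the helper names and the Monte Carlo check are ours): two standard Gaussians with latent correlation rho are thresholded so that each binary margin has a prescribed mean, and increasing rho increases the binary correlation, which is the quantity the moment-matching step calibrates.

```python
import math
import random
from statistics import NormalDist

def dichotomised_pairs(p1, p2, rho, n=100_000, seed=1):
    """Correlated binary pairs via Gaussian dichotomisation: threshold a
    bivariate standard Gaussian with correlation rho so that
    P(X1 = 1) = p1 and P(X2 = 1) = p2."""
    nd = NormalDist()
    t1, t2 = nd.inv_cdf(1 - p1), nd.inv_cdf(1 - p2)  # P(Z > t_i) = p_i
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        out.append((int(z1 > t1), int(z2 > t2)))
    return out

def binary_correlation(pairs):
    """Empirical Pearson correlation of the binary pairs."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    return cov / math.sqrt(m1 * (1 - m1) * m2 * (1 - m2))
```

Moment-matching then amounts to choosing the latent rho that produces a target binary correlation, since the thresholding step shrinks the correlation relative to the Gaussian one.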
Fast matrix computations for functional additive models
It is common in functional data analysis to look at a set of related functions: a set of learning curves, a set of brain signals, a set of spatial maps, etc. One way to express relatedness is through an additive model, whereby each individual function is assumed to be a variation around some shared mean. Gaussian processes provide an elegant way of constructing such additive models, but suffer from computational difficulties arising from the matrix operations that need to be performed. Recently, Heersink & Furrer have shown that functional additive models give rise to covariance matrices of a specific form they call quasi-Kronecker (QK), whose inverses are relatively tractable. We show that under additional assumptions the two-level additive model leads to a class of matrices we call restricted quasi-Kronecker (rQK), which enjoy many interesting properties. In particular, we formulate matrix factorisations whose complexity scales only linearly in the number of functions in the latent field, an enormous improvement over the cubic scaling of naïve approaches. We describe how to leverage the properties of rQK matrices for inference in latent Gaussian models.
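To see why this structure helps, here is a hedged toy sketch (our own code, not the paper's factorisation): under a two-level additive model the joint covariance of m stacked functions takes the form 1 1^T ⊗ K_g + I ⊗ K_h, so a matrix-vector product needs only O(m) products with n x n blocks, and the (mn) x (mn) matrix is never formed.

```python
def matvec(A, x):
    """Dense n x n matrix times a length-n vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def additive_cov_matvec(Kg, Kh, X):
    """Multiply Sigma = (1 1^T kron Kg) + (I kron Kh) by the stacked vector
    X = [x_1, ..., x_m] without forming the (m n) x (m n) matrix:
    (Sigma X)_i = Kg (sum_j x_j) + Kh x_i,
    i.e. cost linear in the number of functions m."""
    n = len(Kg)
    s = [sum(x[i] for x in X) for i in range(n)]  # shared-mean contribution
    Kg_s = matvec(Kg, s)
    return [[a + b for a, b in zip(Kg_s, matvec(Kh, x))] for x in X]
```

The same decomposition into a shared term plus per-function terms is what makes the linear-complexity factorisations described in the abstract possible.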
Link Prediction via Generalized Coupled Tensor Factorisation
This study deals with the missing link prediction problem: the problem of predicting the existence of missing connections between entities of interest. We address link prediction using coupled analysis of relational datasets represented as heterogeneous data, i.e., datasets in the form of matrices and higher-order tensors. We propose an approach based on a probabilistic interpretation of tensor factorisation models, i.e., Generalised Coupled Tensor Factorisation, which can simultaneously fit a large class of tensor models to higher-order tensors/matrices with common latent factors using different loss functions. Numerical experiments demonstrate that joint analysis of data from multiple sources via coupled factorisation improves link prediction performance, and that the selection of the right loss function and tensor model is crucial for accurately predicting missing links.
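The coupling idea can be sketched in a few lines. The following is a deliberately simplified, hypothetical illustration (squared loss only and plain matrices, whereas the paper's GCTF framework supports several loss functions and full tensors): two observed matrices X1 ≈ U Vᵀ and X2 ≈ U Wᵀ share the factor U, and each gradient step updates U from both datasets jointly.

```python
def coupled_step(X1, X2, U, V, W, lr=0.02):
    """One joint gradient step for the coupled model X1 ~ U V^T and
    X2 ~ U W^T under squared loss; the shared factor U accumulates
    gradient from both datasets before being updated."""
    k = len(U[0])
    dU = [[0.0] * k for _ in U]
    for X, B in ((X1, V), (X2, W)):
        for i in range(len(U)):
            for j in range(len(B)):
                r = X[i][j] - sum(U[i][d] * B[j][d] for d in range(k))
                for d in range(k):
                    dU[i][d] += r * B[j][d]
                    B[j][d] += lr * r * U[i][d]
    for i in range(len(U)):
        for d in range(k):
            U[i][d] += lr * dU[i][d]

def sq_error(X, A, B):
    """Total squared reconstruction error of X ~ A B^T."""
    k = len(A[0])
    return sum((X[i][j] - sum(A[i][d] * B[j][d] for d in range(k))) ** 2
               for i in range(len(A)) for j in range(len(B)))
```

Because U appears in both reconstructions, evidence about an entity in one dataset informs its missing links in the other, which is the mechanism behind the reported improvement from joint analysis.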
Assessing the Relation between Equity Risk Premium and Macroeconomic Volatilities in the UK
This paper uses the exponential generalised autoregressive conditional heteroscedasticity in-mean (EGARCH-M) model to analyse the relationship between the equity risk premium and macroeconomic volatility. This premium depends upon conditional volatility, which is significantly affected by the long bond yield, acting as a proxy for the underlying rate of inflation.
Asset pricing, Risk premium, Macroeconomic volatility, Stochastic discount factor model, Multivariate EGARCH-M model
A Review on Joint Models in Biometrical Research
In some fields of biometrical research, joint modelling of longitudinal measures and event time data has become very popular. This article reviews recent fruitful research in that area by classifying approaches to joint models into three categories: approaches with a focus on serial trends, approaches with a focus on event time data, and approaches with equal focus on both outcomes. Typically, longitudinal measures and event time data are modelled jointly by introducing shared random effects or by considering conditional distributions together with marginal distributions. We present the approaches in a uniform nomenclature, comment on the sub-models applied to the longitudinal and event time outcomes individually, and exemplify applications in biometrical research.
Efficient Recursions for General Factorisable Models
Let n S-valued categorical variables be jointly distributed according to a distribution known only up to an unknown normalising constant. For an unnormalised joint likelihood expressible as a product of factors, we give an algebraic recursion which can be used for computing the normalising constant and other summations. A saving in computation is achieved when each factor contains a lagged subset of the components combining in the joint distribution, with maximum computational efficiency as the subsets attain their minimum size. If each subset contains at most r+1 of the n components in the joint distribution, we term this a lag-r model, whose normalising constant can be computed using a forward recursion in O(S^(r+1)) computations, as opposed to O(S^n) for the direct computation. We show how a lag-r model represents a Markov random field and allows a neighbourhood structure to be related to the unnormalised joint likelihood. We illustrate the method by showing how the normalising constant of the Ising or autologistic model can be computed.
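A minimal sketch of the lag-1 case (our own illustrative code, not the paper's general lag-r algorithm): for a chain with unnormalised likelihood ∏_k f(k, x_{k-1}, x_k), the forward recursion carries an S-vector of partial sums, giving the normalising constant in O(n S^2) operations instead of the O(S^n) brute-force sum.

```python
import itertools
import math

def Z_forward(n, S, f):
    """Normalising constant of a lag-1 factorisable model
    p(x) proportional to prod_{k=1}^{n-1} f(k, x_{k-1}, x_k), x in {0,..,S-1}^n,
    by forward recursion: O(n * S^2) work."""
    msg = [1.0] * S  # msg[s] = sum over all prefixes ending in state s
    for k in range(1, n):
        msg = [sum(msg[a] * f(k, a, b) for a in range(S)) for b in range(S)]
    return sum(msg)

def Z_brute(n, S, f):
    """Direct O(S^n) summation, for checking the recursion on small n."""
    return sum(math.prod(f(k, x[k - 1], x[k]) for k in range(1, n))
               for x in itertools.product(range(S), repeat=n))
```

With the Ising-chain factor f(k, a, b) = exp(beta * (2a - 1) * (2b - 1)) and S = 2 this is exactly the transfer-matrix computation of the Ising normalising constant mentioned in the abstract, restricted to a one-dimensional neighbourhood.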
Likelihood-based inference for correlated diffusions
We address the problem of likelihood-based inference for correlated diffusion processes using Markov chain Monte Carlo (MCMC) techniques. Such a task presents two interesting problems. First, the construction of the MCMC scheme should ensure that the correlation coefficients are updated subject to the positive-definiteness constraints on the diffusion matrix. Second, a diffusion may only be observed at a finite set of points, and the marginal likelihood of the parameters based on these observations is generally unavailable. We overcome the first issue by using the Cholesky factorisation of the diffusion matrix. To deal with the likelihood unavailability, we generalise the data augmentation framework of Roberts and Stramer (2001, Biometrika 88(3):603-621) to d-dimensional correlated diffusions, including multivariate stochastic volatility models. Our methodology is illustrated through simulation-based experiments and with daily EUR/USD and GBP/USD rates together with their implied volatilities.
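The first trick can be made concrete with a generic sketch (one common Cholesky-based scheme, not necessarily the authors' exact parameterisation): map unconstrained lower-triangular entries to a valid correlation matrix by normalising each row of the Cholesky factor, so that any MCMC proposal in the unconstrained space yields a positive semi-definite matrix with unit diagonal.

```python
import math

def corr_from_unconstrained(L_free):
    """Build a valid correlation matrix Sigma = L L^T from an unconstrained
    lower-triangular matrix: normalising each row of L to unit Euclidean
    norm forces Sigma_ii = 1, and the Cholesky form guarantees positive
    semi-definiteness, so proposals never leave the valid set.
    Assumes each row has at least one nonzero entry."""
    d = len(L_free)
    L = []
    for i in range(d):
        row = L_free[i][:i + 1]  # lower triangle including the diagonal
        norm = math.sqrt(sum(v * v for v in row))
        L.append([v / norm for v in row] + [0.0] * (d - i - 1))
    return [[sum(L[i][k] * L[j][k] for k in range(d)) for j in range(d)]
            for i in range(d)]
```

An MCMC sampler can then propose freely on the entries of L_free and transform, rather than proposing correlation coefficients directly and rejecting matrices that fail positive definiteness.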