27 research outputs found

Flexible sampling of discrete data correlations without the marginal distributions

Author: Kalaitzis Alfredo
Silva Ricardo
Publication date: 01/01/2013

Learning the joint dependence of discrete variables is a fundamental problem in machine learning, with many applications including prediction, clustering and dimensionality reduction. More recently, the framework of copula modeling has gained popularity due to its modular parametrization of joint distributions. Among other properties, copulas provide a recipe for combining flexible models for univariate marginal distributions with parametric families suitable for potentially high dimensional dependence structures. More radically, the extended rank likelihood approach of Hoff (2007) bypasses learning marginal models completely when such information is ancillary to the learning task at hand as in, e.g., standard dimensionality reduction problems or copula parameter estimation. The main idea is to represent data by their observable rank statistics, ignoring any other information from the marginals. Inference is typically done in a Bayesian framework with Gaussian copulas, and it is complicated by the fact that this implies sampling within a space where the number of constraints increases quadratically with the number of data points. The result is slow mixing when using off-the-shelf Gibbs sampling. We present an efficient algorithm based on recent advances in constrained Hamiltonian Markov chain Monte Carlo that is simple to implement and does not require paying for a quadratic cost in sample size. Comment: An overhauled version of the experimental section moved to the main paper. Old experimental section moved to supplementary material.
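The rank constraints mentioned in the abstract can be illustrated with a small sketch. It is not the authors' algorithm (which uses constrained Hamiltonian Monte Carlo); it only shows, for a single variable, how observed ranks restrict the latent Gaussian values and why the number of pairwise order constraints grows quadratically. The function names `count_order_constraints` and `latent_bounds` and the toy data are assumptions made for illustration.

```python
def count_order_constraints(y):
    """Count ordered pairs (i, j) with y[i] < y[j].

    Under a rank likelihood, each such pair forces z[i] < z[j]
    for the latent values z, so the count grows like n**2.
    """
    n = len(y)
    return sum(1 for i in range(n) for j in range(n) if y[i] < y[j])


def latent_bounds(y, z, i):
    """Interval in which z[i] may move while keeping the order implied by y.

    Lower bound: largest latent value among points observed below y[i].
    Upper bound: smallest latent value among points observed above y[i].
    A Gibbs-style update would draw z[i] from a Gaussian truncated here.
    """
    lower = max((z[j] for j in range(len(y)) if y[j] < y[i]),
                default=float("-inf"))
    upper = min((z[j] for j in range(len(y)) if y[j] > y[i]),
                default=float("inf"))
    return lower, upper


# Toy data (hypothetical): one discrete variable and latent values
# that already respect its ordering.
y = [0, 2, 1, 2, 0, 1]
z = [-1.3, 0.9, 0.1, 1.4, -0.7, 0.2]

n_constraints = count_order_constraints(y)
bounds_for_point_2 = latent_bounds(y, z, 2)
```

Updating one latent value at a time inside such an interval is what tends to mix slowly, since ties and tight neighbours leave each value little room to move.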

arXiv.org e-Print Archive

CiteSeerX

UCL Discovery

Residual Component Analysis

Author: Kalaitzis Alfredo A
Lawrence Neil D
Publication venue: Proceedings of the 29th International Conference on Machine Learning
Publication date: 21/06/2011

Probabilistic principal component analysis (PPCA) seeks a low dimensional representation of a data set in the presence of independent spherical Gaussian noise, Sigma = (sigma^2)*I. The maximum likelihood solution for the model is an eigenvalue problem on the sample covariance matrix. In this paper we consider the situation where the data variance is already partially explained by other factors, e.g. covariates of interest, or temporal correlations leaving some residual variance. We decompose the residual variance into its components through a generalized eigenvalue problem, which we call residual component analysis (RCA). We show that canonical covariates analysis (CCA) is a special case of our algorithm and explore a range of new algorithms that arise from the framework. We illustrate the ideas on a gene expression time series data set and the recovery of human pose from silhouette.
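The generalized eigenvalue problem described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a model with covariance W W^T + Sigma in which Sigma is known, whitens the sample covariance S with a Cholesky factor of Sigma, and then applies the usual PPCA-style eigenvalue solution with unit noise. The function name `rca_loadings` and the synthetic data are assumptions. With Sigma = (sigma^2)*I the result reduces to the PPCA loadings for a known sigma^2.

```python
import numpy as np


def rca_loadings(S, Sigma, q):
    """Recover q loading columns W such that S is close to W W^T + Sigma.

    Solves the generalized eigenproblem S v = lambda Sigma v by whitening:
    with Sigma = L L^T, the matrix L^{-1} S L^{-T} has ordinary eigenvalues
    lambda, and the part of each lambda above 1 is residual variance.
    """
    L = np.linalg.cholesky(Sigma)               # Sigma = L @ L.T
    A = np.linalg.solve(L, S)                   # L^{-1} S
    S_white = np.linalg.solve(L, A.T).T         # L^{-1} S L^{-T}
    S_white = 0.5 * (S_white + S_white.T)       # remove rounding asymmetry
    evals, evecs = np.linalg.eigh(S_white)      # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:q]           # indices of the q largest
    scale = np.sqrt(np.clip(evals[top] - 1.0, 0.0, None))
    return L @ (evecs[:, top] * scale)          # map back to the data space


# Synthetic check (hypothetical data): build S exactly from a known W0.
rng = np.random.default_rng(0)
W0 = rng.normal(size=(5, 2))
B = rng.normal(size=(5, 5))
Sigma = B @ B.T + 5.0 * np.eye(5)               # known, already-explained covariance
S = W0 @ W0.T + Sigma

W = rca_loadings(S, Sigma, q=2)
recovered = W @ W.T
```

The loadings are identified only up to a rotation, so the comparison that makes sense is between `W @ W.T` and `W0 @ W0.T`, not between `W` and `W0` directly.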

arXiv.org e-Print Archive

Apollo (Cambridge)

Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties

Author: Deudon M. (Michel)
Gal Y. (Yarin)
Gitiaux X. (Xavier)
Gunes Baydin A. (Atilim)
Jungbluth A. (Anna)
Kalaitzis A. (Alfredo)
Maloney S.A. (Shane)
Muñoz-Jaramillo A. (Andrés)
Shneider C. (Carl)
Wright P.
Publication date: 04/11/2019

Machine learning techniques have been successfully applied to super-resolution tasks on natural images where visually pleasing results are sufficient. However, in many scientific domains this is not adequate and estimations of errors and uncertainties are crucial. To address this issue we propose a Bayesian framework that decomposes uncertainties into epistemic and aleatoric uncertainties. We test the validity of our approach by super-resolving images of the Sun's magnetic field and by generating maps measuring the range of possible high resolution explanations compatible with a given low resolution magnetogram.
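One common way to split predictive uncertainty into epistemic and aleatoric parts is sketched below. The abstract does not spell out the authors' estimator, so this is an assumption: it takes several stochastic predictions for one pixel, each giving a mean and a noise variance, and applies the law of total variance. The function name `decompose_uncertainty` and the numbers are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split predictive variance for one output into two parts.

    means:     predicted means from several stochastic forward passes
    variances: predicted noise variances from the same passes

    Epistemic: spread of the predicted means (model uncertainty).
    Aleatoric: average predicted noise variance (data uncertainty).
    Their sum is the total predictive variance.
    """
    epistemic = pvariance(means)
    aleatoric = fmean(variances)
    return epistemic, aleatoric, epistemic + aleatoric


# Hypothetical predictions for a single super-resolved pixel.
pixel_means = [1.0, 1.2, 0.8, 1.0]
pixel_variances = [0.05, 0.07, 0.04, 0.04]

epistemic, aleatoric, total = decompose_uncertainty(pixel_means, pixel_variances)
```

Applied pixel by pixel, the two parts give separate uncertainty maps alongside the super-resolved image.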

arXiv.org e-Print Archive

CWI's Institutional Repository