Probabilistic Auto-Associative Models and Semi-Linear PCA

Author: Iovleff Serge
Publication date: 20/09/2012

Auto-associative models cover a large class of methods used in data analysis. In this paper, we describe the general properties of these models when the projection component is linear, and we propose and test an easy-to-implement Probabilistic Semi-Linear Auto-Associative model in a Gaussian setting. We show that it is a generalization of the PCA model to the semi-linear case. Numerical experiments on simulated datasets and a real astronomical application highlight the interest of this approach.
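As a point of reference for the abstract above, ordinary PCA can be read as the fully linear auto-associative model: data are projected onto a few directions and then reconstructed linearly from the projected coordinates. The sketch below shows only that linear baseline, not the paper's semi-linear probabilistic model; the function name `pca_autoassociative` and the simulated data are assumptions made for illustration.

```python
import numpy as np

def pca_autoassociative(X, d):
    """Project centered data onto the top-d principal axes, then reconstruct.

    Hypothetical helper illustrating the linear case only.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    A = Vt[:d].T            # (p, d) orthonormal projection matrix
    Y = Xc @ A              # (n, d) projected coordinates
    X_hat = mean + Y @ A.T  # linear reconstruction from the projection
    return Y, X_hat

# Simulated correlated data (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

# Mean squared reconstruction error for d = 1, ..., 5.
errors = [np.mean((X - pca_autoassociative(X, d)[1]) ** 2) for d in range(1, 6)]
```

The reconstruction error shrinks as d grows and vanishes, up to rounding, when d equals the data dimension. A semi-linear model keeps the linear projection step but would replace the linear reconstruction with a nonlinear function of the projected coordinates.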

arXiv.org e-Print Archive

CiteSeerX

Informative Data Projections: A Framework and Two Examples

Authors: De Bie Tijl, Kang Bo, Lijffijt Jefrey, Santos-Rodriguez Raul
Publication date: 01/01/2015

Methods for Projection Pursuit aim to facilitate the visual exploration of high-dimensional data by identifying interesting low-dimensional projections. A major challenge is the design of a suitable quality metric of projections, commonly referred to as the projection index, to be maximized by the Projection Pursuit algorithm. In this paper, we introduce a new information-theoretic strategy for tackling this problem, based on quantifying the amount of information the projection conveys to a user given their prior beliefs about the data. The resulting projection index is a subjective quantity, explicitly dependent on the intended user. As a useful illustration, we developed this idea for two particular kinds of prior beliefs. The first kind leads to PCA (Principal Component Analysis), shedding new light on when PCA is (not) appropriate. The second kind leads to a novel projection index, the maximization of which can be regarded as a robust variant of PCA. We show how this projection index, though non-convex, can be effectively maximized using a modified power method as well as using a semidefinite programming relaxation. The usefulness of this new projection index is demonstrated in comparative empirical experiments against PCA and a popular Projection Pursuit method.
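For readers unfamiliar with the power method mentioned in the abstract above, the sketch below shows the plain, unmodified version applied to the PCA projection index, that is, the variance of the data along a unit direction. It is not the modified power method of the paper, whose details are not given here; the function name `power_method`, its parameters and the simulated data are assumptions for illustration.

```python
import numpy as np

def power_method(S, n_iter=500, tol=1e-10, seed=0):
    """Plain power iteration for the leading eigenvector of a symmetric
    positive semidefinite matrix S (here, a sample covariance matrix)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=S.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        w_new = S @ w
        w_new /= np.linalg.norm(w_new)
        converged = np.linalg.norm(w_new - w) < tol
        w = w_new
        if converged:
            break
    # w @ S @ w is the variance of the data projected onto w.
    return w, float(w @ S @ w)

# Simulated data with one dominant direction (an assumption for the demo).
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4)) * np.array([3.0, 1.0, 0.5, 0.2])
S = np.cov(X, rowvar=False)
w, variance = power_method(S)
```

The returned direction `w` maximizes the projected variance over unit vectors, which is the first principal component. A different projection index would change the update step, but the normalize-and-iterate structure is the part this sketch is meant to convey.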

arXiv.org e-Print Archive

Ghent University Academic Bibliography

Explore Bristol Research

SIDE: a web app for interactive visual data exploration with subjective feedback

Authors: De Bie Tijl, Kang Bo, Lijffijt Jefrey, Puolamäki Kai
Publication date: 01/01/2016

Ghent University Academic Bibliography

MaxSkew and MultiSkew: Two R Packages for Detecting, Measuring and Removing Multivariate Skewness

Authors: Franceschini Cinzia, Loperfido Nicola
Publication date: 25/03/2019

Skewness plays a relevant role in several multivariate statistical techniques. Sometimes it is used to recover data features, as in cluster analysis. In other circumstances, skewness impairs the performance of statistical methods, as in Hotelling's one-sample test. In both cases, there is a need to check the symmetry of the underlying distribution, either by visual inspection or by formal testing. The R packages MaxSkew and MultiSkew address these issues by measuring, testing and removing skewness from multivariate data. Skewness is assessed by the third multivariate cumulant and its functions. The hypothesis of symmetry is tested either nonparametrically, with the bootstrap, or parametrically, under the normality assumption. Skewness is removed or at least alleviated by projecting the data onto appropriate linear subspaces. The use of MaxSkew and MultiSkew is illustrated with the Iris dataset.
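To make the idea of measuring multivariate skewness concrete, the sketch below computes Mardia's skewness, a well-known scalar function of the third standardized moments. It is written in Python for illustration and does not reproduce the API or internals of the MaxSkew or MultiSkew R packages; the function name `mardia_skewness` and the simulated data are assumptions.

```python
import numpy as np

def mardia_skewness(X):
    """Mardia's multivariate skewness:
    (1 / n^2) * sum_{i,j} ((x_i - m)^T S^{-1} (x_j - m))^3,
    with m the sample mean and S the covariance with divisor n."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    # G[i, j] = (x_i - m)^T S^{-1} (x_j - m)
    G = Xc @ np.linalg.solve(S, Xc.T)
    return float(np.sum(G ** 3) / n ** 2)

rng = np.random.default_rng(0)

# A sample that is exactly symmetric about the origin: every point has a mirror.
half = rng.normal(size=(250, 2))
X_sym = np.vstack([half, -half])

# A clearly skewed sample with independent exponential coordinates.
X_skew = rng.exponential(size=(500, 2))

b_sym = mardia_skewness(X_sym)
b_skew = mardia_skewness(X_skew)
```

The mirrored sample gives a value of zero up to rounding, while the exponential sample gives a clearly positive value. A formal test of symmetry would compare such a statistic with a reference distribution, for example one obtained by bootstrap resampling as the abstract describes.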

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Urbino