11,278 research outputs found

Dimension Reduction by Mutual Information Discriminant Analysis

Author: Shadvar Ali
Publication date: 01/01/2012

In the past few decades, researchers have proposed many discriminant analysis (DA) algorithms for the study of high-dimensional data in a variety of problems. Most DA algorithms for feature extraction are based on transformations that simultaneously maximize the between-class scatter and minimize the within-class scatter matrices. This paper presents a novel DA algorithm for feature extraction using mutual information (MI). However, it is not always easy to obtain an accurate estimation for high-dimensional MI. In this paper, we propose an efficient method for feature extraction that is based on one-dimensional MI estimations. We will refer to this algorithm as mutual information discriminant analysis (MIDA). The performance of this proposed method was evaluated using UCI databases. The results indicate that MIDA provides robust performance over different data sets with different characteristics and that MIDA always performs better than, or at least comparable to, the best performing algorithms.

Comment: 13 pages, 3 tables, International Journal of Artificial Intelligence & Application
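To make the idea of a one-dimensional MI estimate concrete, here is a minimal sketch that scores a single feature against the class labels. The abstract does not say which estimator MIDA uses or how the extracted features are searched for, so the histogram plug-in estimator, the function name `mutual_information_1d` and the `bins` parameter are all assumptions made for illustration, not the paper's method.

```python
import numpy as np

def mutual_information_1d(feature, labels, bins=10):
    """Plug-in estimate (in nats) of the MI between one feature and class labels.

    Assumption: an equal-width histogram discretizes the feature; this is an
    illustrative choice, not the estimator used in the paper.
    """
    feature = np.asarray(feature, dtype=float).ravel()
    labels = np.asarray(labels).ravel()

    # Discretize the feature into `bins` equal-width bins (indices 0..bins-1).
    edges = np.histogram_bin_edges(feature, bins=bins)
    x_idx = np.digitize(feature, edges[1:-1])

    # Map arbitrary class labels to indices 0..n_classes-1.
    classes, y_idx = np.unique(labels, return_inverse=True)

    # Joint frequency table, normalized to a joint probability table.
    joint = np.zeros((bins, len(classes)))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()

    # Marginals and the MI sum over non-empty cells.
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nonzero = joint > 0
    return float(np.sum(joint[nonzero] * np.log(joint[nonzero] / (px @ py)[nonzero])))
```

A feature that separates two balanced classes perfectly scores log 2 nats, while a feature that is independent of the labels scores zero. Because each estimate involves only a scalar feature and a discrete label, it avoids the high-dimensional density estimation that the abstract identifies as the difficulty.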

arXiv.org e-Print Archive

CiteSeerX

Optimal projection of observations in a Bayesian setting

Author: Giraldi Loïc
Hoteit Ibrahim
Knio Omar M.
Maître Olivier P. Le
Publication date: 12/02/2018

Optimal dimensionality reduction methods are proposed for the Bayesian inference of a Gaussian linear model with additive noise in the presence of overabundant data. Three different optimal projections of the observations are proposed based on information theory: the projection that minimizes the Kullback-Leibler divergence between the posterior distributions of the original and the projected models, the one that minimizes the expected Kullback-Leibler divergence between the same distributions, and the one that maximizes the mutual information between the parameter of interest and the projected observations. The first two optimization problems are formulated as the determination of an optimal subspace and therefore the solution is computed using Riemannian optimization algorithms on the Grassmann manifold. Regarding the maximization of the mutual information, it is shown that there exists an optimal subspace that minimizes the entropy of the posterior distribution of the reduced model; a basis of the subspace can be computed as the solution to a generalized eigenvalue problem; an a priori error estimate on the mutual information is available for this particular solution; and that the dimensionality of the subspace to exactly conserve the mutual information between the input and the output of the models is less than the number of parameters to be inferred. Numerical applications to linear and nonlinear models are used to assess the efficiency of the proposed approaches, and to highlight their advantages compared to standard approaches based on the principal component analysis of the observations.
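The mutual-information projection can be illustrated with a small numerical sketch. It assumes the model y = G x + noise with a zero-mean Gaussian prior of covariance Cx and Gaussian noise of covariance Cn, and uses the standard closed form for Gaussian mutual information. The abstract does not spell out the generalized eigenvalue problem, so the pencil (G Cx Gᵀ, Cn) used here, along with the names `gaussian_mi` and `mi_optimal_projection`, is an assumption made for illustration.

```python
import numpy as np

def gaussian_mi(G, Cx, Cn, W=None):
    """MI (in nats) between x and W^T y for y = G x + noise, all Gaussian."""
    if W is None:
        W = np.eye(G.shape[0])          # no projection: use the full observations
    signal = W.T @ G @ Cx @ G.T @ W     # covariance contributed by the parameters
    noise = W.T @ Cn @ W                # covariance contributed by the noise
    _, logdet_total = np.linalg.slogdet(signal + noise)
    _, logdet_noise = np.linalg.slogdet(noise)
    return 0.5 * (logdet_total - logdet_noise)

def mi_optimal_projection(G, Cx, Cn, r):
    """Top-r generalized eigenvectors of (G Cx G^T) w = lambda Cn w.

    Assumed formulation: whiten the noise with a Cholesky factor, solve a
    symmetric eigenproblem, then map the eigenvectors back.
    """
    L = np.linalg.cholesky(Cn)
    L_inv = np.linalg.inv(L)
    A = L_inv @ G @ Cx @ G.T @ L_inv.T
    A = 0.5 * (A + A.T)                 # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(vals)[::-1][:r]    # indices of the r largest eigenvalues
    return L_inv.T @ vecs[:, top]       # columns satisfy W^T Cn W = I

# Hypothetical example: 20 observations of 3 parameters.
rng = np.random.default_rng(0)
G = rng.standard_normal((20, 3))
Cx = np.eye(3)
Cn = np.diag(rng.uniform(0.5, 2.0, size=20))

mi_full = gaussian_mi(G, Cx, Cn)
mi_r3 = gaussian_mi(G, Cx, Cn, mi_optimal_projection(G, Cx, Cn, 3))
mi_r1 = gaussian_mi(G, Cx, Cn, mi_optimal_projection(G, Cx, Cn, 1))
```

In this setting G Cx Gᵀ has rank at most 3, so at most three generalized eigenvalues are nonzero and projecting the 20 observations onto three directions already recovers the full mutual information, while a one-dimensional projection retains only part of it. This is consistent with the abstract's statement that the subspace dimension needed to conserve the mutual information is bounded by the number of parameters.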

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Positive semi-definite embedding for dimensionality reduction and out-of-sample extensions

Author: Aspeel Antoine
Delvenne Jean-Charles
Fanuel Michaël
Suykens Johan A. K.
Publication date: 06/10/2020

In machine learning or statistics, it is often desirable to reduce the dimensionality of a sample of data points in a high dimensional space $\mathbb{R}^d$. This paper introduces a dimensionality reduction method where the embedding coordinates are the eigenvectors of a positive semi-definite kernel obtained as the solution of an infinite dimensional analogue of a semi-definite program. This embedding is adaptive and non-linear. A main feature of our approach is the existence of a non-linear out-of-sample extension formula of the embedding coordinates, called a projected Nyström approximation. This extrapolation formula yields an extension of the kernel matrix to a data-dependent Mercer kernel function. Our empirical results indicate that this embedding method is more robust with respect to the influence of outliers, compared with a spectral embedding method.

Comment: 16 pages, 5 figures. Improved presentatio
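For readers unfamiliar with out-of-sample extensions, the sketch below shows the classical Nyström formula for a kernel eigenvector embedding. It is not the projected Nyström approximation introduced in the paper, whose details the abstract does not give. The RBF kernel, the `gamma` parameter and all function names are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def fit_embedding(X, k, gamma=1.0):
    """Top-k eigenpairs of the training kernel matrix (embedding coordinates)."""
    K = rbf_kernel(X, X, gamma)
    vals, vecs = np.linalg.eigh(K)
    top = np.argsort(vals)[::-1][:k]
    return vals[top], vecs[:, top]

def nystrom_extend(X_train, X_new, vals, vecs, gamma=1.0):
    """Classical Nystrom extension: phi_i(x) = (1 / lambda_i) * sum_j k(x, x_j) u_ij."""
    return rbf_kernel(X_new, X_train, gamma) @ vecs / vals

# Hypothetical data: 30 training points and 5 new points in R^4.
rng = np.random.default_rng(1)
X_train = rng.standard_normal((30, 4))
X_new = rng.standard_normal((5, 4))

vals, vecs = fit_embedding(X_train, k=3, gamma=0.5)
coords_new = nystrom_extend(X_train, X_new, vals, vecs, gamma=0.5)
```

Evaluated at a training point, the formula returns that point's original eigenvector coordinates, which is the property that makes it an extension of the embedding and not a separate model.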

arXiv.org e-Print Archive

HAL Descartes

DIAL UCLouvain

Hal-Diderot