Variable selection for correlated data in high dimension using decorrelation methods

Causeur, David; Friguet, Chloé; Perthame, Emeline; Sheu, Ching-Fan

Authors: David Causeur, Chloé Friguet, Emeline Perthame, Ching-Fan Sheu
Publication date: 7 April 2016
Publisher: HAL CCSD

Abstract

The analysis of high-throughput data has renewed the statistical methodology for feature selection. Such data are characterized by both their high dimension and their heterogeneity, as the true signal and several confounding factors are often observed at the same time. In such a framework, the usual statistical approaches are called into question and can lead to misleading decisions, since they were originally designed under an assumption of independence among variables. In this talk, I will present some improvements to variable selection methods for regression and supervised classification problems that account for the dependence between selection statistics. The proposed methods are based on a factor model of the covariates, which assumes that the variables are conditionally independent given a vector of latent variables. I will first illustrate the impact of dependence on the stability of some usual selection procedures. Next, I will focus on the analysis of event-related potential (ERP) data, which are widely collected in psychological research to determine the time courses of mental events. Such data are characterized by a strong and complex temporal dependence pattern, which can be modeled by the factor model mentioned above.
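
To make the factor-model idea in the abstract concrete, here is a minimal sketch in Python. It is not the authors' procedure. The abstract does not say how the factor model is estimated, so this example swaps in a plain principal-component approximation of the common factors. The function names, the simulated loadings, and the sizes n, m and q are all hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_factor_data(n, m, q, rng):
    """Simulate n observations of m variables from x = B z + e.

    Assumption for illustration: standard normal loadings, factors and noise.
    Given z, the m variables are independent, as in a factor model.
    """
    B = rng.normal(size=(m, q))   # loadings (hypothetical values)
    Z = rng.normal(size=(n, q))   # latent factors shared by all variables
    E = rng.normal(size=(n, m))   # independent variable-specific noise
    return Z @ B.T + E


def decorrelate(X, q):
    """Subtract the q leading principal components from centered data.

    This is a crude stand-in for removing the estimated common factors;
    it is not the estimation method used in the talk.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    common = (U[:, :q] * s[:q]) @ Vt[:q]   # rank-q approximation of Xc
    return Xc - common


def mean_abs_offdiag_corr(X):
    """Average absolute correlation between distinct variables."""
    R = np.corrcoef(X, rowvar=False)
    off_diagonal = R[~np.eye(R.shape[0], dtype=bool)]
    return float(np.abs(off_diagonal).mean())


rng = np.random.default_rng(0)
X = simulate_factor_data(n=200, m=50, q=2, rng=rng)
X_dec = decorrelate(X, q=2)

before = mean_abs_offdiag_corr(X)
after = mean_abs_offdiag_corr(X_dec)
print(f"mean |correlation| before: {before:.3f}, after: {after:.3f}")
```

In this simulated setting the average absolute correlation between variables drops once the common component is removed, which is the property a decorrelation step relies on before selection statistics are computed. A real analysis would need to choose the number of factors and estimate the model with a method suited to the data.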

Available Versions

INRIA a CCSD electronic archive server

oai:HAL:hal-01310571v1

Last time updated on 18/12/2020