135 research outputs found

Regression modeling on stratified data with the lasso

Authors: Ollier Edouard, Viallon Vivian
Publication date: 08/11/2016

We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice. We propose a refined approach that bypasses this arbitrary choice, at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. Results from an empirical study confirm that our proposal generally outperforms the basic approach in the identification and description of the interactions. An illustration on gene expression data is provided. Comment: 23 pages, 5 figures

arXiv.org e-Print Archive

HAL-ENS-LYON
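
The reference-stratum idea in the abstract above can be written down as a short worked formulation. This is only a sketch of one plausible reading: the notation ($K$ strata, reference stratum $r$, stratum label $k(i)$, deviations $\delta_k$) and the single tuning parameter $\lambda$ are assumptions for illustration and are not taken from the paper.

```latex
% Assumed notation: observation i belongs to stratum k(i) in {1, ..., K};
% r is the (arbitrarily chosen) reference stratum, with delta_r = 0.
\begin{align*}
  y_i &= x_i^\top \bigl(\beta_r + \delta_{k(i)}\bigr) + \varepsilon_i,
        \qquad \delta_r = 0, \\
  (\hat\beta_r, \hat\delta) &\in \operatorname*{arg\,min}_{\beta_r,\,\delta}\;
        \frac{1}{2n} \sum_{i=1}^{n}
        \bigl(y_i - x_i^\top(\beta_r + \delta_{k(i)})\bigr)^2
        + \lambda \Bigl( \lVert \beta_r \rVert_1
        + \sum_{k \neq r} \lVert \delta_k \rVert_1 \Bigr).
\end{align*}
```

Under this reading, a nonzero entry $\hat\delta_{k,j}$ flags an interaction between stratum $k$ and predictor $j$, and changing $r$ changes which deviations are penalized, which is consistent with the abstract's remark that performance depends on the choice of reference stratum.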

Analysis of Testing-Based Forward Model Selection

Author: Kozbur Damian
Publication date: 06/04/2020

This paper introduces and analyzes a procedure called Testing-based forward model selection (TBFMS) in linear regression problems. This procedure inductively selects covariates that add predictive power into a working statistical model before estimating a final regression. The criterion for deciding which covariate to include next and when to stop including covariates is derived from a profile of traditional statistical hypothesis tests. This paper proves probabilistic bounds, which depend on the quality of the tests, for prediction error and the number of selected covariates. As an example, the bounds are then specialized to a case with heteroskedastic data, with tests constructed with the help of Huber-Eicker-White standard errors. Under the assumed regularity conditions, these tests lead to estimation convergence rates matching other common high-dimensional estimators including Lasso.

arXiv.org e-Print Archive

ZORA
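
The forward selection loop described in the TBFMS abstract above can be illustrated with a small sketch. It is not the paper's procedure: the function names (`forward_select`, `robust_t`), the fixed cutoff `threshold`, the absence of an intercept, and the use of a plain HC0 (White) t-statistic as the test are all assumptions made for illustration. Each step picks the candidate covariate with the largest robust |t| given the covariates already selected, and stops when no candidate exceeds the cutoff.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residualize(v, u):
    """Return v minus its projection onto u."""
    scale = dot(v, u) / dot(u, u)
    return [a - scale * b for a, b in zip(v, u)]


def robust_t(x, y):
    """HC0 t-statistic for the slope of y on x (no intercept).

    Assumes x is not identically zero and the fit is not exact.
    """
    sxx = dot(x, x)
    beta = dot(x, y) / sxx
    var = sum((xi * (yi - beta * xi)) ** 2 for xi, yi in zip(x, y)) / sxx ** 2
    return beta / math.sqrt(var)


def forward_select(columns, y, threshold=4.0):
    """Greedy forward selection; `columns` is a list of covariate columns.

    Candidates and the response are kept residualized against the selected
    covariates, so each simple regression below yields the same coefficient
    and HC0 standard error as the full multiple regression would.
    """
    remaining = {j: list(c) for j, c in enumerate(columns)}
    resid_y = list(y)
    selected = []
    while remaining:
        scores = {j: abs(robust_t(c, resid_y)) for j, c in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] <= threshold:
            break  # no remaining covariate passes the test: stop
        u = remaining.pop(best)
        selected.append(best)
        resid_y = residualize(resid_y, u)
        remaining = {j: residualize(c, u) for j, c in remaining.items()}
    return selected


if __name__ == "__main__":
    random.seed(0)
    n, p = 400, 6
    cols = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [2 * cols[1][i] - 3 * cols[4][i] + random.gauss(0, 1) for i in range(n)]
    print(forward_select(cols, y))
```

The sketch returns only the indices of the selected covariates; the final regression mentioned in the abstract would then be an ordinary least squares fit on those columns.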

Weighted-Lasso for Structured Network Inference from Time Course Data

Authors: Ambroise Christophe, Charbonnier Camille, Chiquet Julien
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 09/12/2009

We present a weighted-Lasso method to infer the parameters of a first-order vector auto-regressive model that describes time course expression data generated by directed gene-to-gene regulation networks. These networks are assumed to have a prior internal connectivity structure, which drives the inference method. This prior structure can be either derived from prior biological knowledge or inferred by the method itself. We illustrate the performance of this structure-based penalization on synthetic data and on two canonical regulatory networks: first, the yeast cell cycle regulation network, by analyzing Spellman et al.'s dataset; and second, the E. coli S.O.S. DNA repair network, by analyzing data from U. Alon's lab.

arXiv.org e-Print Archive

HAL - Normandie Université

HAL Evry

Crossref

HAL Descartes
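
The weighted-Lasso estimation for a first-order vector auto-regressive model, as described in the abstract above, can be sketched as a penalized least-squares problem. The notation below ($p$ genes, $T$ time points, coefficient matrix $A$, weights $w_{ij}$) is assumed for illustration, and the paper's exact criterion may differ, for example in its likelihood or in how the weights are built from the prior structure.

```latex
% Assumed notation: X_t in R^p is the expression vector at time t,
% A is the p x p matrix of directed regulation effects,
% w_{ij} >= 0 are penalty weights encoding the prior structure.
\begin{align*}
  X_t &= A\,X_{t-1} + \varepsilon_t, \qquad t = 2, \dots, T, \\
  \hat A &\in \operatorname*{arg\,min}_{A \in \mathbb{R}^{p \times p}}\;
        \frac{1}{2} \sum_{t=2}^{T} \lVert X_t - A\,X_{t-1} \rVert_2^2
        + \lambda \sum_{i=1}^{p} \sum_{j=1}^{p} w_{ij}\,\lvert A_{ij} \rvert .
\end{align*}
```

In this form, a smaller weight $w_{ij}$ makes the edge from gene $j$ to gene $i$ cheaper to include, which is one way a prior connectivity structure could steer the inferred network.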