Beyond Support in Two-Stage Variable Selection
Numerous variable selection methods rely on a two-stage procedure, where a
sparsity-inducing penalty is used in the first stage to predict the support,
which is then conveyed to the second stage for estimation or inference
purposes. In this framework, the first stage screens variables to find a set of
possibly relevant variables and the second stage operates on this set of
candidate variables, to improve estimation accuracy or to assess the
uncertainty associated with the selection of variables. We advocate that more
information can be conveyed from the first stage to the second one: we use the
magnitude of the coefficients estimated in the first stage to define an
adaptive penalty that is applied at the second stage. We give two examples of
procedures that can benefit from the proposed transfer of information, in
estimation and inference problems respectively. Extensive simulations
demonstrate that this transfer is particularly efficient when each stage
operates on distinct subsamples. This separation plays a crucial role for the
computation of calibrated p-values, allowing control of the False Discovery
Rate. In this setup, the proposed transfer results in sensitivity gains ranging
from 50% to 100% compared to state-of-the-art procedures.
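The transfer described above can be sketched with scikit-learn. Everything below is an illustrative assumption rather than the authors' exact procedure: the first stage is a plain Lasso, the two stages run on disjoint subsamples as the abstract recommends, and the adaptive second-stage penalty is implemented by rescaling each retained column by its first-stage magnitude (a standard adaptive-Lasso device), with arbitrary tuning values.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Toy data: 200 samples, 50 features, 5 truly active coefficients.
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, -1.0, 0.5]
y = X @ beta + rng.standard_normal(n)

# Disjoint subsamples: one per stage, as advocated in the abstract.
X1, y1 = X[:n // 2], y[:n // 2]
X2, y2 = X[n // 2:], y[n // 2:]

# Stage 1: sparsity-inducing penalty (here a Lasso) on the first subsample.
stage1 = Lasso(alpha=0.1).fit(X1, y1)
w = np.abs(stage1.coef_)  # coefficient magnitudes, not just the support

# Stage 2: adaptive penalty. Rescaling each retained column by its stage-1
# magnitude is equivalent to a Lasso whose per-coefficient penalty is
# alpha / w_j, so variables with large first-stage coefficients are
# penalized less.
active = w > 0
X2w = X2[:, active] * w[active]
stage2 = Lasso(alpha=0.1).fit(X2w, y2)

# Map back to the original parameterization.
beta_hat = np.zeros(p)
beta_hat[active] = stage2.coef_ * w[active]
```

Passing only the support to stage 2 would amount to setting all nonzero weights to one; carrying the magnitudes is what distinguishes the transfer sketched here.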
Banking the unbanked: the Mzansi intervention in South Africa
Purpose
This paper aims to understand households' latent decision making in accessing financial services. In this analysis, we look at the determinants of the choice of the pre-entry Mzansi account by consumers in South Africa.
Design/methodology/approach
We use 102 variables, grouped in the following categories: basic literacy, understanding financial terms, targets for financial advice, desired financial education and financial perception. Employing a computationally efficient variable selection algorithm, we study which variables can satisfactorily explain the choice of a Mzansi account.
Findings
The Mzansi intervention is appealing to individuals with basic but insufficient financial education. Aspirations seem to be very influential in revealing the choice of financial services, and to this end Mzansi is perceived as a pre-entry account that does not meet the aspirations of individuals aiming to climb the financial services ladder. We find that Mzansi holders view the account mainly as a vehicle for receiving payments, but are, on the other hand, debt-averse and inclined to save. Hence, although there is at present no concrete evidence that the Mzansi intervention increases access to finance via diversification (i.e. by recruiting customers into higher-level accounts and services), our analysis shows that this is very likely to be the case.
Originality/value
The issue of demand-side constraints on access to finance has been largely ignored in the theoretical and empirical literature. This paper undertakes some preliminary steps in addressing this gap.
Sparsity with sign-coherent groups of variables via the cooperative-Lasso
We consider the problems of estimation and selection of parameters endowed
with a known group structure, when the groups are assumed to be sign-coherent,
that is, gathering either nonnegative, nonpositive or null parameters. To
tackle this problem, we propose the cooperative-Lasso penalty. We derive the
optimality conditions defining the cooperative-Lasso estimate for generalized
linear models, and propose an efficient active set algorithm suited to
high-dimensional problems. We study the asymptotic consistency of the estimator
in the linear regression setup and derive its irrepresentable conditions, which
are milder than those of the group-Lasso regarding the matching of groups
with the sparsity pattern of the true parameters. We also address the problem
of model selection in linear regression by deriving an approximation of the
degrees of freedom of the cooperative-Lasso estimator. Simulations comparing
the proposed estimator to the group-Lasso and sparse group-Lasso corroborate our
theoretical results, showing consistent improvements in support recovery for
sign-coherent groups. We finally propose two examples illustrating the wide
applicability of the cooperative-Lasso: first to the processing of ordinal
variables, where the penalty acts as a monotonicity prior; second to the
processing of genomic data, where the set of differentially expressed probes is
enriched by incorporating all the probes of the microarray that are related to
the corresponding genes.Comment: Published in at http://dx.doi.org/10.1214/11-AOAS520 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
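A minimal sketch of the penalty itself may help: for each group, the ℓ2 norms of the positive and negative parts of the coefficients are penalized separately, so sign-coherent groups pay the same as under the group-Lasso while mixed-sign groups pay more. The group structure and values below are made up for illustration, and the efficient active-set fitting algorithm from the paper is not shown.

```python
import numpy as np

def coop_lasso_penalty(beta, groups):
    """Cooperative-Lasso penalty: sum over groups of the l2 norms of the
    positive and negative parts, favoring sign-coherent groups (all >= 0
    or all <= 0) over mixed-sign groups."""
    total = 0.0
    for idx in groups:
        b = beta[idx]
        total += np.linalg.norm(np.maximum(b, 0.0))   # positive part
        total += np.linalg.norm(np.maximum(-b, 0.0))  # negative part
    return total

# Two hypothetical groups of coefficients.
groups = [np.array([0, 1, 2]), np.array([3, 4])]
coherent = np.array([1.0, 2.0, 2.0, -1.0, -1.0])  # each group sign-coherent
mixed = np.array([1.0, -2.0, 2.0, -1.0, 1.0])     # same magnitudes, mixed signs

# For the coherent vector one of the two parts vanishes in each group, so
# the penalty reduces to the group-Lasso value; the mixed vector is charged
# for both parts and costs strictly more.
```

This extra cost on mixed-sign groups is exactly the prior the abstract describes, e.g. acting as a monotonicity prior on ordinal variables.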
A General Framework for Fast Stagewise Algorithms
Forward stagewise regression follows a very simple strategy for constructing
a sequence of sparse regression estimates: it starts with all coefficients
equal to zero, and iteratively updates the coefficient (by a small amount
ε) of the variable that achieves the maximal absolute inner product
with the current residual. This procedure has an interesting connection to the
lasso: under some conditions, it is known that the sequence of forward
stagewise estimates exactly coincides with the lasso path, as the step size
ε goes to zero. Furthermore, essentially the same equivalence holds
outside of least squares regression, with the minimization of a differentiable
convex loss function subject to an ℓ1 norm constraint (the stagewise
algorithm now updates the coefficient corresponding to the maximal absolute
component of the gradient).
Even when they do not match their ℓ1-constrained analogues, stagewise
estimates provide a useful approximation, and are computationally appealing.
Their success in sparse modeling motivates the question: can a simple,
effective strategy like forward stagewise be applied more broadly in other
regularization settings, beyond the ℓ1 norm and sparsity? The current
paper is an attempt to do just this. We present a general framework for
stagewise estimation, which yields fast algorithms for problems such as
group-structured learning, matrix completion, image denoising, and more.
Comment: 56 pages, 15 figures
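Plain forward stagewise, as described in the opening paragraph, takes only a few lines of NumPy; the data, step size ε, and number of steps below are illustrative choices, not values from the paper.

```python
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=2000):
    """Forward stagewise regression: starting from zero, repeatedly nudge
    (by eps) the coefficient of the variable with the maximal absolute
    inner product with the current residual."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y.copy()
    for _ in range(n_steps):
        c = X.T @ r                       # inner products with residual
        j = np.argmax(np.abs(c))          # most correlated variable
        step = eps * np.sign(c[j])        # small update toward the residual
        beta[j] += step
        r -= step * X[:, j]
    return beta

# Toy sparse regression problem.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
beta_true = np.zeros(10)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(100)
beta_hat = forward_stagewise(X, y)
```

Recording beta after every step traces out the stagewise path, which (per the connection above) approaches the lasso path as eps shrinks.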
Time-Varying Parameters as Ridge Regressions
Time-varying parameter (TVP) models are frequently used in economics to
model structural change. I show that they are in fact ridge regressions.
Instantly, this makes computations, tuning, and implementation much easier than
in the state-space paradigm. Among other things, solving the equivalent dual
ridge problem is computationally very fast even in high dimensions, and the
crucial "amount of time variation" is tuned by cross-validation. Evolving
volatility is dealt with using a two-step ridge regression. I consider
extensions that incorporate sparsity (the algorithm selects which parameters
vary and which do not) and reduced-rank restrictions (variation is tied to a
factor model). To demonstrate the usefulness of the approach, I use it to study
the evolution of monetary policy in Canada. The application requires the
estimation of about 4600 TVPs, a task well within the reach of the new method.
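A sketch of the ridge equivalence for a single time-varying coefficient follows. It is a stripped-down illustration under stated assumptions: a scalar random-walk coefficient, a fixed hand-picked λ instead of cross-validation, and none of the paper's refinements (the dual solver, the volatility step, sparsity or reduced-rank extensions). Note that the first increment u_1 absorbs the initial level and is shrunk along with the rest here.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200

# Scalar TVP model: y_t = x_t * beta_t + e_t, with beta_t a random walk.
x = rng.standard_normal(T)
beta = 1.0 + np.cumsum(0.05 * rng.standard_normal(T))
y = x * beta + 0.1 * rng.standard_normal(T)

# Rewriting beta_t = sum_{s<=t} u_s turns the model into y = Z u + e with
# Z = diag(x) @ L, where L is the lower-triangular matrix of ones. A ridge
# penalty on the increments u then controls the "amount of time variation".
L = np.tril(np.ones((T, T)))
Z = x[:, None] * L
lam = 1.0  # illustrative; the paper tunes this by cross-validation
u_hat = np.linalg.solve(Z.T @ Z + lam * np.eye(T), Z.T @ y)
beta_hat = L @ u_hat  # recovered time-varying coefficient path
```

As λ grows, the increments are shrunk toward zero and beta_hat collapses toward a constant-coefficient regression; as λ shrinks, the path becomes more flexible.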
Recovering edges in ill-posed inverse problems: optimality of curvelet frames
We consider a model problem of recovering a function from noisy Radon data. The function to be recovered is assumed smooth apart from a discontinuity along a curve, that is, an edge. We use the continuum white-noise model, with noise level ε.
Traditional linear methods for solving such inverse problems behave poorly in the presence of edges. Qualitatively, the reconstructions are blurred near the edges; quantitatively, they give in our model mean squared errors (MSEs) that tend to zero with noise level ε only as O(ε^{1/2}) as ε → 0. A recent innovation--nonlinear shrinkage in the wavelet domain--visually improves edge sharpness and improves MSE convergence to O(ε^{2/3}). However, as we show here, this rate is not optimal.
In fact, essentially optimal performance is obtained by deploying the recently-introduced tight frames of curvelets in this setting. Curvelets are smooth, highly anisotropic elements ideally suited for detecting and synthesizing curved edges. To deploy them in the Radon setting, we construct a curvelet-based biorthogonal decomposition of the Radon operator and build "curvelet shrinkage" estimators based on thresholding of the noisy curvelet coefficients. In effect, the estimator detects edges at certain locations and orientations in the Radon domain and automatically synthesizes edges at corresponding locations and directions in the original domain.
We prove that the curvelet shrinkage can be tuned so that the estimator will attain, within logarithmic factors, the MSE O(ε^{4/5}) as noise level ε → 0. This rate of convergence holds uniformly over a class of functions which are C² except for discontinuities along C² curves, and (except for log terms) is the minimax rate for that class. Our approach is an instance of a general strategy which should apply in other inverse problems; we sketch a deconvolution example.
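The curvelet transform itself is beyond a short sketch, but the shrinkage rule at the heart of such estimators, thresholding noisy coefficients at a level proportional to the noise, can be illustrated generically. The sparse coefficient vector below stands in for a curvelet expansion; this is not the authors' biorthogonal decomposition of the Radon operator, and the universal threshold used is a common generic choice, not the paper's tuning.

```python
import numpy as np

def soft_threshold(coeffs, thresh):
    """Soft-thresholding shrinkage: coefficients below the threshold
    (mostly noise) are set to zero; the rest are pulled toward zero
    by the threshold amount."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thresh, 0.0)

rng = np.random.default_rng(0)
eps = 0.1                                   # noise level
true = np.zeros(1000)
true[:10] = 5.0                             # a few large "edge" coefficients
noisy = true + eps * rng.standard_normal(1000)

# Threshold proportional to the noise level; eps * sqrt(2 log n) is the
# usual universal choice for n coefficients.
est = soft_threshold(noisy, eps * np.sqrt(2 * np.log(1000)))
```

Because the signal is concentrated in a few large coefficients while the noise is spread over all of them, thresholding kills almost all pure-noise coordinates while retaining the edges, which is what drives the improved MSE rates quoted above.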