Adaptive Fused LASSO in Grouped Quantile Regression
This paper considers a quantile regression model with grouped explanatory
variables. In order to obtain sparsity both at the level of parameter groups
and between two successive groups of variables, we propose and study an
adaptive fused group LASSO quantile estimator. The number of variable groups
can be fixed or divergent. We derive the convergence rate under classical
assumptions and show that the proposed estimator satisfies the oracle properties.
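The objective described above can be written down numerically. A minimal sketch (the function name and default weights are illustrative, not the paper's notation; the optimization routine itself is omitted): the criterion combines the quantile check loss, an adaptive group-lasso term, and an adaptive fusion term between successive groups, assuming equal group sizes so that differences of group coefficients are well defined.

```python
import numpy as np

def check_loss(r, tau):
    # Quantile check loss: rho_tau(r) = r * (tau - 1{r < 0})
    return np.sum(r * (tau - (r < 0.0)))

def afgl_quantile_objective(beta_groups, X_groups, y, tau,
                            lam1=1.0, lam2=1.0, w1=None, w2=None):
    # Check loss + adaptive group lasso + adaptive fusion of successive groups.
    # w1, w2 are the adaptive weights (defaulting to 1, i.e. non-adaptive).
    G = len(beta_groups)
    w1 = np.ones(G) if w1 is None else w1
    w2 = np.ones(G - 1) if w2 is None else w2
    fit = sum(X @ b for X, b in zip(X_groups, beta_groups))
    loss = check_loss(y - fit, tau)
    group_pen = lam1 * sum(w1[g] * np.linalg.norm(beta_groups[g])
                           for g in range(G))
    fusion_pen = lam2 * sum(w2[g] * np.linalg.norm(beta_groups[g + 1] - beta_groups[g])
                            for g in range(G - 1))
    return loss + group_pen + fusion_pen
```

Setting the fusion weights adaptively (e.g. from a pilot estimate) is what gives the estimator its oracle behavior; with both penalties at zero this reduces to ordinary quantile regression.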
Feature selection guided by structural information
In generalized linear regression problems with an abundant number of
features, lasso-type regularization which imposes an ℓ1-constraint on the
regression coefficients has become a widely established technique. Deficiencies
of the lasso in certain scenarios, notably strongly correlated design, were
unmasked when Zou and Hastie [J. Roy. Statist. Soc. Ser. B 67 (2005) 301--320]
introduced the elastic net. In this paper we propose to extend the elastic net
by admitting general nonnegative quadratic constraints as a second form of
regularization. The generalized ridge-type constraint will typically make use
of the known association structure of features, for example, by using temporal-
or spatial closeness. We study properties of the resulting "structured elastic
net" regression estimation procedure, including basic asymptotics and the issue
of model selection consistency. In this vein, we provide an analog to the
so-called "irrepresentable condition" which holds for the lasso. Moreover, we
outline algorithmic solutions for the structured elastic net within the
generalized linear model family. The rationale and the performance of our
approach is illustrated by means of simulated and real world data, with a focus
on signal regression.
Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics, http://dx.doi.org/10.1214/09-AOAS302
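The ridge-type side of the structured elastic net can be made concrete. A sketch under stated assumptions (function names are mine; this is the λ1 = 0 special case, which has a closed form, not the full ℓ1-penalized procedure): for temporally ordered features, the quadratic constraint can use the Laplacian of a path graph, so that the penalty is the sum of squared differences of neighboring coefficients.

```python
import numpy as np

def chain_laplacian(p):
    # Laplacian of a path graph over p ordered features: encodes temporal or
    # spatial closeness, so that b' L b = sum_j (b[j+1] - b[j])^2.
    D = np.diff(np.eye(p), axis=0)   # first-difference matrix, (p-1) x p
    return D.T @ D

def structured_ridge(X, y, lam2, L):
    # lam1 = 0 special case of the structured elastic net:
    # argmin ||y - X b||^2 + lam2 * b' L b  solves a linear system.
    return np.linalg.solve(X.T @ X + lam2 * L, X.T @ y)
```

Adding the ℓ1 term on top of this quadratic penalty requires an iterative solver (e.g. coordinate descent), which is what the paper's algorithmic section addresses for the generalized linear model family.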
Locally Adaptive Shrinkage Priors for Trends and Breaks in Count Time Series
Non-stationary count time series characterized by features such as abrupt
changes and fluctuations about the trend arise in many scientific domains,
including biophysics, ecology, energy, epidemiology, and social science.
Current approaches for integer-valued time series lack the flexibility
to capture local transient features while more flexible models for continuous
data types are inadequate for universal applications to integer-valued
responses such as settings with small counts. We present a modeling framework,
the negative binomial Bayesian trend filter (NB-BTF), that offers an adaptive
model-based solution to capturing multiscale features with valid integer-valued
inference for trend filtering. The framework is a hierarchical Bayesian model
with a dynamic global-local shrinkage process. The flexibility of the
global-local process allows for the necessary local regularization while the
temporal dependence induces a locally smooth trend. In simulation, the NB-BTF
outperforms a number of alternative trend filtering methods. Then, we
demonstrate the method on weekly power outage frequency in Massachusetts
townships. Power outage frequency is characterized by a nominal low level with
occasional spikes. These illustrations show the estimation of a smooth,
non-stationary trend with adequate uncertainty quantification.
Comment: 31 pages, 6 figures
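The trend-filtering backbone of such a model is a difference operator on the latent trend. A minimal sketch (this shows only the operator that the hierarchical global-local prior shrinks, not the NB-BTF's negative binomial likelihood or MCMC):

```python
import numpy as np

def difference_matrix(n, order):
    # k-th order difference operator: trend filtering places sparsity-inducing
    # (here, global-local shrinkage) priors on D @ trend, so the fitted trend
    # is locally polynomial with occasional breaks.
    D = np.eye(n)
    for _ in range(order):
        D = np.diff(D, axis=0)
    return D
```

With order 2, the prior shrinks second differences, so the trend is piecewise linear: heavy shrinkage almost everywhere, with a few large differences surviving at abrupt changes.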
Spatiotemporal modeling of hydrological return levels: A quantile regression approach
Extreme river flows can lead to inundation of floodplains, with consequent impacts for society, the environment and the economy. Extreme flows are inherently difficult to model, being infrequent, irregularly spaced and affected by non-stationary climatic controls. To identify patterns in extreme flows a quantile regression approach can be used. This paper introduces a new framework for spatio-temporal quantile regression modelling, where the regression model is built as an additive model that includes smooth functions of time and space, as well as space-time interaction effects. The model exploits the flexibility that P-splines offer and can be easily extended to incorporate potential covariates. We propose to estimate model parameters using a penalized least squares regression approach as an alternative to the linear programming methods classically used in quantile parameter estimation. The model is illustrated on a data set of flows in rivers across Scotland.
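One way to see how least squares can replace linear programming for quantile estimation is to rewrite the check loss as a weighted squared loss and iterate, a standard IRLS/MM device. This is a toy sketch for an intercept-only fit, not necessarily the authors' exact scheme (their version works on a penalized P-spline basis):

```python
import numpy as np

def quantile_irls(y, tau, n_iter=200, eps=1e-6):
    # Iteratively reweighted least squares for the tau-quantile of y:
    # rho_tau(r) = w(r) * r^2 with w(r) = |tau - 1{r<0}| / max(|r|, eps),
    # so each step solves an ordinary weighted least-squares problem.
    c = np.mean(y)
    for _ in range(n_iter):
        r = y - c
        w = np.abs(tau - (r < 0.0)) / np.maximum(np.abs(r), eps)
        c = np.sum(w * y) / np.sum(w)
    return c
```

Replacing the scalar fit by a weighted ridge solve with a difference penalty on spline coefficients gives a penalized-least-squares quantile smoother in the same spirit.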
Penalized regression for discrete structures
Penalties are an established approach to stabilize estimation and to select predictors in regression models. Penalties are especially useful when discrete structures matter. In this thesis, the term "discrete structure" subsumes all kinds of categorical effects, categorical effect modifiers and group-specific effects for hierarchical settings. Discrete structures can be challenging as they need to be coded, and as they can result in a huge number of coefficients. Moreover, users are interested in which levels of a discrete covariate are to be distinguished with respect to the response of a model, or in whether some levels have the same impact on the response. One wants to detect non-influential coefficients and to allow for coefficients with the same estimates. That requires carefully tailored penalization as, for example, provided by different variations of the fused Lasso. However, the reach of many existing methods is restricted as mostly, the response is assumed to be Gaussian. In this thesis, some efforts to extend these approaches are made. The focus is on appropriate penalization strategies for discrete structures in generalized linear models (GLMs). Lasso-type penalties in GLMs require special estimation procedures. In a first step, an existing Fisher scoring algorithm, that allows to combine different types of penalties in one model, is generalized. This algorithm provides the computational basis for the subsequent topics.
In a second step, varying coefficients with categorical effect modifiers are considered. Existing methodology for linear models is extended to GLMs. In hierarchical settings, fixed effects models, which are also called group-specific models and which are a special case of categorical effect modifiers, are a common choice to account for the heterogeneity in the data. Applying the proposed penalization techniques for categorical effect modifiers to hierarchical settings offers some benefits: in comparison to mixed models, the approach is able to fuse second-level units easily; in comparison to unpenalized group-specific models, efficiency is gained. In a third step, fused Lasso-type penalties for discrete structures are considered in more detail. Especially in orthonormal settings, Lasso-type penalties for categorical effects have some drawbacks regarding the clustering of the coefficients. To overcome these problems, an L0 penalty for discrete structures is proposed. Again, computational issues are met by a quadratic approximation. This approximation is not only useful in the context of penalized regression for discrete structures, but also when an approximation of the L0 norm is employed as a loss function, that is, for regression models that approximate the conditional mode of a response. For linear predictors, a close link to kernel methods makes it possible to show that the proposed estimator is consistent and asymptotically normal. Regression models with semiparametric predictors are also possible.
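What "fusing levels" means can be shown with a toy post-hoc version. This is a hypothetical illustration of my own, not the thesis's estimator: the fused Lasso and L0 penalties perform the clustering inside the fit, whereas here fitted category effects are simply grouped whenever their sorted gaps fall below a threshold, in the hard-thresholding spirit of an L0 penalty.

```python
import numpy as np

def fuse_levels(beta, tol):
    # Group category effects: sort them and start a new cluster whenever the
    # gap to the previous effect is at least tol. Levels in the same cluster
    # would share one coefficient in a fused fit.
    order = np.argsort(beta)
    labels = np.zeros(len(beta), dtype=int)
    cluster = 0
    for prev, cur in zip(order[:-1], order[1:]):
        if beta[cur] - beta[prev] >= tol:
            cluster += 1
        labels[cur] = cluster
    return labels
```

A cluster whose common effect is (near) zero corresponds to dropping those levels altogether, which is the selection aspect the thesis describes.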
Prior elicitation and variable selection for Bayesian quantile regression
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.
Bayesian subset selection suffers from three important difficulties: assigning priors over model space, assigning priors to all components of the regression coefficient vector given a specific model, and Bayesian computational efficiency (Chen et al., 1999). These difficulties become more challenging in the Bayesian quantile regression framework when one is interested in assigning priors that depend on different quantile levels. The objective of Bayesian quantile regression (BQR), which is a newly proposed tool, is to deal with unknown parameters and model uncertainty in quantile regression (QR). However, Bayesian subset selection in quantile regression models is usually a difficult issue due to the computational challenges and the non-availability of conjugate prior distributions that depend on the quantile level. These challenges are usually addressed via either a penalised likelihood function or stochastic search variable selection (SSVS). These methods typically use symmetric prior distributions for the regression coefficients, such as the Gaussian and Laplace, which may be suitable for median regression. However, an extreme quantile regression should have different regression coefficients from the median regression, and thus the priors for quantile regression coefficients should depend on quantiles. This thesis focuses on three challenges: assigning standard quantile-dependent prior distributions for the regression coefficients, assigning suitable quantile-dependent priors over model space, and achieving computational efficiency. The first of these challenges is studied in Chapter 2, in which a quantile-dependent prior elicitation scheme is developed. In particular, an extension of Zellner's prior which allows for a conditionally conjugate, quantile-dependent prior in Bayesian quantile regression is proposed.
The prior is generalised in Chapter 3 by introducing a ridge parameter to address important challenges that may arise in some applications, such as multicollinearity and overfitting. The proposed prior is also used in Chapter 4 for subset selection of the fixed and random coefficients in a linear mixed-effects QR model. In Chapter 5 we specify normal-exponential prior distributions for the regression coefficients, which can provide adaptive shrinkage and represent an alternative to the Bayesian Lasso quantile regression model. For the second challenge, we assign a quantile-dependent prior over model space in Chapter 2. The prior is based on the percentage bend correlation, which depends on the quantile level. This prior is novel and is used in Bayesian regression for the first time. For the third challenge of computational efficiency, Gibbs samplers are derived and set up to facilitate the computation of the proposed methods. In addition to the three major aforementioned challenges, this thesis also addresses other important issues such as regularisation in quantile regression and selecting both random and fixed effects in mixed quantile regression models.
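Bayesian quantile regression of this kind typically builds on the asymmetric Laplace distribution as a working likelihood (the Yu–Moyeed formulation); the thesis's contributions are the priors placed on top of it, which are not reproduced here. A sketch of the working likelihood:

```python
import numpy as np

def ald_logpdf(y, mu, sigma, tau):
    # Asymmetric Laplace log-density, the usual working likelihood in Bayesian
    # quantile regression: maximizing it over mu is equivalent to minimizing
    # the quantile check loss at level tau.
    u = (y - mu) / sigma
    rho = u * (tau - (u < 0.0))
    return np.log(tau * (1.0 - tau) / sigma) - rho
```

Its location-scale mixture representation is what makes conditionally conjugate updates, and hence the Gibbs samplers mentioned above, available.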
Selected Problems for High-Dimensional Data - Quantile and Errors-in-Variables Regressions.
This dissertation addresses two problems. First, we study joint quantile regression at multiple quantile levels with high dimensional covariates. Variable selection performed at individual quantile levels may lack stability across neighboring quantiles, making it difficult to understand and to interpret the impact of a given covariate on conditional quantile functions. We propose a Dantzig-type penalization method for sparse model selection at each quantile level which at the same time aims to shrink differences of the selected models across neighboring quantiles. We show model selection consistency, and investigate stability of the selected models across quantiles.
In the second part of the thesis, we consider the class of covariance models that can be expressed as a Kronecker sum. Taking advantage of our theoretical analysis on matrix decomposition,
we demonstrate that our methodology yields computationally efficient and statistically convergent estimates. We show that this decomposition may correspond to a representation of the data as signal plus additive noise. This may in turn be used in a regression framework to accommodate measurement error. We assess performance using simulations and illustrate the methods using a study of hawkmoth flight control (Sponberg et al. 2015). We find that the decomposition successfully isolates signal and noise, and reveals a stronger neural encoding relationship than otherwise would be obtained.
PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133340/1/pseyoung_1.pd
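The Kronecker sum object itself is easy to construct and check. A minimal sketch (only the algebra, not the dissertation's estimation procedure): the Kronecker sum A ⊕ B = A ⊗ I + I ⊗ B has eigenvalues equal to all pairwise sums of the eigenvalues of A and B, which is what makes the signal-plus-additive-noise reading of such a covariance natural.

```python
import numpy as np

def kron_sum(A, B):
    # Kronecker sum: A (+) B = A ⊗ I_m + I_n ⊗ B. For matrix-variate data, a
    # covariance of this form can separate one component (e.g. signal) from an
    # additive one (e.g. noise) in the vectorized observations.
    n, m = A.shape[0], B.shape[0]
    return np.kron(A, np.eye(m)) + np.kron(np.eye(n), B)
```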