
    Adaptive Fused LASSO in Grouped Quantile Regression

    This paper considers a quantile regression model with grouped explanatory variables. To obtain sparsity at the level of parameter groups as well as sparsity in the differences between two successive groups of variables, we propose and study an adaptive fused group LASSO quantile estimator. The number of variable groups may be fixed or divergent. We derive the convergence rate under classical assumptions and show that the proposed estimator satisfies the oracle properties.
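    As a rough sketch (notation ours, not taken from the paper), the estimator can be thought of as minimizing a quantile check loss plus an adaptive group penalty and an adaptive fusion penalty on successive groups:

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \sum_{i=1}^{n} \rho_\tau\!\bigl(y_i - x_i^{\top}\beta\bigr)
  \;+\; \lambda_1 \sum_{g=1}^{G} w_g\,\lVert \beta_g \rVert_2
  \;+\; \lambda_2 \sum_{g=2}^{G} v_g\,\lVert \beta_g - \beta_{g-1} \rVert_2,
\qquad
\rho_\tau(u) \;=\; u\bigl(\tau - \mathbf{1}\{u < 0\}\bigr),
```

    where $\beta_g$ is the coefficient sub-vector of the $g$-th of $G$ groups and $w_g$, $v_g$ are data-driven adaptive weights.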

    Feature selection guided by structural information

    In generalized linear regression problems with an abundant number of features, lasso-type regularization, which imposes an $\ell^1$-constraint on the regression coefficients, has become a widely established technique. Deficiencies of the lasso in certain scenarios, notably strongly correlated design, were unmasked when Zou and Hastie [J. Roy. Statist. Soc. Ser. B 67 (2005) 301--320] introduced the elastic net. In this paper we propose to extend the elastic net by admitting general nonnegative quadratic constraints as a second form of regularization. The generalized ridge-type constraint will typically make use of the known association structure of features, for example, by using temporal or spatial closeness. We study properties of the resulting "structured elastic net" regression estimation procedure, including basic asymptotics and the issue of model selection consistency. In this vein, we provide an analog to the so-called "irrepresentable condition" which holds for the lasso. Moreover, we outline algorithmic solutions for the structured elastic net within the generalized linear model family. The rationale and the performance of our approach are illustrated by means of simulated and real world data, with a focus on signal regression. Comment: Published at http://dx.doi.org/10.1214/09-AOAS302 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
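    A minimal sketch of one possible algorithmic route (our own construction, not the authors' implementation), assuming a squared-error loss and a penalty matrix Lam built from a feature-adjacency graph, solved by proximal gradient descent:

```python
import numpy as np

def structured_elastic_net(X, y, Lam, lam1=0.1, lam2=0.1, n_iter=500):
    """Sketch: minimize 0.5*||y - X b||^2 + lam1*||b||_1 + lam2 * b' Lam b
    via proximal gradient descent (ISTA). Lam is a symmetric PSD matrix,
    e.g. the Laplacian of a graph encoding temporal/spatial closeness."""
    n, p = X.shape
    b = np.zeros(p)
    # Lipschitz constant of the smooth part's gradient
    L = np.linalg.eigvalsh(X.T @ X + 2.0 * lam2 * Lam).max()
    step = 1.0 / L
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) + 2.0 * lam2 * (Lam @ b)  # gradient of smooth part
        z = b - step * grad
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam1, 0.0)  # soft-threshold (l1 prox)
    return b

# Toy usage: a chain-graph Laplacian encodes "temporal closeness" of features.
rng = np.random.default_rng(0)
p = 20
D = np.diff(np.eye(p), axis=0)   # first-difference matrix
Lam = D.T @ D                    # chain-graph Laplacian
X = rng.standard_normal((100, p))
beta_true = np.concatenate([np.ones(5), np.zeros(p - 5)])
y = X @ beta_true + 0.5 * rng.standard_normal(100)
print(structured_elastic_net(X, y, Lam).round(2))
```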

    Locally Adaptive Shrinkage Priors for Trends and Breaks in Count Time Series

    Non-stationary count time series characterized by features such as abrupt changes and fluctuations about the trend arise in many scientific domains, including biophysics, ecology, energy, epidemiology, and the social sciences. Current approaches for integer-valued time series lack the flexibility to capture local transient features, while more flexible models for continuous data are inadequate for integer-valued responses, particularly in settings with small counts. We present a modeling framework, the negative binomial Bayesian trend filter (NB-BTF), that offers an adaptive, model-based solution for capturing multiscale features with valid integer-valued inference for trend filtering. The framework is a hierarchical Bayesian model with a dynamic global-local shrinkage process. The flexibility of the global-local process allows for the necessary local regularization, while the temporal dependence induces a locally smooth trend. In simulation, the NB-BTF outperforms a number of alternative trend filtering methods. We then demonstrate the method on weekly power outage frequency in Massachusetts townships, which is characterized by a nominal low level with occasional spikes. These illustrations show the estimation of a smooth, non-stationary trend with adequate uncertainty quantification. Comment: 31 pages, 6 figures.
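    A schematic of a global-local trend-filtering hierarchy of this kind (our simplification, not the paper's exact specification; the NB-BTF additionally lets the local scales evolve dynamically over time):

```latex
% Schematic only: negative binomial observation layer with a global-local
% shrinkage prior on second differences of the latent trend.
y_t \mid r, \theta_t \;\sim\; \mathrm{NegBinom}\!\left(r,\; \frac{e^{\theta_t}}{1 + e^{\theta_t}}\right), \qquad t = 1, \dots, T,
\\
\Delta^2 \theta_t \mid \tau, \lambda_t \;\sim\; \mathcal{N}\!\left(0,\; \tau^2 \lambda_t^2\right), \qquad
\lambda_t \sim \mathcal{C}^{+}(0, 1), \qquad \tau \sim \mathcal{C}^{+}(0, 1),
```

    where $\Delta^2\theta_t$ is the second difference of the latent trend, the local scales $\lambda_t$ permit abrupt jumps, and the global scale $\tau$ controls overall smoothness.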

    Spatiotemporal modeling of hydrological return levels: A quantile regression approach

    Extreme river flows can lead to inundation of floodplains, with consequent impacts for society, the environment and the economy. Extreme flows are inherently difficult to model, being infrequent, irregularly spaced and affected by non-stationary climatic controls. To identify patterns in extreme flows, a quantile regression approach can be used. This paper introduces a new framework for spatio-temporal quantile regression modelling, where the regression model is built as an additive model that includes smooth functions of time and space, as well as space-time interaction effects. The model exploits the flexibility that P-splines offer and can easily be extended to incorporate potential covariates. We propose to estimate the model parameters using a penalized least squares regression approach as an alternative to the linear programming methods classically used in quantile parameter estimation. The model is illustrated on a data set of flows in rivers across Scotland.
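    A compressed sketch of the structure described (notation ours): the $\tau$-th conditional quantile gets an additive predictor in P-spline bases, and the check loss is combined with difference penalties in a penalized least squares criterion rather than a linear program:

```latex
q_\tau\!\left(y_i \mid s_i, t_i\right) \;=\; f_{\mathrm{time}}(t_i) + f_{\mathrm{space}}(s_i) + f_{\mathrm{st}}(s_i, t_i) \;=\; B_i^{\top}\theta,
\\
\hat{\theta} \;=\; \arg\min_{\theta}\;
  \sum_{i=1}^{n} \rho_\tau\!\bigl(y_i - B_i^{\top}\theta\bigr)
  \;+\; \sum_{k} \lambda_k\, \theta^{\top} P_k\, \theta,
```

    where the $B_i$ collect (tensor-product) B-spline basis evaluations and the $P_k$ are difference penalty matrices on the spline coefficients.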

    Penalized regression for discrete structures

    Penalties are an established approach to stabilize estimation and to select predictors in regression models. They are especially useful when discrete structures matter. In this thesis, the term "discrete structure" subsumes all kinds of categorical effects, categorical effect modifiers and group-specific effects in hierarchical settings. Discrete structures can be challenging, as they need to be coded and can result in a huge number of coefficients. Moreover, users are interested in which levels of a discrete covariate need to be distinguished with respect to the response of a model, and in whether some levels have the same impact on the response. One wants to detect non-influential coefficients and to allow for coefficients with the same estimates. That requires carefully tailored penalization as provided, for example, by different variations of the fused Lasso. However, the reach of many existing methods is restricted, as the response is mostly assumed to be Gaussian. This thesis extends these approaches to generalized linear models (GLMs).
    The focus is on appropriate penalization strategies for discrete structures in GLMs. Lasso-type penalties in GLMs require special estimation procedures. In a first step, an existing Fisher scoring algorithm that allows different types of penalties to be combined in one model is generalized; this algorithm provides the computational basis for the subsequent topics. In a second step, varying coefficients with categorical effect modifiers are considered, and existing methodology for linear models is extended to GLMs. In hierarchical settings, fixed effects models (also called group-specific models), which are a special case of categorical effect modifiers, are a common choice to account for heterogeneity in the data. Applying the proposed penalization techniques for categorical effect modifiers to hierarchical settings offers two benefits: in comparison to mixed models, which make stronger assumptions, the approach can easily fuse second-level units and reduce the degree of heterogeneity; in comparison to unpenalized group-specific models, efficiency is gained. In a third step, fused Lasso-type penalties for discrete structures are considered in more detail. Especially in orthonormal settings, Lasso-type penalties for categorical effects have drawbacks regarding the clustering of the coefficients. To overcome these problems, an L0 penalty for discrete structures is proposed, where the so-called L0 "norm" denotes the indicator function for nonzero arguments. Computational issues are again met by a quadratic approximation. This approximation is useful not only for penalized regression with discrete structures, but also when an approximation of the L0 norm is employed as a loss function, that is, for regression models that approximate the conditional mode of a response. For linear predictors, a close link to kernel methods allows showing that the proposed estimator is consistent and asymptotically normal. Regression models with semiparametric predictors are also possible.
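    To make the fusion idea concrete, here is a toy sketch (our own construction with cvxpy, not the thesis software): a logistic GLM in which an all-pairs fused-Lasso penalty on the dummy coefficients of a categorical covariate pulls levels with similar effects onto a common value.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
n, K = 300, 6                          # observations, levels of the categorical covariate
levels = rng.integers(0, K, size=n)
Z = np.eye(K)[levels]                  # dummy coding, one column per level
beta_true = np.array([0.0, 0.0, 1.0, 1.0, 1.0, -1.5])  # levels 0-1 and 2-4 share an effect
p_true = 1.0 / (1.0 + np.exp(-(Z @ beta_true)))
y = rng.binomial(1, p_true)

beta = cp.Variable(K)
eta = Z @ beta
loglik = cp.sum(cp.multiply(y, eta) - cp.logistic(eta))  # Bernoulli log-likelihood
# All-pairs fusion penalty: shrinks differences between level effects,
# so levels with similar effects are fused into clusters.
fusion = sum(cp.abs(beta[k] - beta[l]) for k in range(K) for l in range(k + 1, K))
lam = 2.0
cp.Problem(cp.Maximize(loglik - lam * fusion)).solve()
print(np.round(beta.value, 2))
```

    With a large enough penalty weight, the estimated level effects collapse into a small number of distinct values, which is the fusion behavior the thesis exploits for categorical effect modifiers and group-specific effects.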

    Selected Problems for High-Dimensional Data - Quantile and Errors-in-Variables Regressions.

    This dissertation addresses two problems. First, we study joint quantile regression at multiple quantile levels with high-dimensional covariates. Variable selection performed at individual quantile levels may lack stability across neighboring quantiles, making it difficult to understand and interpret the impact of a given covariate on conditional quantile functions. We propose a Dantzig-type penalization method for sparse model selection at each quantile level which, at the same time, aims to shrink differences of the selected models across neighboring quantiles. We show model selection consistency and investigate stability of the selected models across quantiles. In the second part of the thesis, we consider the class of covariance models that can be expressed as a Kronecker sum. Taking advantage of our theoretical analysis of the matrix decomposition, we demonstrate that our methodology yields computationally efficient and statistically convergent estimates. We show that this decomposition may correspond to a representation of the data as signal plus additive noise, which may in turn be used in a regression framework to accommodate measurement error. We assess performance using simulations and illustrate the methods using a study of hawkmoth flight control (Sponberg et al. 2015). We find that the decomposition successfully isolates signal and noise and reveals a stronger neural encoding relationship than would otherwise be obtained. PhD thesis, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133340/1/pseyoung_1.pd
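    To make the Kronecker-sum structure concrete, a small numpy illustration (dimensions and matrices invented for the example, not the dissertation's estimator): the covariance takes the form A ⊕ B = A ⊗ I_q + I_p ⊗ B, which the abstract links to a representation of matrix-valued data as signal plus additive noise.

```python
import numpy as np

def kronecker_sum(A, B):
    """Kronecker sum A (+) B = A (x) I_q + I_p (x) B for square A (p x p) and B (q x q)."""
    p, q = A.shape[0], B.shape[0]
    return np.kron(A, np.eye(q)) + np.kron(np.eye(p), B)

# Toy example: a "signal" covariance across rows plus a "noise" covariance across columns.
A = np.eye(3) + 0.5 * np.ones((3, 3))  # row (signal) covariance, 3 x 3
B = np.diag([1.0, 0.5])                # column (noise) covariance, 2 x 2
Sigma = kronecker_sum(A, B)            # covariance of a vectorized 3 x 2 matrix variate (one common convention)
print(Sigma.shape)                     # (6, 6)
```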