846 research outputs found
Additive models for quantile regression: model selection and confidence bandaids
Additive models for conditional quantile functions provide an attractive framework for nonparametric regression applications focused on features of the response beyond its central tendency. Total variation roughness penalties can be used to control the smoothness of the additive components much as squared Sobolev penalties are used for classical L2 smoothing splines. We describe a general approach to estimation and inference for additive models of this type. We focus attention primarily on selection of smoothing parameters and on the construction of confidence bands for the nonparametric components. Both pointwise and uniform confidence bands are introduced; the uniform bands are based on the Hotelling (1939) tube approach. Some simulation evidence is presented to evaluate finite sample performance and the methods are also illustrated with an application to modeling childhood malnutrition in India.
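As a rough illustration of the kind of objective this abstract describes, the sketch below combines the quantile check (pinball) loss with a total variation roughness penalty. It is a minimal sketch under stated assumptions, not the authors' estimator: the function names, the evenly spaced design, and the use of absolute second differences as a discrete stand-in for the total variation of the fitted curve's derivative are all assumptions made for illustration.

```python
import numpy as np


def pinball_loss(residuals, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]), summed over all residuals."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.sum(residuals * (tau - (residuals < 0))))


def roughness(fitted):
    """Assumed penalty: sum of absolute second differences of the fitted values.

    On an evenly spaced grid this is a discrete analogue of the total
    variation of the first derivative; it is zero for a straight line.
    """
    fitted = np.asarray(fitted, dtype=float)
    return float(np.sum(np.abs(np.diff(fitted, n=2))))


def penalized_objective(y, fitted, tau, lam):
    """Hypothetical objective: fidelity (pinball loss) plus lam times roughness."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    return pinball_loss(y - fitted, tau) + lam * roughness(fitted)
```

Larger values of the hypothetical smoothing parameter `lam` favour fits with fewer changes of slope, which is the trade-off that smoothing parameter selection has to resolve.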
A Unified Framework of Constrained Regression
Generalized additive models (GAMs) play an important role in modeling and
understanding complex relationships in modern applied statistics. They allow
for flexible, data-driven estimation of covariate effects. Yet researchers
often have a priori knowledge of certain effects, which might be monotonic or
periodic (cyclic) or should fulfill boundary conditions. We propose a unified
framework to incorporate these constraints for both univariate and bivariate
effect estimates and for varying coefficients. As the framework is based on
component-wise boosting methods, variables can be selected intrinsically, and
effects can be estimated for a wide range of different distributional
assumptions. Bootstrap confidence intervals for the effect estimates are
derived to assess the models. We present three case studies from environmental
sciences to illustrate the proposed seamless modeling framework. All discussed
constrained effect estimates are implemented in the comprehensive R package
mboost for model-based boosting. Comment: This is a preliminary version of the manuscript. The final
publication is available at
http://link.springer.com/article/10.1007/s11222-014-9520-
A mixed model approach for structured hazard regression
The classical Cox proportional hazards model is a benchmark approach to analyze continuous survival times in the presence of covariate information. In a number of applications, there is a need to relax one or more of its inherent assumptions, such as linearity of the predictor or the proportional hazards property. Also, one is often interested in jointly estimating the baseline hazard together with covariate effects or one may wish to add a spatial component for spatially correlated survival data. We propose an extended Cox model, where the (log-)baseline hazard is weakly parameterized using penalized splines and the usual linear predictor is replaced by a structured additive predictor incorporating nonlinear effects of continuous covariates and further time scales, spatial effects, frailty components, and more complex interactions. Inclusion of time-varying coefficients leads to models that relax the proportional hazards assumption. Nonlinear and time-varying effects are modelled through penalized splines, and spatial components are treated as correlated random effects following either a Markov random field or a stationary Gaussian random field. All model components, including smoothing parameters, are specified within a unified framework and are estimated simultaneously based on mixed model methodology. The estimation procedure for such general mixed hazard regression models is derived using penalized likelihood for regression coefficients and (approximate) marginal likelihood for smoothing parameters. Performance of the proposed method is studied through simulation and an application to leukemia survival data in Northwest England.
Parametrization and penalties in spline models with an application to survival analysis
In this paper we show how a simple parametrization, built from the definition of cubic
splines, can aid in the implementation and interpretation of penalized spline models, whatever
configuration of knots we choose to use. We call this parametrization value-first derivative
parametrization. We perform Bayesian inference by exploring the natural link between quadratic
penalties and Gaussian priors. However, a full Bayesian analysis seems feasible only for some
penalty functionals. Alternatives include empirical Bayes methods involving model selection
type criteria. The proposed methodology is illustrated by an application to survival analysis
where the usual Cox model is extended to allow for time-varying regression coefficients.
Extensions of semiparametric expectile regression
Expectile regression can be seen as an extension of available (mean) regression models as it describes more general properties of the response distribution. This thesis introduces expectile regression and presents new extensions of existing semiparametric regression models.
The dissertation consists of four central parts. First, the one-to-one connection between expectiles, the cumulative distribution function (cdf) and quantiles is used to calculate the cdf and quantiles from a fine grid of expectiles. Quantiles-from-expectiles estimates are introduced and compared with direct quantile estimates regarding efficiency. Second, a method to estimate non-crossing expectile curves based on splines is developed. Also, the case of clustered or longitudinal observations is handled by introducing random individual components, which leads to an extension of mixed models to mixed expectile models. Third, quantiles-from-expectiles estimates in the framework of unequal probability sampling are proposed.
All methods are implemented and available within the package expectreg via the open source software R. As the fourth part, a description of the package expectreg is given at the end of this thesis.
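To make the notion of an expectile concrete, the sketch below computes a sample expectile by iteratively reweighted averaging, using the standard definition of the tau-expectile as the minimizer of an asymmetrically weighted squared loss. It is only an illustration and is not taken from the expectreg package: the function name `sample_expectile`, the tolerance, and the iteration cap are assumptions.

```python
import numpy as np


def sample_expectile(y, tau, tol=1e-10, max_iter=200):
    """Sample tau-expectile (0 < tau < 1) via a fixed-point iteration.

    The expectile m minimizes sum_i w_i * (y_i - m)^2, where w_i = tau for
    observations above m and 1 - tau otherwise. For fixed weights the
    minimizer is a weighted mean, so weights and mean are updated in turn.
    """
    y = np.asarray(y, dtype=float)
    m = y.mean()  # tau = 0.5 gives equal weights, so the mean is a natural start
    for _ in range(max_iter):
        w = np.where(y > m, tau, 1.0 - tau)
        m_new = np.sum(w * y) / np.sum(w)
        if abs(m_new - m) < tol:
            return float(m_new)
        m = m_new
    return float(m)
```

With `tau = 0.5` the weights are symmetric and the result is the ordinary mean, which is the sense in which expectile regression extends mean regression.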
Nonparametric Pre-Processing Methods and Inference Tools for Analyzing Time-of-Flight Mass Spectrometry Data
The objective of this paper is to contribute to the methodology available for extracting and analyzing signal content from protein mass spectrometry data. Data from MALDI-TOF or SELDI-TOF spectra require considerable signal pre-processing such as noise removal and baseline level error correction. After removing the noise by an invariant wavelet transform, we develop a background correction method based on penalized spline quantile regression and apply it to MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) spectra obtained from serum samples. The results show that the wavelet transform technique combined with nonparametric quantile regression can handle all kinds of background and low signal-to-background ratio spectra; it requires no prior knowledge about the spectra composition, no selection of suitable background correction points, and no mathematical assumption of the background distribution. We further present a novel multi-scale spectra alignment methodology useful in a functional analysis of variance method for identifying proteins that are differentially expressed between different tissue types. Our approaches are compared with several existing approaches in the recent literature and are tested on simulated and some real data. The results indicate that the proposed schemes enable accurate diagnosis based on the over-expression of a small number of identified proteins with high sensitivity.