
    Partially linear censored quantile regression

    Censored regression quantile (CRQ) methods provide a powerful and flexible approach to the analysis of censored survival data when standard linear models are felt to be appropriate. In many cases, however, greater flexibility is desired to go beyond the usual multiple regression paradigm. One area of common interest is that of partially linear models: one (or more) of the explanatory covariates is assumed to act on the response through a non-linear function. Here the CRQ approach of Portnoy (J Am Stat Assoc 98:1001–1012, 2003) is extended to this partially linear setting. Basic consistency results are presented. A simulation experiment and an unemployment example justify the value of the partially linear approach over methods based on the Cox proportional hazards model and over methods not permitting nonlinearity

    Simultaneous interval regression for K-nearest neighbor

    In some regression problems, it may be more reasonable to predict intervals rather than precise values. We are interested in finding intervals that, simultaneously for all input instances x ∈ X, contain a proportion β of the response values. We name this problem simultaneous interval regression. This is similar to simultaneous tolerance intervals for regression with a high confidence level γ ≈ 1, and several authors have already treated this problem for linear regression. Such intervals could be seen as a form of confidence envelope for the prediction variable given any value of the predictor variables in their domain. Tolerance intervals and simultaneous tolerance intervals have not yet been treated for the K-nearest neighbor (KNN) regression method. The goal of this paper is to consider the simultaneous interval regression problem for KNN, and this is done without the homoscedasticity assumption. In this scope, we propose a new interval regression method based on KNN which takes advantage of tolerance intervals in order to choose, for each instance, the value of the hyper-parameter K which will be a good trade-off between the precision and the uncertainty due to the limited sample size of the neighborhood around each instance. In the experimental part, our proposed interval construction method is compared with a more conventional interval approximation method on six benchmark regression data sets

    Bayesian lasso binary quantile regression

    In this paper, a Bayesian hierarchical model for variable selection and estimation in the context of binary quantile regression is proposed. Existing approaches to variable selection in a binary classification context are sensitive to outliers, heteroskedasticity or other anomalies of the latent response. The method proposed in this study overcomes these problems in an attractive and straightforward way. A Laplace likelihood and Laplace priors for the regression parameters are proposed and estimated with Bayesian Markov Chain Monte Carlo. The resulting model is equivalent to the frequentist lasso procedure. A conceptual result is that by doing so, the binary regression model is moved from a Gaussian to a full Laplacian framework without sacrificing much computational efficiency. In addition, an efficient Gibbs sampler to estimate the model parameters is proposed that is superior to the Metropolis algorithm used in previous studies on Bayesian binary quantile regression. Both the simulation studies and the real data analysis indicate that the proposed method performs well in comparison to the other methods. Moreover, as the base model is binary quantile regression, a much more detailed insight into the effects of the covariates is provided by the approach. An implementation of the lasso procedure for binary quantile regression models is available in the R-package bayesQR
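
    The link between a Laplace likelihood, Laplace priors and the lasso can be sketched as follows. This is an illustrative derivation only, assuming the standard asymmetric Laplace working likelihood with unit scale and independent Laplace priors with a common rate λ; the paper's exact parametrisation may differ, and the symbols (latent response y_i^*, covariates x_i, quantile level τ) are chosen here for illustration.

```latex
\begin{align*}
\rho_\tau(u) &= u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr), \\
f(y_i^* \mid x_i, \beta) &= \tau(1-\tau)\exp\!\bigl\{-\rho_\tau(y_i^* - x_i^\top\beta)\bigr\}, \\
\pi(\beta_j \mid \lambda) &= \tfrac{\lambda}{2}\exp\{-\lambda\,|\beta_j|\}, \\
-\log p(\beta \mid y^*, \lambda) &= \sum_{i=1}^{n}\rho_\tau(y_i^* - x_i^\top\beta)
  + \lambda\sum_{j=1}^{p}|\beta_j| + \text{const}.
\end{align*}
```

    Under these assumptions the posterior mode minimises the lasso-penalised check loss. In the binary setting only an indicator derived from the latent y_i^* is observed, so the latent values themselves have to be sampled or integrated out.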

    Are there asymmetries in the effects of training on the conditional male wage distribution?

    Recent studies have used quantile regression (QR) techniques to estimate the impact of education on the location, scale and shape of the conditional wage distribution. In our paper we investigate the degree to which work-related training – another important form of human capital – affects the location, scale and shape of the conditional wage distribution. Using the first six waves of the European Community Household Panel, we utilise both ordinary least squares and QR techniques to estimate associations between work-related training and wages for private sector men in ten European Union countries. Our results show that, for the majority of countries, there is a fairly uniform association between training and hourly wages across the conditional wage distribution. However, there are considerable differences across countries in mean associations between training and wages

    Changes in extreme sea-levels in the Baltic Sea

    In a climate change context, changes in extreme sea-levels rather than changes in the mean are of particular interest from the coastal protection point of view. In this work, extreme sea-levels in the Baltic Sea are investigated based on daily tide gauge records for the period 1916–2005 using the annual block maxima approach. Extreme events are analysed based on the generalised extreme value distribution considering both stationary and time-varying models. The likelihood ratio test is applied to select between stationary and non-stationary models for the maxima and return values are estimated from the final model. As an independent and complementary approach, quantile regression is applied for comparison with the results from the extreme value approach. The rates of change in the uppermost quantiles are in general consistent and most pronounced for the northernmost stations
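
    The annual block maxima step and the return-value calculation can be sketched as below. This is a minimal sketch and not the authors' code: it assumes records arrive as (year, level) pairs, it covers only the stationary case, and the function names and sample numbers are hypothetical. The return-level expression is the standard one for a generalised extreme value distribution with location mu, scale sigma and shape xi.

```python
import math


def annual_block_maxima(records):
    """Reduce (year, level) pairs to one maximum per year (hypothetical helper)."""
    maxima = {}
    for year, level in records:
        if year not in maxima or level > maxima[year]:
            maxima[year] = level
    return dict(sorted(maxima.items()))


def gev_return_level(mu, sigma, xi, period):
    """Level exceeded on average once every `period` years under a stationary GEV."""
    if period <= 1:
        raise ValueError("period must be greater than 1 year")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV (shape parameter equal to zero)
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Made-up daily levels, for illustration only
sample = [(1916, 50.0), (1916, 72.5), (1917, 64.0), (1917, 61.0)]
print(annual_block_maxima(sample))
print(gev_return_level(mu=0.0, sigma=1.0, xi=0.0, period=100))
```

    In practice the parameters would be estimated by maximum likelihood from the annual maxima, and a time-varying model would let them depend on the year; neither step is shown here.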

    Three-dimensional mapping reveals scale-dependent dynamics in biogenic reef habitat structure

    Habitat structure influences a broad range of ecological interactions and ecosystem functions across biomes. To understand and effectively manage dynamic ecosystems, we need detailed information about habitat properties and how they vary across spatial and temporal scales. Measuring and monitoring variation in three-dimensional (3D) habitat structure has traditionally been challenging, despite recognition of its importance to ecological processes. Modern 3D mapping technologies present opportunities to characterize spatial and temporal variation in habitat structure at a range of ecologically relevant scales. Biogenic reefs are structurally complex and dynamic habitats, in which structure has a pivotal influence on ecosystem biodiversity, function and resilience. For the first time, we characterized spatial and temporal dynamics in the 3D structure of intertidal Sabellaria alveolata biogenic reef across scales. We used drone-derived structure-from-motion photogrammetry and terrestrial laser scanning to characterize reef structural variation at mm-to-cm resolutions at a habitat scale (~35 000 m²) over 1 year, and at a plot scale (2500 m²) over 5 years (2014–2019, 6-month intervals). We found that most of the variation in reef emergence above the substrate, accretion rate and erosion rate was explained by a combination of systematic trends with shore height and positive spatial autocorrelation up to the scale of colonies (1.5 m) or small patches (up to 4 m). We identified previously undocumented temporal patterns in intertidal S. alveolata reef accretion and erosion, specifically groups of rapidly accreting, short-lived colonies and slow-accreting, long-lived colonies. We showed that these highly dynamic colony-scale structural changes compensate for each other, resulting in seemingly stable reef habitat structure over larger spatial and temporal scales. These patterns could only be detected with the use of modern 3D mapping technologies, demonstrating their potential to enhance our understanding of ecosystem dynamics across scales

    Global warming will affect the maximum potential abundance of boreal plant species

    Forecasting the impact of future global warming on biodiversity requires understanding how temperature limits the distribution of species. Here we rely on Liebig's Law of the Minimum to estimate the effect of temperature on the maximum potential abundance that a species can attain at a certain location. We develop 95%‐quantile regressions to model the influence of effective temperature sum on the maximum potential abundance of 25 common understory plant species of Finland, across 868 nationwide plots sampled in 1985. Fifteen of these species showed a significant response to temperature sum that was consistent in temperature‐only models and in all‐predictors models, which also included cumulative precipitation, soil texture, soil fertility, tree species and stand maturity as predictors. For species with significant and consistent responses to temperature, we forecasted potential shifts in abundance for the period 2041–2070 under the IPCC A1B emission scenario using temperature‐only models. We predict major potential changes in abundance and average northward distribution shifts of 6–8 km yr⁻¹. Our results emphasize inter‐specific differences in the impact of global warming on the understory layer of boreal forests. Species in all functional groups from dwarf shrubs, herbs and grasses to bryophytes and lichens showed significant responses to temperature, while temperature did not limit the abundance of 10 species. We discuss the value of modelling the ‘maximum potential abundance’ to deal with the uncertainty in the predictions of realized abundances associated with the effect of environmental factors not accounted for and with dispersal limitations of species, among others. We believe this concept has a promising and unexplored potential to forecast the impact of specific drivers of global change under future scenarios
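
    Why an upper quantile tracks a "maximum potential" rather than an average can be illustrated with the check (pinball) loss that quantile regression minimises. The sketch below is a generic illustration, not the authors' model: it fits only a constant, and the function names are hypothetical. At level tau, an underestimate costs tau per unit and an overestimate costs 1 − tau, so at tau = 0.95 underestimates are 19 times as expensive and the fit is pushed toward the upper edge of the data.

```python
def pinball_loss(y, prediction, tau):
    """Check loss of one observation at quantile level tau."""
    residual = y - prediction
    # Underestimates (residual >= 0) cost tau per unit, overestimates 1 - tau
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant_quantile(values, tau):
    """Observed value that minimises the total pinball loss (hypothetical helper)."""
    values = list(values)
    candidates = sorted(set(values))
    return min(
        candidates,
        key=lambda c: sum(pinball_loss(v, c, tau) for v in values),
    )


# Toy data: the 0.75-level minimiser sits in the upper part of the sample
print(best_constant_quantile(range(1, 11), 0.75))  # prints 8
```

    A regression version replaces the constant with a function of the predictors (for example temperature sum) and minimises the same loss over its coefficients.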

    Prediction intervals for future BMI values of individual children - a non-parametric approach by quantile boosting

    Background: The construction of prediction intervals (PIs) for future body mass index (BMI) values of individual children based on a recent German birth cohort study with n = 2007 children is problematic for standard parametric approaches, as the BMI distribution in childhood is typically skewed depending on age. Methods: We avoid distributional assumptions by directly modelling the borders of PIs by additive quantile regression, estimated by boosting. We point out the concept of conditional coverage to prove the accuracy of PIs. As conditional coverage can hardly be evaluated in practical applications, we conduct a simulation study before fitting child- and covariate-specific PIs for future BMI values and BMI patterns for the present data. Results: The results of our simulation study suggest that PIs fitted by quantile boosting cover future observations with the predefined coverage probability and outperform the benchmark approach. For the prediction of future BMI values, quantile boosting automatically selects informative covariates and adapts to the age-specific skewness of the BMI distribution. The lengths of the estimated PIs are child-specific and increase, as expected, with the age of the child. Conclusions: Quantile boosting is a promising approach to construct PIs with correct conditional coverage in a non-parametric way. It is in particular suitable for the prediction of BMI patterns depending on covariates, since it provides an interpretable predictor structure, inherent variable selection properties and can even account for longitudinal data structures