
    Twenty years of P-splines

    P-splines first appeared in the limelight twenty years ago. Since then they have become popular in applications and in theoretical work. The combination of a rich B-spline basis and a simple difference penalty lends itself well to a variety of generalizations, because it is based on regression. In effect, P-splines allow the building of a “backbone” for the “mixing and matching” of a variety of additive smooth structure components, while inviting all sorts of extensions: varying-coefficient effects, signal (functional) regressors, two-dimensional surfaces, non-normal responses, quantile (expectile) modelling, among others. Strong connections with mixed models and Bayesian analysis have been established. We give an overview of many of the central developments during the first two decades of P-splines.
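    As a minimal sketch of the two ingredients the abstract names (a rich B-spline basis plus a simple difference penalty), the following assumes equally spaced knots and illustrative parameter values; the function names are ours, not from the paper:

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=20, deg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion)."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    # Degree-0 indicator functions, then raise the degree step by step.
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, deg + 1):
        left = (x[:, None] - knots[None, :-(d + 1)]) / (knots[d:-1] - knots[:-(d + 1)])
        right = (knots[None, d + 1:] - x[:, None]) / (knots[d + 1:] - knots[1:-d])
        B = left * B[:, :-1] + right * B[:, 1:]
    return B

def pspline_fit(x, y, nseg=20, deg=3, lam=1.0, pord=2):
    """Penalized least squares: min ||y - B theta||^2 + lam ||D theta||^2."""
    B = bspline_basis(x, x.min(), x.max() + 1e-9, nseg, deg)
    D = np.diff(np.eye(B.shape[1]), n=pord, axis=0)  # difference penalty matrix
    theta = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)
    return B @ theta
```

    Because the fit is ordinary penalized regression, swapping the response distribution, the basis, or the penalty gives the extensions the abstract lists.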

    Multidimensional adaptive P-splines with application to neurons' activity studies

    The receptive field (RF) of a visual neuron is the region of the space that elicits neuronal responses. It can be mapped using different techniques that allow inferring its spatial and temporal properties. Raw RF maps (RFmaps) are usually noisy, making it difficult to obtain and study important features of the RF. A possible solution is to smooth them using P-splines. Yet, raw RFmaps are characterized by sharp transitions in both space and time. Their analysis thus calls for spatiotemporal adaptive P-spline models, where smoothness can be locally adapted to the data. However, the literature lacks proposals for adaptive P-splines in more than two dimensions. Furthermore, the extra flexibility afforded by adaptive P-spline models is obtained at the cost of a high computational burden, especially in a multidimensional setting. To fill these gaps, this work presents a novel anisotropic locally adaptive P-spline model in two (e.g., space) and three (space and time) dimensions. Estimation is based on the recently proposed SOP (Separation of Overlapping Precision matrices) method, which provides the speed we look for. Besides the spatiotemporal analysis of the neuronal activity data that motivated this work, the practical performance of the proposal is evaluated through simulations, and comparisons with alternative methods are reported.
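    The anisotropic idea, one smoothing parameter per dimension combined in a Kronecker-sum penalty, can be sketched as follows. This is a non-adaptive toy with identity bases (a 2D Whittaker smoother), not the paper's locally adaptive SOP model; all names and parameter values are illustrative:

```python
import numpy as np

def smooth2d(Y, lam_r=1.0, lam_c=1.0, pord=2):
    """Anisotropic 2D smoothing with a Kronecker-sum difference penalty.
    Identity bases are used for brevity; a full P-spline model would use
    a B-spline basis per dimension."""
    nr, nc = Y.shape
    Dr = np.diff(np.eye(nr), n=pord, axis=0)
    Dc = np.diff(np.eye(nc), n=pord, axis=0)
    # Separate smoothing parameters: lam_r (Dr'Dr x I) + lam_c (I x Dc'Dc)
    P = (lam_r * np.kron(Dr.T @ Dr, np.eye(nc))
         + lam_c * np.kron(np.eye(nr), Dc.T @ Dc))
    z = np.linalg.solve(np.eye(nr * nc) + P, Y.ravel())
    return z.reshape(nr, nc)
```

    Making lam_r and lam_c themselves vary over the domain is what "locally adaptive" adds, at the computational cost the abstract mentions.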

    Penalized composite link models for aggregated spatial count data: A mixed model approach

    Mortality data provide valuable information for the study of the spatial distribution of mortality risk, in disciplines such as spatial epidemiology and public health. However, they are frequently available in an aggregated form over irregular geographical units, hindering the visualization of the underlying mortality risk. Also, it can be of interest to obtain mortality risk estimates on a finer spatial resolution, such that they can be linked to potential risk factors that are usually measured in a different spatial resolution. In this paper, we propose the use of the penalized composite link model and its mixed model representation. This model considers the nature of mortality rates by incorporating the population size at the finest resolution, and allows the creation of mortality maps at a finer scale, thus reducing the visual bias resulting from the spatial aggregation within original units. We also extend the model by considering individual random effects at the aggregated scale, in order to take into account the overdispersion. We illustrate our novel proposal using two datasets: female deaths by lung cancer in Indiana, USA, and male lip cancer incidence in Scotland counties. We also compare the performance of our proposal with the area-to-point Poisson kriging approach.

    We would like to thank two reviewers and an associate editor for their constructive comments and suggestions on the original manuscript. We also thank Dr. Pierre Goovaerts, who provided the high-resolution population estimates described in Section 3.1. This research was supported by the Spanish Ministry of Economy and Competitiveness grants MTM2011-28285-C02-02 and MTM2014-52184-P. The research of Dae-Jin Lee was also supported by the Basque Government through the BERC 2014-2017 and ELKARTEK programs and by the Spanish Ministry of Economy and Competitiveness MINECO: BCAM Severo Ochoa excellence accreditation SEV-2013-0323. The research of Paul H. C. Eilers was also supported by the Universidad Carlos III de Madrid-Banco Santander Chair of Excellence program.
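    The core of the penalized composite link model can be sketched in one dimension: latent fine-grid Poisson means are observed only through a composition (aggregation) matrix, and a difference penalty makes the inverse problem well-posed. The function name, starting values, and step limit below are our illustrative choices, not the paper's implementation:

```python
import numpy as np

def pclm(y, C, lam=1.0, pord=2, n_iter=100, tol=1e-8):
    """Penalized composite link model (sketch): latent fine-grid means
    gamma = exp(theta), observed aggregated counts y ~ Poisson(C gamma),
    fitted by penalized IWLS with a difference penalty on theta."""
    n = C.shape[1]
    D = np.diff(np.eye(n), n=pord, axis=0)
    P = lam * (D.T @ D)
    theta = np.full(n, np.log(y.sum() / n))   # flat start at the right scale
    for _ in range(n_iter):
        gamma = np.exp(theta)
        mu = C @ gamma                        # aggregated means
        X = C * gamma[None, :] / mu[:, None]  # working design matrix
        Q = X.T @ (mu[:, None] * X)           # X' W X with W = diag(mu)
        theta_new = np.linalg.solve(Q + P, Q @ theta + X.T @ (y - mu))
        step = np.clip(theta_new - theta, -2.0, 2.0)  # crude step limit for stability
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return np.exp(theta)
```

    In the spatial setting of the paper, C encodes which fine cells fall in each irregular geographical unit and the penalty acts over the fine spatial grid.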

    Fast smoothing parameter separation in multidimensional generalized P-splines: the SAP algorithm

    A new computational algorithm for estimating the smoothing parameters of a multidimensional penalized spline generalized linear model with anisotropic penalty is presented. This new proposal is based on the mixed model representation of a multidimensional P-spline, in which the smoothing parameter for each covariate is expressed in terms of variance components. On the basis of penalized quasi-likelihood methods, closed-form expressions for the estimates of the variance components are obtained. This formulation leads to an efficient implementation that considerably reduces the computational burden. The proposed algorithm can be seen as a generalization of the algorithm by Schall (1991) for variance components estimation, extended to deal with non-standard structures of the covariance matrix of the random effects. The practical performance of the proposed algorithm is evaluated by means of simulations, and comparisons with alternative methods are made on the basis of the mean square error criterion and the computing time. Finally, we illustrate our proposal with the analysis of two real datasets: a two-dimensional example of historical records of monthly precipitation data in the USA and a three-dimensional one of mortality data from respiratory disease according to the age at death, the year of death and the month of death.

    The authors would like to express their gratitude for the support received in the form of the Spanish Ministry of Economy and Competitiveness grants MTM2011-28285-C02-01 and MTM2011-28285-C02-02. The research of Dae-Jin Lee was funded by an NIH grant for the Superfund Metal Mixtures, Biomarkers and Neurodevelopment project 1PA2ES016454-01A2.
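    A Schall (1991)-style update can be sketched in the simplest one-dimensional special case, a Whittaker smoother with identity basis, where the smoothing parameter is the ratio of two variance components iterated to a fixed point. This is an illustration of the mixed-model idea, not the paper's multidimensional SAP algorithm:

```python
import numpy as np

def schall_smooth(y, pord=2, n_iter=50, tol=1e-6):
    """Schall-type variance-component update for a Whittaker smoother:
    lambda = sigma^2 / tau^2, re-estimated from the current fit."""
    n = len(y)
    D = np.diff(np.eye(n), n=pord, axis=0)
    P = D.T @ D
    lam = 1.0
    for _ in range(n_iter):
        Hinv = np.linalg.inv(np.eye(n) + lam * P)
        z = Hinv @ y
        ed = np.trace(Hinv)                          # effective dimension of the fit
        sigma2 = np.sum((y - z) ** 2) / (n - ed)     # residual variance
        tau2 = np.sum((D @ z) ** 2) / (ed - pord)    # variance of the penalized part
        lam_new = sigma2 / tau2
        if abs(np.log(lam_new / lam)) < tol:
            lam = lam_new
            break
        lam = lam_new
    return lam, z
```

    The multidimensional algorithm in the paper generalizes exactly this ratio-of-variances update to several smoothing parameters at once.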

    An analysis of life expectancy and economic production using expectile frontier zones

    The wealth of a country is assumed to have a strong non-linear influence on the life expectancy of its inhabitants. We follow up on research by Preston and study the relationship with gross domestic product. Smooth curves for the average but also for upper frontiers are constructed by a combination of least asymmetrically weighted squares and P-splines. Guidelines are given for optimizing the amount of smoothing and the definition of frontiers. The model is applied to a large set of countries in different years. It is also used to estimate life expectancy performance for individual countries and to show how it changed over time.
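    Least asymmetrically weighted squares (LAWS) with a difference penalty can be sketched as follows: observations above the current curve get weight p, those below get 1 - p, and the weighted penalized fit is iterated. The function name and the fixed smoothing parameter are illustrative assumptions:

```python
import numpy as np

def expectile_curve(y, p=0.9, lam=100.0, n_iter=100):
    """Smooth expectile curve by iterating asymmetrically weighted
    penalized least squares (identity basis for brevity)."""
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)
    P = lam * (D.T @ D)
    w = np.full(n, 0.5)                      # symmetric start
    for _ in range(n_iter):
        z = np.linalg.solve(np.diag(w) + P, w * y)
        w_new = np.where(y > z, p, 1.0 - p)  # asymmetric reweighting
        if np.array_equal(w_new, w):         # weights stable: converged
            break
        w = w_new
    return z
```

    With p close to 1 the curve traces an upper frontier zone of the scatter, as in the Preston-curve analysis described above; p = 0.5 recovers ordinary penalized least squares.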

    Bayesian density estimation from grouped continuous data

    Grouped data occur frequently in practice, either because of the limited resolution of instruments, or because data have been summarized in relatively wide bins. A combination of the composite link model with roughness penalties is proposed to estimate smooth densities from such data in a Bayesian framework. A simulation study is used to evaluate the performance of the strategy in the estimation of a density, of its quantiles and first moments. Two illustrations are presented: the first one involves grouped data of lead concentrations in the blood and the second one the number of deaths due to tuberculosis in The Netherlands in wide age classes.

    Smoothing parameter selection using the L-curve

    The L-curve method has been used to select the penalty parameter in ridge regression. We show that it is also very attractive for smoothing because of its low computational load. Surprisingly, it is also almost insensitive to serial correlation.
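    The mechanics can be sketched for a Whittaker smoother: trace the curve of log residual norm against log roughness norm over a grid of penalty values and pick the corner, here located by discrete curvature. Function name, grid, and the curvature heuristic are our illustrative choices:

```python
import numpy as np

def lcurve_select(y, lambdas, pord=2):
    """L-curve choice of the penalty parameter for a Whittaker smoother:
    pick the interior grid point of maximum curvature of
    (log residual norm, log roughness norm)."""
    n = len(y)
    D = np.diff(np.eye(n), n=pord, axis=0)
    P = D.T @ D
    u, v = [], []
    for lam in lambdas:
        z = np.linalg.solve(np.eye(n) + lam * P, y)
        u.append(np.log(np.sum((y - z) ** 2)))   # fidelity axis
        v.append(np.log(np.sum((D @ z) ** 2)))   # roughness axis
    u, v = np.array(u), np.array(v)
    du, dv = np.gradient(u), np.gradient(v)
    ddu, ddv = np.gradient(du), np.gradient(dv)
    kappa = np.abs(du * ddv - dv * ddu) / (du ** 2 + dv ** 2 + 1e-12) ** 1.5
    k = 1 + np.argmax(kappa[1:-1])               # skip the endpoints
    return lambdas[k]
```

    Each grid point costs only one linear solve, which is the low computational load the abstract refers to.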

    Bayesian multi-dimensional density estimation with P-splines

    Polytomous logistic regression combined with spline smoothing gives a powerful tool for Bayesian density estimation. Using fast array algorithms, multiple dimensions can be handled in a fast and uniform way. The Langevin-Hastings algorithm allows efficient sampling from the associated (re-parameterized) posterior distribution. Illustrations of density estimation are provided, as well as a new approach to smooth quantile regression.

    An innovative procedure for smoothing parameter selection

    Smoothing with penalized splines calls for an automatic method to select the size of the penalty parameter λ. We propose a little-known smoothing parameter selection procedure: the L-curve method. AIC and (generalized) cross-validation are the most common choices for this kind of problem, but they indicate light smoothing when the data represent a smooth trend plus correlated noise. In those cases the L-curve is a computationally efficient and robust alternative.