6,327 research outputs found

    Estimation and Regularization Techniques for Regression Models with Multidimensional Prediction Functions

    Get PDF
    Boosting is one of the most important methods for fitting regression models and building prediction rules from high-dimensional data. A notable feature of boosting is that the technique has a built-in mechanism for shrinking coefficient estimates and variable selection. This regularization mechanism makes boosting a suitable method for analyzing data characterized by small sample sizes and large numbers of predictors. We extend the existing methodology by developing a boosting method for prediction functions with multiple components. Such multidimensional functions occur in many types of statistical models, for example in count data models and in models involving outcome variables with a mixture distribution. As will be demonstrated, the new algorithm is suitable for both the estimation of the prediction function and regularization of the estimates. In addition, nuisance parameters can be estimated simultaneously with the prediction function

    An update on statistical boosting in biomedicine

    Get PDF
    Statistical boosting algorithms have triggered a lot of research during the last decade. They combine a powerful machine-learning approach with classical statistical modelling, offering various practical advantages like automated variable selection and implicit regularization of effect estimates. They are extremely flexible, as the underlying base-learners (regression functions defining the type of effect for the explanatory variables) can be combined with any kind of loss function (target function to be optimized, defining the type of regression setting). In this review article, we highlight the most recent methodological developments on statistical boosting regarding variable selection, functional regression and advanced time-to-event modelling. Additionally, we provide a short overview on relevant applications of statistical boosting in biomedicine

    Functional Regression

    Full text link
    Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and Silverman called replication and regularization, respectively. This article will focus on functional regression, the area of FDA that has received the most attention in applications and methodological development. First will be an introduction to basis functions, key building blocks for regularization in functional regression methods, followed by an overview of functional regression methods, split into three types: [1] functional predictor regression (scalar-on-function), [2] functional response regression (function-on-scalar) and [3] function-on-function regression. For each, the role of replication and regularization will be discussed and the methodological development described in a roughly chronological manner, at times deviating from the historical timeline to group together similar methods. The primary focus is on modeling and methodology, highlighting the modeling structures that have been developed and the various regularization approaches employed. At the end is a brief discussion describing potential areas of future development in this field

    Tensor Regression with Applications in Neuroimaging Data Analysis

    Get PDF
    Classical regression methods treat covariates as a vector and estimate a corresponding vector of regression coefficients. Modern applications in medical imaging generate covariates of more complex form such as multidimensional arrays (tensors). Traditional statistical and computational methods are proving insufficient for analysis of these high-throughput data due to their ultrahigh dimensionality as well as complex structure. In this article, we propose a new family of tensor regression models that efficiently exploit the special structure of tensor covariates. Under this framework, ultrahigh dimensionality is reduced to a manageable level, resulting in efficient estimation and prediction. A fast and highly scalable estimation algorithm is proposed for maximum likelihood estimation and its associated asymptotic properties are studied. Effectiveness of the new methods is demonstrated on both synthetic and real MRI imaging data.Comment: 27 pages, 4 figure

    Network Psychometrics

    Full text link
    This chapter provides a general introduction of network modeling in psychometrics. The chapter starts with an introduction to the statistical model formulation of pairwise Markov random fields (PMRF), followed by an introduction of the PMRF suitable for binary data: the Ising model. The Ising model is a model used in ferromagnetism to explain phase transitions in a field of particles. Following the description of the Ising model in statistical physics, the chapter continues to show that the Ising model is closely related to models used in psychometrics. The Ising model can be shown to be equivalent to certain kinds of logistic regression models, loglinear models and multi-dimensional item response theory (MIRT) models. The equivalence between the Ising model and the MIRT model puts standard psychometrics in a new light and leads to a strikingly different interpretation of well-known latent variable models. The chapter gives an overview of methods that can be used to estimate the Ising model, and concludes with a discussion on the interpretation of latent variables given the equivalence between the Ising model and MIRT.Comment: In Irwing, P., Hughes, D., and Booth, T. (2018). The Wiley Handbook of Psychometric Testing, 2 Volume Set: A Multidisciplinary Reference on Survey, Scale and Test Development. New York: Wile
    • ā€¦
    corecore