
    Mixtures of Regression Models for Time-Course Gene Expression Data: Evaluation of Initialization and Random Effects

    Finite mixture models are routinely applied to time-course microarray data. Due to the complexity and size of this type of data, the choice of good starting values plays an important role. So far, initialization strategies have only been investigated for data from a mixture of multivariate normal distributions. In this work, several initialization procedures are evaluated for mixtures of regression models with and without random effects in an extensive simulation study on different artificial datasets. Finally, these procedures are also applied to a real dataset from E. coli.

    Fast Covariance Estimation for High-dimensional Functional Data

    For smoothing covariance functions, we propose two fast algorithms that scale linearly with the number of observations per function. Most available methods and software cannot smooth covariance matrices of dimension $J \times J$ with $J > 500$; the recently introduced sandwich smoother is an exception, but it is not adapted to smooth covariance matrices of large dimensions such as $J \ge 10,000$. Covariance matrices of order $J = 10,000$, and even $J = 100,000$, are becoming increasingly common, e.g., in 2- and 3-dimensional medical imaging and high-density wearable sensor data. We introduce two new algorithms that can handle very large covariance matrices: 1) FACE: a fast implementation of the sandwich smoother and 2) SVDS: a two-step procedure that first applies singular value decomposition to the data matrix and then smooths the eigenvectors. Compared to existing techniques, these new algorithms are at least an order of magnitude faster in high dimensions and drastically reduce memory requirements. The new algorithms provide instantaneous (few seconds) smoothing for matrices of dimension $J = 10,000$ and very fast ($<$ 10 minutes) smoothing for $J = 100,000$. Although SVDS is simpler than FACE, we provide ready-to-use, scalable R software for FACE. When incorporated into the R package refund, FACE improves the speed of penalized functional regression by an order of magnitude, even for data of normal size ($J < 500$). We recommend that FACE be used in practice for the analysis of noisy and high-dimensional functional data. Comment: 35 pages, 4 figures

    Nonlinear association structures in flexible Bayesian additive joint models

    Joint models of longitudinal and survival data have become an important tool for modeling associations between longitudinal biomarkers and event processes. The association between marker and log-hazard is assumed to be linear in existing shared random effects models, with this assumption usually remaining unchecked. We present an extended framework of flexible additive joint models that allows the estimation of nonlinear, covariate-specific associations by making use of Bayesian P-splines. Our joint models are estimated in a Bayesian framework using structured additive predictors for all model components, allowing for great flexibility in the specification of smooth nonlinear, time-varying and random effects terms for the longitudinal submodel, the survival submodel and their association. The ability to capture truly linear and nonlinear associations is assessed in simulations and illustrated on the widely studied biomedical data on the rare fatal liver disease primary biliary cirrhosis. All methods are implemented in the R package bamlss to facilitate the application of this flexible joint model in practice. Comment: Changes to initial commit: minor language editing, additional information in Section 4, formatting in Supplementary Information

    Semiparametric Multinomial Logit Models for Analysing Consumer Choice Behaviour

    The multinomial logit model (MNL) is one of the most frequently used statistical models in marketing applications. It relates an unordered categorical response variable, for example representing the choice of a brand, to a vector of covariates such as the price of the brand or variables characterising the consumer. In its classical form, all covariates enter the utility function of the MNL model in strictly parametric, linear form. In this paper, we introduce semiparametric extensions, where smooth effects of continuous covariates are modelled by penalised splines. A mixed model representation of these penalised splines is employed to obtain estimates of the corresponding smoothing parameters, leading to a fully automated estimation procedure. To validate semiparametric models against parametric models, we utilise proper scoring rules and compare parametric and semiparametric approaches for a number of brand choice data sets.
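
As a rough illustration of the classical (fully parametric) MNL form described in this abstract, the sketch below computes brand choice probabilities from linear utilities. It is not the paper's estimation procedure: the function names, brand intercepts, price coefficient, and prices are all hypothetical values chosen for the example.

```python
import math


def linear_utilities(prices, intercepts, price_coef):
    """Utility of each brand: brand intercept plus a linear price effect.

    All arguments are illustrative assumptions, not values from the paper.
    """
    return [a + price_coef * p for a, p in zip(intercepts, prices)]


def mnl_choice_probabilities(utilities):
    """Multinomial logit probabilities: exp(u_j) / sum_k exp(u_k).

    The maximum utility is subtracted first for numerical stability;
    this does not change the resulting probabilities.
    """
    u_max = max(utilities)
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical three-brand example.
prices = [1.99, 2.49, 2.19]
intercepts = [0.0, 0.5, 0.2]
price_coef = -1.5  # assumed negative price sensitivity

probs = mnl_choice_probabilities(linear_utilities(prices, intercepts, price_coef))
print(probs)
```

In the semiparametric extension the abstract describes, the linear price term would be replaced by a smooth function of price (a penalised spline); the mapping from utilities to probabilities stays the same.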

    Penalized Likelihood and Bayesian Function Selection in Regression Models

    Challenging research in various fields has driven a wide range of methodological advances in variable selection for regression models with high-dimensional predictors. In comparison, selection of nonlinear functions in models with additive predictors has been considered only more recently. Several competing suggestions have been developed at about the same time and often do not refer to each other. This article provides a state-of-the-art review of function selection, focusing on penalized likelihood and Bayesian concepts, relating various approaches to each other in a unified framework. In an empirical comparison, also including boosting, we evaluate several methods through applications to simulated and real data, thereby providing some guidance on their performance in practice.