Mixtures of Regression Models for Time-Course Gene Expression Data: Evaluation of Initialization and Random Effects
Finite mixture models are routinely applied to time-course microarray data.
Due to the complexity and size of this type of data, the choice of good starting values plays
an important role. So far, initialization strategies have only been investigated for data
from a mixture of multivariate normal distributions. In this work, several initialization
procedures are evaluated for mixtures of regression models with and without random
effects in an extensive simulation study on different artificial datasets. Finally, these
procedures are also applied to a real dataset from E. coli.
Fast Covariance Estimation for High-dimensional Functional Data
For smoothing covariance functions, we propose two fast algorithms that scale
linearly with the number of observations per function. Most available methods
and software cannot smooth covariance matrices of dimension J × J with
J > 500; the recently introduced sandwich smoother is an exception, but it is
not adapted to smooth covariance matrices of large dimensions such as J ≥ 10,000. Covariance matrices of order J = 10,000, and even J = 100,000, are
becoming increasingly common, e.g., in 2- and 3-dimensional medical imaging and
high-density wearable sensor data. We introduce two new algorithms that can
handle very large covariance matrices: 1) FACE: a fast implementation of the
sandwich smoother and 2) SVDS: a two-step procedure that first applies singular
value decomposition to the data matrix and then smooths the eigenvectors.
Compared to existing techniques, these new algorithms are at least an order of
magnitude faster in high dimensions and drastically reduce memory requirements.
The new algorithms provide instantaneous (few seconds) smoothing for matrices
of dimension J = 10,000 and very fast (about 10 minutes) smoothing for
J = 100,000. Although SVDS is simpler than FACE, we provide ready-to-use,
scalable R software for FACE. When incorporated into the R package refund,
FACE improves the speed of penalized functional regression by an order of
magnitude, even for data of normal size (J < 500). We recommend that FACE be
used in practice for the analysis of noisy and high-dimensional functional
data.
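The two-step SVDS idea described in this abstract (singular value decomposition of the data matrix, then smoothing of the eigenvectors) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `svds_sketch` and `moving_average_smooth` are hypothetical, and a simple moving average stands in for whatever smoother the paper actually uses.

```python
import numpy as np

def moving_average_smooth(v, window=5):
    """Stand-in smoother (an assumption, not the paper's smoother).

    Assumes `window` is odd so the output has the same length as `v`.
    """
    kernel = np.ones(window) / window
    padded = np.pad(v, window // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def svds_sketch(Y, n_components=2, window=5):
    """Y is an (n_curves, J) data matrix with one noisy function per row."""
    # Center the columns so the SVD relates to the sample covariance.
    Yc = Y - Y.mean(axis=0)
    # Step 1: SVD of the data matrix; the J x J covariance is never formed.
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    # Step 2: smooth each of the leading right singular vectors.
    V_smooth = np.array(
        [moving_average_smooth(Vt[k], window) for k in range(n_components)]
    )
    # Eigenvalues of the sample covariance implied by the singular values.
    eigvals = s[:n_components] ** 2 / (Y.shape[0] - 1)
    return eigvals, V_smooth
```

The smoothed covariance is then represented in low-rank form by `eigvals` and `V_smooth`, which avoids storing a J × J matrix. Note that the smoothed vectors are no longer exactly orthonormal; this sketch does not re-orthogonalize them.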
Nonlinear association structures in flexible Bayesian additive joint models
Joint models of longitudinal and survival data have become an important tool
for modeling associations between longitudinal biomarkers and event processes.
The association between marker and log-hazard is assumed to be linear in
existing shared random effects models, with this assumption usually remaining
unchecked. We present an extended framework of flexible additive joint models
that allows the estimation of nonlinear, covariate specific associations by
making use of Bayesian P-splines. Our joint models are estimated in a Bayesian
framework using structured additive predictors for all model components,
allowing for great flexibility in the specification of smooth nonlinear,
time-varying and random effects terms for the longitudinal submodel, the survival
submodel and their association. The ability to capture truly linear and
nonlinear associations is assessed in simulations and illustrated on the widely
studied biomedical data on the rare fatal liver disease primary biliary
cirrhosis. All methods are implemented in the R package bamlss to facilitate
the application of this flexible joint model in practice.
Semiparametric Multinomial Logit Models for Analysing Consumer Choice Behaviour
The multinomial logit model (MNL) is one of the most frequently used statistical models in marketing applications. It relates an unordered categorical response variable, for example representing the choice of a brand, to a vector of covariates such as the price of the brand or variables characterising the consumer. In its classical form, all covariates enter the utility function of the MNL model in strictly parametric, linear form. In this paper, we introduce semiparametric extensions, where smooth effects of continuous covariates are modelled by penalised splines. A mixed model representation of these penalised splines is employed to obtain estimates of the corresponding smoothing parameters, leading to a fully automated estimation procedure. To validate semiparametric models against parametric models, we utilise proper scoring rules and compare parametric and semiparametric approaches for a number of brand choice data sets.
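As a rough illustration of the classical MNL form described in this abstract, the sketch below computes choice probabilities from utilities that are linear in a single covariate (price). The function names, the single-covariate setup, and the coefficient values are assumptions made for illustration; the semiparametric extension in the paper would replace the linear price term with a smooth function estimated by penalised splines, which is not shown here.

```python
import numpy as np

def linear_utilities(price, beta_price, intercepts):
    """Classical MNL utilities: brand intercept plus a linear price effect."""
    return np.asarray(intercepts, dtype=float) + beta_price * np.asarray(price, dtype=float)

def mnl_probabilities(utilities):
    """Choice probabilities as a softmax over the brands' utilities."""
    u = np.asarray(utilities, dtype=float)
    # Subtract the maximum for numerical stability; probabilities are unchanged.
    u = u - u.max(axis=-1, keepdims=True)
    expu = np.exp(u)
    return expu / expu.sum(axis=-1, keepdims=True)

# Hypothetical example: three brands, a negative price coefficient.
prices = [1.0, 2.0, 3.0]
probs = mnl_probabilities(linear_utilities(prices, beta_price=-1.0, intercepts=[0.0, 0.0, 0.0]))
```

With equal intercepts and a negative price coefficient, the cheapest brand receives the highest choice probability.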
Penalized Likelihood and Bayesian Function Selection in Regression Models
Challenging research in various fields has driven a wide range of
methodological advances in variable selection for regression models with
high-dimensional predictors. In comparison, selection of nonlinear functions in
models with additive predictors has been considered only more recently. Several
competing suggestions have been developed at about the same time and often do
not refer to each other. This article provides a state-of-the-art review on
function selection, focusing on penalized likelihood and Bayesian concepts,
relating various approaches to each other in a unified framework. In an
empirical comparison, also including boosting, we evaluate several methods
through applications to simulated and real data, thereby providing some
guidance on their performance in practice.