Methods for functional regression and nonlinear mixed-effects models with applications to PET data
The overall theme of this thesis focuses on methods for functional regression and nonlinear mixed-effects models with applications to PET data.
The first part considers the problem of variable selection in regression models with functional responses and scalar predictors. We pose the function-on-scalar model as a multivariate regression problem and use group-MCP for variable selection. We account for residual covariance by "pre-whitening" using an estimate of the covariance matrix, and establish theoretical properties for the resulting estimator. We further develop an iterative algorithm that alternately updates the spline coefficients and covariance. Our method is illustrated by the application to two-dimensional planar reaching motions in a study of the effects of stroke severity on motor control.
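For readers unfamiliar with the penalty named in this abstract, the following is a minimal sketch of the minimax concave penalty (MCP) and its group version, in which the penalty is applied to the Euclidean norm of each coefficient group. The function names, the grouping (assumed here to be the spline coefficients belonging to one scalar predictor), and the omission of any group-size weighting are illustrative assumptions, not details taken from the thesis.

```python
import math


def mcp_penalty(t, lam, gamma):
    """Minimax concave penalty at t >= 0 with tuning parameters lam > 0, gamma > 1."""
    if t <= gamma * lam:
        # Concave quadratic region: starts with slope lam and flattens out.
        return lam * t - t * t / (2.0 * gamma)
    # Beyond gamma * lam the penalty is constant, so large effects are not shrunk.
    return 0.5 * gamma * lam * lam


def group_mcp(coef_groups, lam, gamma):
    """Sum of MCP penalties applied to the Euclidean norm of each group.

    coef_groups is a list of coefficient lists (hypothetical layout:
    one list per scalar predictor).
    """
    total = 0.0
    for group in coef_groups:
        norm = math.sqrt(sum(b * b for b in group))
        total += mcp_penalty(norm, lam, gamma)
    return total
```

Because the penalty depends on a group only through its norm, a group is either kept or dropped as a whole, which is what makes it suitable for selecting predictors rather than individual coefficients.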
The second part introduces a functional data analytic approach for the estimation of the IRF, which is necessary for describing the binding behavior of the radiotracer. Virtually all existing methods have three common aspects: summarizing the entire IRF with a single scalar measure; modeling each subject separately; and the imposition of parametric restrictions on the IRF. In contrast, we propose a functional data analytic approach that regards each subject's IRF as the basic analysis unit, models multiple subjects simultaneously, and estimates the IRF nonparametrically. We pose our model as a linear mixed effect model in which shrinkage and roughness penalties are incorporated to enforce identifiability and smoothness of the estimated curves, respectively, while monotonicity and non-negativity constraints impose biological information on estimates. We illustrate this approach by applying it to clinical PET data.
The third part discusses a nonlinear mixed-effects modeling approach for PET data analysis under the assumption of a compartment model. The traditional NLS estimators of the population parameters are applied in a two-stage analysis, which introduces instability and neglects the variation in rate parameters. In contrast, we propose to estimate the rate parameters by fitting nonlinear mixed-effects (NLME) models, in which all subjects are modeled simultaneously by allowing rate parameters to have random effects, so that population parameters can be estimated directly from the joint model. Simulations are conducted to compare the power of tests based on NLS and NLME models to detect group effects in both the rate parameters and the summarized measures. We apply our NLME approach to clinical PET data to illustrate the model building procedure.
An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests
Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Random forests in particular, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine and bioinformatics within the past few years.
High dimensional problems are common not only in genetics, but also in some areas of psychological research, where only a few subjects can be measured due to time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve high prediction accuracy in such applications, and provide descriptive variable importance measures reflecting the impact of each variable in both main effects and interactions.
The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application.
Application of the methods is illustrated using freely available implementations in the R system for statistical computing.
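The core step of recursive partitioning, choosing one binary split of one predictor, can be sketched as follows. This is a minimal illustration that assumes the Gini impurity criterion used in CART-style classification trees; other tree algorithms use different split criteria, and the function names are hypothetical rather than taken from any R package.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(x, y):
    """Return (threshold, weighted_impurity) of the best split x <= threshold.

    Returns (None, impurity_of_parent) when no split improves on the parent node.
    """
    pairs = sorted(zip(x, y), key=lambda pair: pair[0])
    n = len(pairs)
    best = (None, gini(y))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical predictor values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            # Place the threshold midway between the two neighbouring values.
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2.0, score)
    return best
```

A tree is grown by applying this search to every predictor, splitting on the best one, and recursing on the two child nodes. Bagging and random forests repeat the procedure on bootstrap samples, with random forests also restricting each split to a random subset of predictors.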
Identification and Estimation of Partial Effects with Proxy Variables
I develop a new identification approach for partial effects in nonseparable
models with endogeneity. I use a proxy variable for the unobserved
heterogeneity correlated with the endogenous variable to construct a valid
control function, where the definition of a proxy variable is the same as in
the measurement error literature. The identifying assumptions are distinct from
existing methods, in particular instrumental variables and selection on
observables approaches, and I provide an alternative identification strategy in
settings where existing approaches are not applicable. Building on the
identification result, I consider three estimation approaches, ranging from
nonparametric to flexible parametric methods, and characterize asymptotic
properties of the proposed estimators. Comment: 48 pages with the appendix
Sparse Bayesian variable selection for the identification of antigenic variability in the Foot-and-Mouth disease virus
Vaccines created from closely related viruses are vital for offering protection against newly emerging strains. For Foot-and-Mouth disease virus (FMDV), where multiple serotypes co-circulate, testing large numbers of vaccines can be infeasible. Therefore the development of an in silico predictor of cross-protection between strains is important to help optimise vaccine choice. Here we describe a novel sparse Bayesian variable selection model using spike and slab priors which is able to predict antigenic variability and identify sites which are important for the neutralisation of the virus. We are able to identify multiple residues which are known to be key indicators of antigenic variability. Many of these were not identified previously using frequentist mixed-effects models and still cannot be found when an ℓ1 penalty is used. We further explore how the Markov chain Monte Carlo (MCMC) proposal method for the inclusion of variables can offer significant reductions in computational requirements, both for spike and slab priors in general, and our hierarchical Bayesian model in particular.
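To make the spike and slab idea concrete, the sketch below computes the posterior inclusion probability in the simplest possible setting: a single observation y with Gaussian noise, whose mean is exactly zero with prior probability 1 - prior_incl (the spike) and otherwise drawn from a zero-mean normal (the slab). This toy model, its closed form, and all parameter names are illustrative assumptions. It is not the hierarchical model of the paper, which relies on MCMC.

```python
import math


def normal_pdf(y, var):
    """Density of a zero-mean normal distribution with variance var at y."""
    return math.exp(-0.5 * y * y / var) / math.sqrt(2.0 * math.pi * var)


def inclusion_probability(y, prior_incl, noise_var, slab_var):
    """Posterior probability that the effect is non-zero given one observation y.

    Under the spike, y ~ N(0, noise_var).
    Under the slab,  y ~ N(0, noise_var + slab_var) after integrating out the effect.
    """
    slab = prior_incl * normal_pdf(y, noise_var + slab_var)
    spike = (1.0 - prior_incl) * normal_pdf(y, noise_var)
    return slab / (slab + spike)
```

Observations near zero are better explained by the spike, so their inclusion probability falls below the prior. Observations far from zero are better explained by the wider slab, and their inclusion probability approaches one.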
Design of Experiments for Screening
The aim of this paper is to review methods of designing screening
experiments, ranging from designs originally developed for physical experiments
to those especially tailored to experiments on numerical models. The strengths
and weaknesses of the various designs for screening variables in numerical
models are discussed. First, classes of factorial designs for experiments to
estimate main effects and interactions through a linear statistical model are
described, specifically regular and nonregular fractional factorial designs,
supersaturated designs and systematic fractional replicate designs. Generic
issues of aliasing, bias and cancellation of factorial effects are discussed.
Second, group screening experiments are considered including factorial group
screening and sequential bifurcation. Third, random sampling plans are
discussed including Latin hypercube sampling and sampling plans to estimate
elementary effects. Fourth, a variety of modelling methods commonly employed
with screening designs are briefly described. Finally, a novel study
demonstrates six screening methods on two frequently-used exemplars, and their
performances are compared.
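Of the random sampling plans mentioned, Latin hypercube sampling is the easiest to illustrate. The following minimal sketch, written under the assumption of a unit hypercube and equal-width strata, draws n_points samples so that each of the n_points strata of every input dimension contains exactly one point. The function name and signature are hypothetical.

```python
import random


def latin_hypercube(n_points, n_dims, rng=None):
    """Draw n_points samples in [0, 1)^n_dims with one point per stratum per dimension."""
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        # Randomly assign the strata 0..n_points-1 to the points in this dimension.
        strata = list(range(n_points))
        rng.shuffle(strata)
        # Jitter each point uniformly within its stratum.
        columns.append([(s + rng.random()) / n_points for s in strata])
    # Transpose so that each row is one design point.
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_points)]
```

Passing a seeded generator, for example `latin_hypercube(5, 3, random.Random(0))`, makes the design reproducible.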
Effect fusion using model-based clustering
In social and economic studies many of the collected variables are measured
on a nominal scale, often with a large number of categories. The definition of
categories is usually not unambiguous and different classification schemes
using either a finer or a coarser grid are possible. Categorisation has an
impact when such a variable is included as a covariate in a regression model: an
overly fine grid will result in imprecise estimates of the corresponding effects,
whereas with an overly coarse grid important effects will be missed, resulting in
biased effect estimates and poor predictive performance.
To achieve automatic grouping of levels with essentially the same effect, we
adopt a Bayesian approach and specify the prior on the level effects as a
location mixture of spiky normal components. Fusion of level effects is induced
by a prior on the mixture weights which encourages empty components.
Model-based clustering of the effects during MCMC sampling makes it possible to
simultaneously detect categories which have essentially the same effect size
and identify variables with no effect at all. The properties of this approach
are investigated in simulation studies. Finally, the method is applied to
analyse effects of high-dimensional categorical predictors on income in
Austria.
Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models
Structured additive regression provides a general framework for complex
Gaussian and non-Gaussian regression models, with predictors comprising
arbitrary combinations of nonlinear functions and surfaces, spatial effects,
varying coefficients, random effects and further regression terms. The large
flexibility of structured additive regression makes function selection a
challenging and important task, aiming at (1) selecting the relevant
covariates, (2) choosing an appropriate and parsimonious representation of the
impact of covariates on the predictor and (3) determining the required
interactions. We propose a spike-and-slab prior structure for function
selection that allows the inclusion or exclusion of single coefficients as well as
blocks of coefficients representing specific model terms. A novel
multiplicative parameter expansion is required to obtain good mixing and
convergence properties in a Markov chain Monte Carlo simulation approach and is
shown to induce desirable shrinkage properties. In simulation studies and with
(real) benchmark classification data, we investigate sensitivity to
hyperparameter settings and compare performance to competitors. The flexibility
and applicability of our approach are demonstrated in an additive piecewise
exponential model with time-varying effects for right-censored survival times
of intensive care patients with sepsis. Geoadditive and additive mixed logit
model applications are discussed in an extensive appendix.