672,774 research outputs found

Recommended from our members

Methods for functional regression and nonlinear mixed-effects models with applications to PET data

Author: Chen Yakuan
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2017

This thesis focuses on methods for functional regression and nonlinear mixed-effects models with applications to PET data. The first part considers the problem of variable selection in regression models with functional responses and scalar predictors. We pose the function-on-scalar model as a multivariate regression problem and use group-MCP for variable selection. We account for residual covariance by "pre-whitening" using an estimate of the covariance matrix, and establish theoretical properties for the resulting estimator. We further develop an iterative algorithm that alternately updates the spline coefficients and covariance. We illustrate our method by applying it to two-dimensional planar reaching motions in a study of the effects of stroke severity on motor control. The second part introduces a functional data analytic approach for the estimation of the impulse response function (IRF), which is necessary for describing the binding behavior of the radiotracer. Virtually all existing methods have three common aspects: summarizing the entire IRF with a single scalar measure; modeling each subject separately; and imposing parametric restrictions on the IRF. In contrast, we propose a functional data analytic approach that regards each subject's IRF as the basic analysis unit, models multiple subjects simultaneously, and estimates the IRF nonparametrically. We pose our model as a linear mixed-effects model in which shrinkage and roughness penalties are incorporated to enforce identifiability and smoothness of the estimated curves, respectively, while monotonicity and non-negativity constraints impose biological information on the estimates. We illustrate this approach by applying it to clinical PET data. The third part discusses a nonlinear mixed-effects modeling approach for PET data analysis under the assumption of a compartment model. Traditional nonlinear least squares (NLS) estimators of the population parameters are applied in a two-stage analysis, which introduces instability and neglects the variation in rate parameters. In contrast, we propose to estimate the rate parameters by fitting nonlinear mixed-effects (NLME) models, in which all subjects are modeled simultaneously by allowing rate parameters to have random effects, so that population parameters can be estimated directly from the joint model. Simulations are conducted to compare the power of tests based on NLS and NLME models to detect group effects in both the rate parameters and summary measures. We apply our NLME approach to clinical PET data to illustrate the model building procedure.

Columbia University Academic Commons

An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests

Author: Malley James, Strobl Carolin, Tutz Gerhard
Publication date: 01/04/2009

Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Random forests in particular, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine and bioinformatics within the past few years. High dimensional problems are common not only in genetics, but also in some areas of psychological research, where only a few subjects can be measured due to time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve high prediction accuracy in such applications, and provide descriptive variable importance measures reflecting the impact of each variable in both main effects and interactions. The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high dimensional data exploration, and to point out limitations of the methods and potential pitfalls in their practical application. Application of the methods is illustrated using freely available implementations in the R system for statistical computing.

Crossref

Open Access LMU

PubMed Central

Identification and Estimation of Partial Effects with Proxy Variables

Author: Nagasawa Kenichi
Publication date: 02/10/2020

I develop a new identification approach for partial effects in nonseparable models with endogeneity. I use a proxy variable for the unobserved heterogeneity correlated with the endogenous variable to construct a valid control function, where the definition of a proxy variable is the same as in the measurement error literature. The identifying assumptions are distinct from existing methods, in particular instrumental variables and selection on observables approaches, and I provide an alternative identification strategy in settings where existing approaches are not applicable. Building on the identification result, I consider three estimation approaches, ranging from nonparametric to flexible parametric methods, and characterize asymptotic properties of the proposed estimators.

arXiv.org e-Print Archive

Sparse Bayesian variable selection for the identification of antigenic variability in the Foot-and-Mouth disease virus

Author: Davies Vinny, Harvey William, Husmeier Dirk, Maree Francois, Reeve Richard
Publication venue: PMLR
Publication date: 01/04/2014

Vaccines created from closely related viruses are vital for offering protection against newly emerging strains. For Foot-and-Mouth disease virus (FMDV), where multiple serotypes co-circulate, testing large numbers of vaccines can be infeasible. Therefore the development of an in silico predictor of cross-protection between strains is important to help optimise vaccine choice. Here we describe a novel sparse Bayesian variable selection model using spike and slab priors which is able to predict antigenic variability and identify sites which are important for the neutralisation of the virus. We are able to identify multiple residues which are known to be key indicators of antigenic variability. Many of these were not identified previously using frequentist mixed-effects models and still cannot be found when an ℓ1 penalty is used. We further explore how the Markov chain Monte Carlo (MCMC) proposal method for the inclusion of variables can offer significant reductions in computational requirements, both for spike and slab priors in general, and our hierarchical Bayesian model in particular.

Enlighten

Design of Experiments for Screening

Author: A Boukouvalas, A Marrel, A Miller, A Saltelli, A.E. Vine, AB Owen, AC Atkinson, AM Dean, B Abraham, B Bettonvil, B. Tang, BA Jones, C Daniel, C Linkletter, C.F.J. Wu, CA Mauro, CE Rasmussen, CJ Marley, CR Rao, CS Cheng, D Draguljić, D Dupuy, D Scott-Drechsel, D. Xing, D.T. Voss, DA Bulutoglu, DJ Finney, DKJ Lin, EI George, F Campolongo, F Satterthwaite, FKH Phoa, G Damblin, G Pujol, G.S. Watson, GEP Box, GM James, H Moon, H. Wan, H. Xu, H. Yang, H.B.E. Wan, HA Chipman, JL Loeppky, JPC Kleijnen, K.Q. Ye, KHV Booth, KJ Ryan, KP Burnham, KT Fang, L Pronzato, L. Xiao, M Claeys-Bruno, M Hamada, M Johnson, M Liu, M.A. Wolters, MD McKay, MD Morris, MJ Hall, N Durrande, NA Butler, NK Nguyen, PR Scinto, PZG Qian, R Dorfman, R Jin, R Joseph, RB Gramacy, RK Meyer, RL Iman, RL Plackett, RV Lenth, S Ba, SC Cotter, SG Gilmour, SM Lewis, TJ Santner, VE Bowman, W DuMouchel, W Li, W.J. Welch, WA Brenneman, WW Li, X Qu, Y Benjamini, Y Liu
Publication date: 18/10/2015

The aim of this paper is to review methods of designing screening experiments, ranging from designs originally developed for physical experiments to those especially tailored to experiments on numerical models. The strengths and weaknesses of the various designs for screening variables in numerical models are discussed. First, classes of factorial designs for experiments to estimate main effects and interactions through a linear statistical model are described, specifically regular and nonregular fractional factorial designs, supersaturated designs and systematic fractional replicate designs. Generic issues of aliasing, bias and cancellation of factorial effects are discussed. Second, group screening experiments are considered, including factorial group screening and sequential bifurcation. Third, random sampling plans are discussed, including Latin hypercube sampling and sampling plans to estimate elementary effects. Fourth, a variety of modelling methods commonly employed with screening designs are briefly described. Finally, a novel study demonstrates six screening methods on two frequently used exemplars, and their performances are compared.

arXiv.org e-Print Archive

Crossref

Effect fusion using model-based clustering

Author: Malsiner-Walli Gertraud, Pauger Daniela, Wagner Helga
Publication date: 22/03/2017

In social and economic studies many of the collected variables are measured on a nominal scale, often with a large number of categories. The definition of categories is usually ambiguous, and different classification schemes using either a finer or a coarser grid are possible. Categorisation has an impact when such a variable is included as a covariate in a regression model: too fine a grid will result in imprecise estimates of the corresponding effects, whereas with too coarse a grid important effects will be missed, resulting in biased effect estimates and poor predictive performance. To achieve automatic grouping of levels with essentially the same effect, we adopt a Bayesian approach and specify the prior on the level effects as a location mixture of spiky normal components. Fusion of level effects is induced by a prior on the mixture weights which encourages empty components. Model-based clustering of the effects during MCMC sampling makes it possible to simultaneously detect categories which have essentially the same effect size and identify variables with no effect at all. The properties of this approach are investigated in simulation studies. Finally, the method is applied to analyse effects of high-dimensional categorical predictors on income in Austria.

arXiv.org e-Print Archive

Elektronische Publikationen der Wirtschaftsuniversität Wien

Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models

Author: Fabian Scheipl, Fahrmeir L., Hothorn T., Lewis B., Ludwig Fahrmeir, Polson N., Sabanés Bové D., Scheipl F., Thomas Kneib
Publication venue: 'Informa UK Limited'
Publication date: 02/12/2011

Structured additive regression provides a general framework for complex Gaussian and non-Gaussian regression models, with predictors comprising arbitrary combinations of nonlinear functions and surfaces, spatial effects, varying coefficients, random effects and further regression terms. The great flexibility of structured additive regression makes function selection a challenging and important task, aiming at (1) selecting the relevant covariates, (2) choosing an appropriate and parsimonious representation of the impact of covariates on the predictor and (3) determining the required interactions. We propose a spike-and-slab prior structure for function selection that allows the inclusion or exclusion of single coefficients as well as blocks of coefficients representing specific model terms. A novel multiplicative parameter expansion is required to obtain good mixing and convergence properties in a Markov chain Monte Carlo simulation approach and is shown to induce desirable shrinkage properties. In simulation studies and with (real) benchmark classification data, we investigate sensitivity to hyperparameter settings and compare performance to competitors. The flexibility and applicability of our approach are demonstrated in an additive piecewise exponential model with time-varying effects for right-censored survival times of intensive care patients with sepsis. Geoadditive and additive mixed logit model applications are discussed in an extensive appendix.

arXiv.org e-Print Archive

Crossref