A longitudinal study of student performance in English using repeated measures, multilevel and logistic regression models
This paper presents three statistical models that analyze longitudinal data
on student performance in English. A random sample, comprising male and
female students who attend either a state or a private school, was selected to
investigate gender and school bias in this subject. The English annual marks
attained by each student were recorded during the last three years in primary
schools. In the first approach, we present a repeated measures analysis of
variance that captures the correlation between the repeated measures. Several
tests are carried out to check for within-subjects and between-subjects effects,
equality of covariance matrices, and sphericity. In the second approach, we fit
a two-level random coefficient model to examine the effect of time on student
performance in English. This model allows the student-specific coefficients
describing individual trajectories to vary randomly. In the third approach, we
fit a logistic regression model to estimate the probability of passing the
Eleven-Plus examination that students sit for when they complete primary
education.
A generalized linear mixed model for longitudinal binary data with a marginal logit link function
Longitudinal studies of a binary outcome are common in the health, social,
and behavioral sciences. In general, a feature of random effects logistic
regression models for longitudinal binary data is that the marginal functional
form, when integrated over the distribution of the random effects, is no longer
of logistic form. Recently, Wang and Louis [Biometrika 90 (2003) 765--775]
proposed a random intercept model in the clustered binary data setting where
the marginal model has a logistic form. An acknowledged limitation of their
model is that it allows only a single random effect that varies from cluster to
cluster. In this paper we propose a modification of their model to handle
longitudinal data, allowing separate, but correlated, random intercepts at each
measurement occasion. The proposed model allows for a flexible correlation
structure among the random intercepts, where the correlations can be
interpreted in terms of Kendall's τ. For example, the marginal
correlations among the repeated binary outcomes can decline with increasing
time separation, while the model retains the property of having matching
conditional and marginal logit link functions. Finally, the proposed method is
used to analyze data from a longitudinal study designed to monitor cardiac
abnormalities in children born to HIV-infected women.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/10-AOAS390 by the
Institute of Mathematical Statistics (http://www.imstat.org)
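The abstract's remark that the marginal form of a random effects logistic model is no longer logistic can be checked numerically. The following is a minimal sketch, assuming a single normally distributed random intercept with standard deviation sigma (an illustrative choice, not the model proposed in the paper); the function names are hypothetical.

```python
import math

def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def marginal_prob(eta, sigma, n=2001, width=8.0):
    """Average expit(eta + b) over b ~ N(0, sigma^2) using the trapezoid rule.

    Assumption: one normal random intercept; the grid spans +/- width * sigma.
    """
    if sigma == 0.0:
        return expit(eta)
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        b = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += weight * expit(eta + b) * density
    return total * h

# Conditional log-odds of 1.0 shrink toward zero once the random
# intercept is integrated out, so the marginal model is not the
# same logistic model.
for eta in (0.0, 1.0, 2.0):
    print(eta, logit(marginal_prob(eta, sigma=2.0)))
```

The marginal log-odds are attenuated relative to the conditional ones, which is the mismatch that models with matching conditional and marginal logit links are designed to avoid.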
Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field.
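As a small illustration of the basis-function idea mentioned in the abstract, the sketch below represents a curve sampled on an equally spaced grid by a truncated Fourier basis. The choice of a Fourier basis, the grid on [0, 1), and the function names are assumptions made for illustration; the article itself covers basis functions in general.

```python
import math

def fourier_coefficients(y, K):
    """Project n equally spaced samples on [0, 1) onto K Fourier harmonics.

    Assumes K < n / 2 so that the sampled sines and cosines are orthogonal.
    """
    n = len(y)
    a0 = sum(y) / n
    a, b = [], []
    for k in range(1, K + 1):
        a.append(2.0 / n * sum(y[i] * math.cos(2 * math.pi * k * i / n) for i in range(n)))
        b.append(2.0 / n * sum(y[i] * math.sin(2 * math.pi * k * i / n) for i in range(n)))
    return a0, a, b

def evaluate(coefs, t):
    """Evaluate the truncated Fourier expansion at a point t."""
    a0, a, b = coefs
    value = a0
    for k in range(1, len(a) + 1):
        value += a[k - 1] * math.cos(2 * math.pi * k * t)
        value += b[k - 1] * math.sin(2 * math.pi * k * t)
    return value

# A curve observed on a discrete grid of 50 points.
n = 50
grid = [i / n for i in range(n)]
curve = [math.sin(2 * math.pi * t) + 0.5 * math.cos(4 * math.pi * t) for t in grid]

# Keeping only K = 2 harmonics is a simple form of regularization by truncation.
coefs = fourier_coefficients(curve, K=2)
print(coefs)
```

Choosing a small number of harmonics smooths within each function; a functional regression would then work with the coefficient vectors rather than the raw grid values.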
Nonlinear quantile mixed models
In regression applications, the presence of nonlinearity and correlation
among observations poses computational challenges not only in traditional
settings such as least squares regression, but also (and especially) when the
objective function is non-smooth as in the case of quantile regression. In this
paper, we develop methods for the modeling and estimation of nonlinear
conditional quantile functions when data are clustered within two-level nested
designs. This work represents an extension of the linear quantile mixed models
of Geraci and Bottai (2014, Statistics and Computing). We develop a novel
algorithm which is a blend of a smoothing algorithm for quantile regression and
a second-order Laplacian approximation for nonlinear mixed models. To assess
the proposed methods, we present a simulation study and two applications, one
in pharmacokinetics and one related to growth curve modeling in agriculture.
Comment: 26 pages, 8 figures, 8 tables
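The non-smooth objective the abstract refers to is the quantile check loss. The sketch below shows only that loss and the fact that minimizing it over a sample recovers a sample quantile; it does not implement the smoothing algorithm or the mixed-model machinery of the paper, and the function names are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def quantile_by_check_loss(values, tau):
    """Return the data point that minimizes the summed check loss.

    The objective is piecewise linear and convex, so a minimizer
    always exists among the observed values.
    """
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))

data = [1, 2, 3, 4, 100]
print(quantile_by_check_loss(data, 0.5))  # the median, robust to the outlier
print(quantile_by_check_loss(data, 0.9))  # an upper quantile
```

The kink of the check loss at zero is what makes gradient-based fitting awkward and motivates smoothed approximations.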
General Design Bayesian Generalized Linear Mixed Models
Linear mixed models are able to handle an extraordinary range of
complications in regression-type analyses. Their most common use is to account
for within-subject correlation in longitudinal data analysis. They are also the
standard vehicle for smoothing spatial count data. However, when treated in
full generality, mixed models can also handle spline-type smoothing and closely
approximate kriging. This allows for nonparametric regression models (e.g.,
additive models and varying coefficient models) to be handled within the mixed
model framework. The key is to allow the random effects design matrix to have
general structure; hence our label general design. For continuous response
data, particularly when Gaussianity of the response is reasonably assumed,
computation is now quite mature and supported by the R, SAS and S-PLUS
packages. Such is not the case for binary and count responses, where
generalized linear mixed models (GLMMs) are required, but are hindered by the
presence of intractable multivariate integrals. Software known to us supports
special cases of the GLMM (e.g., PROC NLMIXED in SAS or glmmML in R) or relies
on the sometimes crude Laplace-type approximation of integrals (e.g., the SAS
macro glimmix or glmmPQL in R). This paper describes the fitting of general
design generalized linear mixed models. A Bayesian approach is taken and Markov
chain Monte Carlo (MCMC) is used for estimation and inference. In this
generalized setting, MCMC requires sampling from nonstandard distributions. In
this article, we demonstrate that the MCMC package WinBUGS facilitates sound
fitting of general design Bayesian generalized linear mixed models in practice.
Comment: Published at http://dx.doi.org/10.1214/088342306000000015 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
A semiparametric regression model for paired longitudinal outcomes with application in childhood blood pressure development
This research examines the simultaneous influences of height and weight on
longitudinally measured systolic and diastolic blood pressure in children.
Previous studies have shown that both height and weight are positively
associated with blood pressure. In children, however, the concurrent increases
of height and weight have made it all but impossible to discern the effect of
height from that of weight. To better understand these influences, we propose
to examine the joint effect of height and weight on blood pressure. Bivariate
thin plate spline surfaces are used to accommodate the potentially nonlinear
effects as well as the interaction between height and weight. Moreover, we
consider a joint model for paired blood pressure measures, that is, systolic
and diastolic blood pressure, to account for the underlying correlation between
the two measures within the same individual. The bivariate spline surfaces are
allowed to vary across different groups of interest. We have developed related
model fitting and inference procedures. The proposed method is used to analyze
data from a real clinical investigation.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/12-AOAS567 by the
Institute of Mathematical Statistics (http://www.imstat.org)
Linear mixed models with endogenous covariates: modeling sequential treatment effects with application to a mobile health study
Mobile health is a rapidly developing field in which behavioral treatments
are delivered to individuals via wearables or smartphones to facilitate
health-related behavior change. Micro-randomized trials (MRT) are an
experimental design for developing mobile health interventions. In an MRT the
treatments are randomized numerous times for each individual over the course of the
trial. Along with assessing treatment effects, behavioral scientists aim to
understand between-person heterogeneity in the treatment effect. A natural
approach is the familiar linear mixed model. However, directly applying linear
mixed models is problematic because potential moderators of the treatment
effect are frequently endogenous---that is, may depend on prior treatment. We
discuss model interpretation and biases that arise in the absence of additional
assumptions when endogenous covariates are included in a linear mixed model. In
particular, when there are endogenous covariates, the coefficients no longer
have the customary marginal interpretation. However, these coefficients still
have a conditional-on-the-random-effect interpretation. We provide an
additional assumption that, if true, allows scientists to use standard software
to fit linear mixed models with endogenous covariates, and person-specific
predictions of effects can be provided. As an illustration, we assess the
effect of activity suggestion in the HeartSteps MRT and analyze the
between-person treatment effect heterogeneity.
Variable Selection for Generalized Linear Mixed Models by L1-Penalized Estimation
Generalized linear mixed models are a widely used tool for modeling longitudinal data. However, their use is typically restricted to few covariates, because the presence of many predictors yields unstable estimates. The presented approach to the fitting of generalized linear mixed
models includes an L1-penalty term that enforces variable selection and shrinkage simultaneously. A gradient ascent algorithm is proposed that maximizes the penalized log-likelihood, yielding models with reduced complexity. In contrast to common procedures, it can be used in high-dimensional settings where a large number of potentially influential explanatory variables is available. The method is investigated in simulation studies and illustrated using real data sets.
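To show how an L1 penalty produces selection and shrinkage at once, here is a minimal sketch using proximal gradient descent (soft-thresholding) on a plain linear model. This is a swapped-in simplification, not the gradient ascent algorithm for generalized linear mixed models described in the abstract; the toy data and function names are assumptions.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t, returning exactly 0.0 when |v| <= t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def lasso_ista(X, y, lam, step, iters=200):
    """Minimize (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 by proximal gradient steps."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(iters):
        resid = [y[i] - sum(X[i][j] * b[j] for j in range(p)) for i in range(n)]
        grad = [-sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(p)]
        b = [soft_threshold(b[j] - step * grad[j], step * lam) for j in range(p)]
    return b

# Toy data: the response depends strongly on column 0 and weakly on column 1.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [3.1, 2.9, -2.9, -3.1]

coef = lasso_ista(X, y, lam=0.5, step=0.5)
print(coef)  # the weak predictor is dropped, the strong one is shrunk
```

The penalty sets the weak coefficient exactly to zero (selection) while pulling the strong one toward zero (shrinkage), which is the behavior the abstract attributes to the L1 term.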