Interaction Analysis of Repeated Measure Data
Penalized variable selection methods have been developed extensively over the past two decades for analyzing high-dimensional omics data, such as gene expressions, single nucleotide polymorphisms (SNPs), and copy number variations (CNVs). However, lipidomics data have rarely been investigated with high-dimensional variable selection methods. This package incorporates our recently developed penalization procedures to conduct interaction analysis for high-dimensional lipidomics data with repeated measurements. The core module of this package is developed in C++. The development of this software package and the associated statistical methods has been partially supported by an Innovative Research Award from the Johnson Cancer Research Center, Kansas State University.
Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field.
Semiparametric GEE analysis in partially linear single-index models for longitudinal data
In this article, we study a partially linear single-index model for
longitudinal data under a general framework which includes both the sparse and
dense longitudinal data cases. A semiparametric estimation method based on a
combination of the local linear smoothing and generalized estimation equations
(GEE) is introduced to estimate the two parameter vectors as well as the
unknown link function. Under some mild conditions, we derive the asymptotic
properties of the proposed parametric and nonparametric estimators in different
scenarios, from which we find that the convergence rates and asymptotic
variances of the proposed estimators for sparse longitudinal data would be
substantially different from those for dense longitudinal data. We also discuss
the estimation of the covariance (or weight) matrices involved in the
semiparametric GEE method. Furthermore, we provide some numerical studies
including Monte Carlo simulation and an empirical application to illustrate our
methodology and theory.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1320 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
Estimation and model selection in generalized additive partial linear models for correlated data with diverging number of covariates
We propose generalized additive partial linear models for complex data which
allow one to capture nonlinear patterns of some covariates, in the presence of
linear components. The proposed method improves estimation efficiency and
increases statistical power for correlated data through incorporating the
correlation information. A unique feature of the proposed method is its
capability of handling model selection in cases where it is difficult to
specify the likelihood function. We derive the quadratic inference
function-based estimators for the linear coefficients and the nonparametric
functions when the dimension of covariates diverges, and establish asymptotic
normality for the linear coefficient estimators and the rates of convergence
for the nonparametric function estimators for both finite and high-dimensional
cases. The proposed method and theoretical development are quite challenging
since the numbers of linear covariates and nonlinear components both increase
as the sample size increases. We also propose a doubly penalized procedure for
variable selection which can simultaneously identify nonzero linear and
nonparametric components, and which has an asymptotic oracle property.
Extensive Monte Carlo studies have been conducted and show that the proposed
procedure works effectively even with moderate sample sizes. The proposed method
is illustrated with a pharmacokinetics study on renal cancer data.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1194 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).