Penalized variable selection procedure for Cox models with semiparametric relative risk
We study Cox models with semiparametric relative risk, which can be
partially linear with one nonparametric component, or have multiple additive or
nonadditive nonparametric components. A penalized partial likelihood procedure
is proposed to simultaneously estimate the parameters and select variables for
both the parametric and the nonparametric parts. Two penalties are applied
sequentially. The first penalty, governing the smoothness of the multivariate
nonlinear covariate effect function, provides a smoothing spline ANOVA
framework that is exploited to derive an empirical model selection tool for the
nonparametric part. The second penalty, either the
smoothly-clipped-absolute-deviation (SCAD) penalty or the adaptive LASSO
penalty, achieves variable selection in the parametric part. We show that the
resulting estimator of the parametric part possesses the oracle property, and
that the estimator of the nonparametric part achieves the optimal rate of
convergence. The proposed procedures are shown to work well in simulation
experiments, and then applied to a real data example on sexually transmitted
diseases.
Published at http://dx.doi.org/10.1214/09-AOS780 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
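In generic notation (the symbols below are shorthand, not the authors' exact parameterisation), the model and its doubly penalized estimation criterion can be sketched as:

```latex
% Cox model with parametric covariates x and nonparametric covariates z:
\lambda(t \mid x, z) = \lambda_0(t)\,\exp\{\beta^\top x + \eta(z)\},
% where \eta lives in a smoothing spline ANOVA space, e.g.
% \eta(z) = \sum_j \eta_j(z_j) + \sum_{j<k} \eta_{jk}(z_j, z_k) + \cdots
%
% Penalized partial log-likelihood with the two penalties applied in turn:
% a roughness penalty J(\eta) for the nonparametric part, and a sparsity
% penalty p_{\mu}(\cdot) (SCAD or adaptive LASSO) for the parametric part
-\ell_n(\beta, \eta) \;+\; \tau\, J(\eta) \;+\; \sum_{j=1}^{p} p_{\mu}(|\beta_j|)
```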
Most Likely Transformations
We propose and study properties of maximum likelihood estimators in the class
of conditional transformation models. Based on a suitable explicit
parameterisation of the unconditional or conditional transformation function,
we establish a cascade of increasingly complex transformation models that can
be estimated, compared and analysed in the maximum likelihood framework. Models
for the unconditional or conditional distribution function of any univariate
response variable can be set up and estimated in the same theoretical and
computational framework simply by choosing an appropriate transformation
function and parameterisation thereof. The ability to evaluate the distribution
function directly allows us to estimate models based on the exact likelihood,
especially in the presence of random censoring or truncation. For discrete and
continuous responses, we establish the asymptotic normality of the proposed
estimators. A reference software implementation of maximum likelihood-based
estimation for conditional transformation models allowing the same flexibility
as the theory developed here was employed to illustrate the wide range of
possible applications.
Accepted for publication by the Scandinavian Journal of Statistics, 2017-06-1.
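The core construction can be written compactly (the notation below is a generic sketch, not the paper's exact parameterisation): the conditional distribution function is modelled as a fixed reference distribution F_Z evaluated at a monotone transformation h, which is why exact likelihood contributions for censored observations come for free:

```latex
% Conditional distribution via a monotone transformation function h:
P(Y \le y \mid X = x) = F_Z\!\big(h(y \mid x)\big)
% Exact likelihood contribution of an interval-censored observation
% (\underline{y}_i, \bar{y}_i]: no density approximation is needed,
% because the distribution function itself is available in closed form
L_i = F_Z\!\big(h(\bar{y}_i \mid x_i)\big) - F_Z\!\big(h(\underline{y}_i \mid x_i)\big)
```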
Conditional Transformation Models
The ultimate goal of regression analysis is to obtain information about the
conditional distribution of a response given a set of explanatory variables.
This goal is, however, seldom achieved because most established regression
models only estimate the conditional mean as a function of the explanatory
variables and assume that higher moments are not affected by the regressors.
The underlying reason for such a restriction is the assumption of additivity of
signal and noise. We propose to relax this common assumption in the framework
of transformation models. The novel class of semiparametric regression models
proposed herein allows transformation functions to depend on explanatory
variables. These transformation functions are estimated by regularised
optimisation of scoring rules for probabilistic forecasts, e.g. the continuous
ranked probability score. The corresponding estimated conditional distribution
functions are consistent. Conditional transformation models are potentially
useful for describing possible heteroscedasticity, comparing spatially varying
distributions, identifying extreme events, deriving prediction intervals and
selecting variables beyond mean regression effects. An empirical investigation
based on a heteroscedastic varying coefficient simulation model demonstrates
that semiparametric estimation of conditional distribution functions can be
more beneficial than kernel-based non-parametric approaches or parametric
generalised additive models for location, scale and shape.
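The continuous ranked probability score mentioned above is straightforward to evaluate numerically. Below is a minimal stdlib-only sketch (function names are mine, not from the paper) that integrates the squared difference between a forecast CDF and the step function at the observation. For a standard normal forecast and observation y = 0, the closed-form Gaussian CRPS equals 2φ(0) − 1/√π ≈ 0.2337, which the quadrature reproduces:

```python
import math

def norm_cdf(z, mu=0.0, sigma=1.0):
    # Normal CDF via the error function (standard library only).
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def crps(cdf, y, lo=-10.0, hi=10.0, n=20001):
    # CRPS(F, y) = integral of (F(z) - 1{y <= z})^2 dz,
    # approximated with the trapezoidal rule on [lo, hi].
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        z = lo + i * h
        diff = cdf(z) - (1.0 if y <= z else 0.0)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * diff * diff
    return total * h

score = crps(norm_cdf, 0.0)
print(round(score, 4))  # close to 0.2337, the closed-form Gaussian CRPS at y = 0
```

Lower scores indicate better probabilistic forecasts, so an observation further in the tail of the forecast distribution is penalized more heavily.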
Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field.
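The three regression types listed above are conventionally written as follows (a standard schematic, not specific to any one method in the survey):

```latex
% [1] functional predictor regression (scalar-on-function):
y_i = \alpha + \int_{\mathcal{T}} x_i(t)\,\beta(t)\,dt + \varepsilon_i
% [2] functional response regression (function-on-scalar):
y_i(t) = \beta_0(t) + \sum_{j} x_{ij}\,\beta_j(t) + \varepsilon_i(t)
% [3] function-on-function regression:
y_i(t) = \beta_0(t) + \int_{\mathcal{S}} x_i(s)\,\beta(s,t)\,ds + \varepsilon_i(t)
% In each case the coefficient functions are expanded in basis functions
% (splines, wavelets, principal components) and estimated with regularization.
```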
General Semiparametric Shared Frailty Model Estimation and Simulation with frailtySurv
The R package frailtySurv for simulating and fitting semi-parametric shared
frailty models is introduced. Package frailtySurv implements semi-parametric
consistent estimators for a variety of frailty distributions, including gamma,
log-normal, inverse Gaussian and power variance function, and provides
consistent estimators of the standard errors of the parameters' estimators. The
parameters' estimators are asymptotically normally distributed, and therefore
statistical inference based on the results of this package, such as hypothesis
testing and confidence intervals, can be performed using the normal
distribution. Extensive simulations demonstrate the flexibility and correct
implementation of the estimator. Two case studies performed with publicly
available datasets demonstrate applicability of the package. In the Diabetic
Retinopathy Study, the onset of blindness is clustered by patient, and in a
large hard drive failure dataset, failure times are thought to be clustered by
the hard drive manufacturer and model.
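The shared frailty model fitted by the package has the following generic form (a notational sketch; see the package documentation for its exact parameterisation):

```latex
% Hazard for subject j in cluster i, with cluster-level frailty \omega_i:
\lambda_{ij}(t \mid \omega_i) = \omega_i\,\lambda_0(t)\,\exp(\beta^\top x_{ij})
% The \omega_i are i.i.d. positive random effects -- e.g. gamma, log-normal,
% inverse Gaussian, or power variance function -- shared by all members of
% cluster i (a patient's two eyes, or drives of one manufacturer and model).
```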
A Semiparametrically Efficient Estimator of the Time‐Varying Effects for Survival Data with Time‐Dependent Treatment
The timing of a time‐dependent treatment—for example, when to perform a kidney transplantation—is an important factor for evaluating treatment efficacy. A naïve comparison between the treated and untreated groups, while ignoring the timing of treatment, typically yields biased results that might favour the treated group because only patients who survive long enough will get treated. On the other hand, studying the effect of a time‐dependent treatment is often complex, as it involves modelling treatment history and accounting for the possible time‐varying nature of the treatment effect. We propose a varying‐coefficient Cox model that investigates the efficacy of a time‐dependent treatment by utilizing a global partial likelihood, which renders appealing statistical properties, including consistency, asymptotic normality and semiparametric efficiency. Extensive simulations verify the finite sample performance, and we apply the proposed method to study the efficacy of kidney transplantation for end‐stage renal disease patients in the US Scientific Registry of Transplant Recipients.
Peer reviewed. Available at http://deepblue.lib.umich.edu/bitstream/2027.42/134221/1/sjos12196_am.pdf and http://deepblue.lib.umich.edu/bitstream/2027.42/134221/2/sjos12196.pd
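A schematic version of such a varying-coefficient Cox model (my notation, not necessarily the authors' exact specification) makes the time-varying treatment effect explicit:

```latex
% N_i(t) = 1 if subject i has received the treatment (e.g. a transplant)
% by time t, and 0 otherwise; \alpha(t) is the time-varying treatment effect
\lambda_i\big(t \mid Z_i, N_i(\cdot)\big)
  = \lambda_0(t)\,\exp\{\alpha(t)\,N_i(t) + \gamma^\top Z_i\}
```

Because N_i(t) switches on only at the individual's treatment time, the comparison at each event time is between subjects who have and have not yet been treated, avoiding the survivor bias of a naïve treated-versus-untreated comparison.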
Robust Inference for Univariate Proportional Hazards Frailty Regression Models
We consider a class of semiparametric regression models which are
one-parameter extensions of the Cox [J. Roy. Statist. Soc. Ser. B 34 (1972)
187-220] model for right-censored univariate failure times. These models assume
that, given the covariates and a random frailty unique to each individual, the
hazard has the proportional hazards form multiplied by the frailty.
The frailty is assumed to have mean 1 within a known one-parameter family of
distributions. Inference is based on a nonparametric likelihood. The behavior
of the likelihood maximizer is studied under general conditions where the
fitted model may be misspecified. The joint estimator of the regression and
frailty parameters as well as the baseline hazard is shown to be uniformly
consistent for the pseudo-value maximizing the asymptotic limit of the
likelihood. Appropriately standardized, the estimator converges weakly to a
Gaussian process. When the model is correctly specified, the procedure is
semiparametric efficient, achieving the semiparametric information bound for
all parameter components. It is also proved that the bootstrap gives valid
inferences for all parameters, even under misspecification.
We demonstrate analytically the importance of the robust inference in several
examples. In a randomized clinical trial, a valid test of the treatment effect
is possible when other prognostic factors and the frailty distribution are both
misspecified. Under certain conditions on the covariates, the ratios of the
regression parameters are still identifiable. The practical utility of the
procedure is illustrated on a non-Hodgkin's lymphoma dataset.
Published by the Institute of Mathematical Statistics (http://www.imstat.org) in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/00905360400000053
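To make the frailty construction concrete, here is a small stdlib-only simulation sketch (the function name, the unit exponential baseline hazard and the parameter values are my choices for illustration): each subject receives a mean-one gamma frailty ξ, and a failure time is drawn from the hazard ξ λ0(t) exp(βz) by inverse-transform sampling:

```python
import math
import random

def simulate_frailty_data(n=10000, theta=0.5, beta=0.7, seed=1):
    # Gamma frailty with mean 1 and variance theta: shape 1/theta, scale theta.
    rng = random.Random(seed)
    times, frailties = [], []
    for _ in range(n):
        xi = rng.gammavariate(1.0 / theta, theta)
        z = rng.random()  # one uniform covariate, purely illustrative
        # With unit exponential baseline hazard, the survival function is
        # S(t | xi, z) = exp(-xi * exp(beta * z) * t), so inverse-transform
        # sampling from a uniform draw gives:
        t = -math.log(1.0 - rng.random()) / (xi * math.exp(beta * z))
        frailties.append(xi)
        times.append(t)
    return times, frailties

times, frailties = simulate_frailty_data()
print(sum(frailties) / len(frailties))  # near 1.0, the assumed frailty mean
```

In a real analysis the frailties are of course unobserved; the point of the sketch is only that the marginal hazard is over-dispersed relative to a plain Cox model, which is what the robust inference in the paper has to contend with under misspecification.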
Conditional Growth Charts (Discussion Paper)
Growth charts are often more informative when they are customized per
subject, taking into account prior measurements and possibly other covariates
of the subject. We study a global semiparametric quantile regression model that
has the ability to estimate conditional quantiles without the usual
distributional assumptions. The model can be estimated from longitudinal
reference data with irregular measurement times and with some level of
robustness against outliers, and it is also flexible for including covariate
information. We propose a rank score test for large sample inference on
covariates, and develop a new model assessment tool for longitudinal growth
data. Our research indicates that the global model has the potential to be a
very useful tool in conditional growth chart analysis.
This paper is discussed in [math/0702636], [math/0702640], [math/0702641] and [math/0702642]; rejoinder in [math.ST/0702643]. Published at http://dx.doi.org/10.1214/009053606000000623 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
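The quantile regression machinery behind such a model can be summarised as follows (generic notation, not the paper's exact longitudinal specification):

```latex
% Conditional tau-th quantile of the response, linear in covariates x:
Q_{Y}(\tau \mid x) = x^\top \beta(\tau)
% Estimated by minimizing the check loss over the reference data,
% which requires no distributional assumption on Y:
\hat{\beta}(\tau) = \arg\min_{b} \sum_{i} \rho_\tau\big(y_i - x_i^\top b\big),
\qquad \rho_\tau(u) = u\,\big(\tau - \mathbf{1}\{u < 0\}\big)
```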