Prediction of remaining life of power transformers based on left truncated and right censored lifetime data
Prediction of the remaining life of high-voltage power transformers is an
important issue for energy companies because of the need for planning
maintenance and capital expenditures. Lifetime data for such transformers are
complicated because transformer lifetimes can extend over many decades and
transformer designs and manufacturing practices have evolved. We were asked to
develop statistically-based predictions for the lifetimes of an energy
company's fleet of high-voltage transmission and distribution transformers. The
company's data records begin in 1980, providing information on installation and
failure dates of transformers. Although the dataset contains many units that
were installed before 1980, there is no information about units that were
installed and failed before 1980. Thus, the data are left truncated and right
censored. We use a parametric lifetime model to describe the lifetime
distribution of individual transformers. We develop a statistical procedure,
based on age-adjusted life distributions, for computing a prediction interval
for remaining life for individual transformers now in service. We then extend
these ideas to provide predictions and prediction intervals for the cumulative
number of failures, over a range of time, for the overall fleet of
transformers.
Comment: Published at http://dx.doi.org/10.1214/00-AOAS231 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
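As a rough illustration of how left truncation and right censoring enter a parametric likelihood, the sketch below conditions each unit on having survived to the start of record keeping and also gives the age-adjusted failure probability for a unit still in service. The Weibull family, the function names, and the record layout are assumptions made for this example; the abstract only says that "a parametric lifetime model" is used.

```python
import math

def weibull_log_sf(t, shape, scale):
    """Log survival function of a Weibull: log S(t) = -(t / scale) ** shape."""
    return -((t / scale) ** shape)

def weibull_log_pdf(t, shape, scale):
    """Log density of a Weibull distribution at age t > 0."""
    z = t / scale
    return math.log(shape / scale) + (shape - 1.0) * math.log(z) - z ** shape

def ltrc_log_likelihood(records, shape, scale):
    """Log-likelihood for left-truncated, right-censored lifetimes.

    records: iterable of (age, failed, trunc_age) tuples (hypothetical layout):
      age       -- age at failure, or current age if still in service
      failed    -- True for an observed failure, False for a censored unit
      trunc_age -- age the unit had already reached when records began
                   (0 for units installed after record keeping started)
    """
    total = 0.0
    for age, failed, trunc_age in records:
        # Failures contribute the density, censored units the survival probability.
        if failed:
            contrib = weibull_log_pdf(age, shape, scale)
        else:
            contrib = weibull_log_sf(age, shape, scale)
        # Left truncation: condition on survival up to the truncation age.
        if trunc_age > 0:
            contrib -= weibull_log_sf(trunc_age, shape, scale)
        total += contrib
    return total

def conditional_failure_prob(age_now, horizon, shape, scale):
    """Age-adjusted probability P(T <= age_now + horizon | T > age_now)."""
    log_ratio = (weibull_log_sf(age_now + horizon, shape, scale)
                 - weibull_log_sf(age_now, shape, scale))
    return 1.0 - math.exp(log_ratio)
```

Maximizing `ltrc_log_likelihood` over `shape` and `scale` with any numerical optimizer would give point estimates. A prediction interval like the one described in the abstract would additionally have to account for the uncertainty in those estimates, which this sketch does not do.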
Density Regression Based on Proportional Hazards Family
This paper develops a class of density regression models based on the proportional hazards family, namely the Gamma transformation proportional hazard (Gt-PH) model. Exact inference for the regression parameters and hazard ratio is derived. The resulting estimators enjoy good properties such as unbiasedness, which other inference methods such as maximum likelihood estimation (MLE) may not share. Generalised confidence intervals and hypothesis tests for the regression parameters are also provided. The method is easy to implement in practice. The regression method is also extended to Lasso-based variable selection.
National Natural Science Foundation of China (Grant No. 71490725, 71071087 and 11261048
FamEvent: An R Package for Generating and Modeling Time-to-Event Data in Family Designs
FamEvent is a comprehensive R package for simulating and modeling age-at-disease onset in families carrying a rare gene mutation. The package can simulate complex family data for variable time-to-event outcomes under three common family study designs (population, high-risk clinic and multi-stage) with various levels of missing genetic information among family members. Residual familial correlation can be induced through the inclusion of a frailty term or a second gene. Disease-gene carrier probabilities are evaluated assuming Mendelian transmission or empirically from the data. When genetic information on the disease gene is missing, an expectation-maximization algorithm is employed to calculate the carrier probabilities. Penetrance model functions with ascertainment correction adapted to the sampling design provide age-specific cumulative disease risks by sex, mutation status, and other covariates for simulated data as well as real data analysis. Robust standard errors and 95% confidence intervals are available for these estimates. Plots of pedigrees and penetrance functions based on the fitted model provide graphical displays to evaluate and summarize the models.
Robust Estimation of High-Dimensional Mean Regression
Data subject to heavy-tailed errors are commonly encountered in various
scientific fields, especially in the modern era of massive data.
To address this problem, procedures based on quantile regression and Least
Absolute Deviation (LAD) regression have been developed in recent years.
These methods essentially estimate the conditional median (or quantile)
function, which can differ substantially from the conditional mean function when
distributions are asymmetric and heteroscedastic. How can we efficiently
estimate the mean regression function in the ultra-high dimensional setting
when only the second moment exists? To solve this problem, we propose a
penalized Huber loss with diverging parameter to reduce biases created by the
traditional Huber loss. Such a penalized robust approximate quadratic
(RA-quadratic) loss will be called RA-Lasso. In the ultra-high dimensional
setting, where the dimensionality can grow exponentially with the sample size,
our results reveal that the RA-Lasso estimator is consistent at the same rate
as the optimal rate under the light-tail situation. We further
study the computational convergence of RA-Lasso and show that the composite
gradient descent algorithm indeed produces a solution that admits the same
optimal rate after sufficient iterations. As a byproduct, we also establish a
concentration inequality for estimating the population mean when only the
second moment exists. We compare RA-Lasso with other regularized robust
estimators based on quantile regression and LAD regression. Extensive
simulation studies demonstrate the satisfactory finite-sample performance of
RA-Lasso.
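The combination of a Huber-type loss, an L1 penalty, and composite gradient descent can be sketched in a few lines. What follows is a minimal illustration and not the authors' implementation: it assumes the standard Huber loss with a fixed threshold `delta` (the abstract lets this parameter diverge and does not give its exact parametrization), a conservative fixed step size, and hypothetical function names.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def huber_lasso(X, y, lam, delta, n_iter=500):
    """Minimize (1/n) * sum huber_delta(y_i - x_i . beta) + lam * ||beta||_1.

    X is a list of rows, y a list of responses. Each iteration takes a
    gradient step on the smooth Huber term and then applies the L1
    proximal step (soft-thresholding).
    """
    n, p = len(X), len(X[0])
    # Step size 1/L, where L is bounded above by trace(X'X)/n.
    # Assumption: this conservative bound replaces the largest eigenvalue.
    step = n / sum(x * x for row in X for x in row)
    beta = [0.0] * p
    for _ in range(n_iter):
        # Residuals clipped to [-delta, delta] are the Huber loss derivatives,
        # which is what limits the influence of heavy-tailed errors.
        psi = [max(-delta, min(delta, yi - sum(b * x for b, x in zip(beta, row))))
               for row, yi in zip(X, y)]
        # Gradient of the averaged Huber loss with respect to beta.
        grad = [-sum(psi[i] * X[i][j] for i in range(n)) / n for j in range(p)]
        # Proximal step for the L1 penalty.
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta
```

A larger `delta` makes the loss closer to least squares, which reduces the bias of the Huber fit for the conditional mean at the cost of less robustness. That trade-off is the role the abstract assigns to the diverging parameter.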
Monte Carlo modified profile likelihood in models for clustered data
The main focus of the analysts who deal with clustered data is usually not on
the clustering variables, and hence the group-specific parameters are treated
as nuisance. If a fixed effects formulation is preferred and the total number
of clusters is large relative to the single-group sizes, classical frequentist
techniques relying on the profile likelihood are often misleading. The use of
alternative tools, such as modifications to the profile likelihood or
integrated likelihoods, for making accurate inference on a parameter of
interest can be complicated by the presence of nonstandard modelling and/or
sampling assumptions. We show here how to employ Monte Carlo simulation in
order to approximate the modified profile likelihood in some of these
unconventional frameworks. The proposed solution is widely applicable and is
shown to retain the usual properties of the modified profile likelihood. The
approach is examined in two instances particularly relevant in applications,
i.e. missing-data models and survival models with unspecified censoring
distribution. The effectiveness of the proposed solution is validated via
simulation studies and two clinical trial applications.