221 research outputs found
Optimization Under Uncertainty Using the Generalized Inverse Distribution Function
A framework for robust optimization under uncertainty based on the
generalized inverse distribution function (GIDF), also called the quantile
function, is proposed here. Compared with more classical approaches that rely
on statistical moments as deterministic attributes defining the objectives of
the optimization process, the inverse cumulative distribution function allows
all of the information available in the probabilistic domain to be used.
Furthermore, a quantile-based approach leads naturally to a multi-objective
methodology, which allows an a-posteriori selection of the candidate design
based on risk/opportunity criteria defined by the designer. Finally, the error
in the estimation of the objectives due to the resolution of the GIDF is
proven to be quantifiable. Comment: 20 pages, 25 figures
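The quantile-based comparison described above can be sketched in a few lines. This is a minimal illustration, assuming we hold Monte Carlo samples of an uncertain objective for two hypothetical designs; the design names, distributions, and risk levels are invented for the example and are not the paper's test cases.

```python
import numpy as np

def empirical_gidf(samples, u):
    """Empirical generalized inverse distribution function (quantile function):
    the smallest value q with empirical CDF F_n(q) >= u."""
    x = np.sort(np.asarray(samples))
    n = len(x)
    idx = int(np.ceil(u * n)) - 1          # index of the u-th empirical quantile
    return x[max(idx, 0)]

# Toy comparison of two candidate designs under uncertainty (illustrative only):
rng = np.random.default_rng(0)
design_a = rng.normal(1.0, 0.1, 10_000)    # higher mean objective, low spread
design_b = rng.normal(0.9, 0.5, 10_000)    # better mean, much riskier tail

for u in (0.05, 0.5, 0.95):                # opportunity / median / risk quantiles
    qa, qb = empirical_gidf(design_a, u), empirical_gidf(design_b, u)
    print(f"u={u:.2f}  design A: {qa:.3f}  design B: {qb:.3f}")
```

Minimizing a high quantile favors risk-averse designs, while a low quantile targets opportunity; treating several quantiles as separate objectives gives the multi-objective formulation the abstract mentions.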
New distance measures for classifying X-ray astronomy data into stellar classes
The classification of X-ray sources into classes (such as extragalactic
sources, background stars, etc.) is an essential task in astronomy. Typically,
one of the classes corresponds to extragalactic radiation, whose photon
emission behaviour is well characterized by a homogeneous Poisson process. We
propose to use normalized versions of the Wasserstein and Zolotarev distances
to quantify the deviation of the distribution of photon interarrival times from
the exponential class. Our main motivation is the analysis of a massive dataset
from X-ray astronomy obtained by the Chandra Orion Ultradeep Project (COUP).
This project yielded a large catalog of 1616 X-ray cosmic sources in the Orion
Nebula region, with their series of photon arrival times and associated
energies. We consider the plug-in estimators of these metrics, determine their
asymptotic distributions, and illustrate their finite-sample performance with a
Monte Carlo study. We estimate these metrics for each COUP source from three
different classes. We conclude that our proposal provides a striking amount of
information on the nature of the photon-emitting sources. Moreover, these
variables can identify X-ray sources that were previously wrongly catalogued.
Strikingly, we show that some sources previously classified as extragalactic
emissions have a much higher probability of being young stars in the Orion
Nebula. Comment: 29 pages
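The key quantity can be sketched as follows: a scale-free Wasserstein-1 distance between observed interarrival times and the exponential law with the same mean. This is a simplified stand-in for the paper's normalized metrics (whose exact normalization and asymptotics are more refined), using the quantile representation of W1.

```python
import numpy as np

def normalized_wasserstein_to_exp(interarrivals):
    """Scale-free Wasserstein-1 distance between the empirical distribution of
    interarrival times and the exponential law with the same mean; a simplified
    stand-in for the normalized distances studied in the paper."""
    x = np.sort(np.asarray(interarrivals, dtype=float))
    n = len(x)
    mean = x.mean()
    u = (np.arange(1, n + 1) - 0.5) / n          # mid-point plotting positions
    exp_quantiles = -mean * np.log1p(-u)         # F^{-1}(u) for Exp(rate=1/mean)
    # W1 via the quantile representation, normalized by the mean to be scale-free
    return np.mean(np.abs(x - exp_quantiles)) / mean

rng = np.random.default_rng(1)
poisson_like = rng.exponential(2.0, 5000)        # homogeneous Poisson source
flaring_like = rng.lognormal(0.0, 1.5, 5000)     # heavy-tailed, "flaring" source
d_poisson = normalized_wasserstein_to_exp(poisson_like)
d_flaring = normalized_wasserstein_to_exp(flaring_like)
print(d_poisson, d_flaring)                      # near zero vs. clearly larger
```

A source whose interarrival times follow the exponential class scores near zero; variable or flaring sources score higher, which is the discriminating information exploited for classification.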
Gene Network Reconstruction using Global-Local Shrinkage Priors
Reconstructing a gene network from high-throughput molecular data is an important but challenging task, as the number of parameters to estimate is easily much larger than the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done in the neighborhood of each node or gene. However, estimation of the many regularization parameters is often difficult and can result in large statistical uncertainties. In this paper we propose to combine local regularization with shrinkage of the regularization parameters to borrow strength between genes and improve inference. We employ a simple Bayesian model with nonsparse, conjugate priors to facilitate the use of fast variational approximations to posteriors. We discuss empirical Bayes estimation of hyperparameters of the priors, and propose a novel approach to rank-based posterior thresholding. Using extensive model- and data-based simulations, we demonstrate that the proposed inference strategy outperforms popular (sparse) methods, yields more stable edges, and is more reproducible. The proposed method is then applied to Glioblastoma data to investigate the interactions between genes associated with patient survival. This work was supported by the Center for Medical Systems Biology (CMSB), established by the Netherlands Genomics Initiative/Netherlands Organization for Scientific Research (NGI/NWO), and the European Union Grant EpiRadBio, nr. FP7-269553.
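The "borrowing strength" idea can be illustrated with a toy global-local compromise on ridge penalties. This is not the paper's variational-Bayes procedure; it is a crude sketch in which each gene gets a local penalty that is shrunk toward the global mean penalty before fitting the per-gene neighborhood regressions.

```python
import numpy as np

def ridge_network(X, shrink=0.5):
    """Estimate a gene network by regressing each gene on all others with a
    ridge penalty, shrinking the per-gene penalties toward their global mean.
    A simplified illustration of borrowing strength between genes; the paper
    instead uses conjugate Bayesian priors with fast variational inference."""
    n, p = X.shape
    # crude local penalty guesses: larger for noisier genes (illustrative choice)
    local = np.array([X[:, j].var() for j in range(p)]) * p / n
    lam = (1.0 - shrink) * local + shrink * local.mean()   # global-local compromise
    B = np.zeros((p, p))                                   # coefficient matrix
    for j in range(p):
        others = np.delete(np.arange(p), j)
        Z, y = X[:, others], X[:, j]
        # ridge solution for the neighborhood regression of gene j
        beta = np.linalg.solve(Z.T @ Z + lam[j] * np.eye(p - 1), Z.T @ y)
        B[j, others] = beta
    return B

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 10))      # 50 samples, 10 genes (toy data)
B = ridge_network(X)
print(B.shape)                     # one row of neighborhood coefficients per gene
```

With `shrink=0` each gene keeps its own penalty (many parameters, unstable); with `shrink=1` all genes share one penalty; intermediate values trade the two off, which is the compromise the abstract describes.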
Central Limit Theorem for Adaptive Multilevel Splitting Estimators in an Idealized Setting
The Adaptive Multilevel Splitting algorithm is a powerful and versatile iterative method for estimating the probability of rare events, based on an interacting particle system. In another article, in a so-called idealized setting, the authors prove that the associated estimators are unbiased for each value of the size n of the system of replicas and of the resampling number k. Here we go further and prove the asymptotic normality of these estimators when n goes to infinity, for any fixed value of k. The main ingredient is the asymptotic analysis of a functional equation on an appropriate characteristic function. Numerical simulations illustrate the convergence and support the use of Gaussian confidence intervals.
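The idealized setting can be made concrete for a Gaussian tail probability, where exact conditional resampling is available in closed form. The following is a sketch of the algorithm analysed in the paper (a standard AMS variant, not the authors' code): keep n replicas, repeatedly kill the k lowest and resample them above the current level, and multiply the estimate by (1 - k/n) per splitting step.

```python
import random
from statistics import NormalDist

def ams_tail_prob(a, n=200, k=10, seed=0):
    """Adaptive Multilevel Splitting estimate of p = P(X > a) for X ~ N(0,1),
    in the idealized setting where exact conditional resampling is possible."""
    rng = random.Random(seed)
    nd = NormalDist()

    def cond_sample(level):
        # exact sampling from N(0,1) conditioned on X > level (idealized setting)
        u = rng.random()
        return nd.inv_cdf(nd.cdf(level) + u * (1.0 - nd.cdf(level)))

    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]   # initial i.i.d. replicas
    est = 1.0
    while True:
        xs.sort()
        level = xs[k - 1]                          # k-th smallest replica
        if level >= a:
            break
        est *= 1.0 - k / n                         # survival factor per step
        xs[:k] = [cond_sample(level) for _ in range(k)]  # resample killed replicas
    return est * sum(x > a for x in xs) / n

p_hat = ams_tail_prob(3.0)
p_true = 1.0 - NormalDist().cdf(3.0)               # exact tail, approx 1.35e-3
print(p_hat, p_true)
```

The paper's central limit theorem (n to infinity, k fixed) is what justifies putting Gaussian confidence intervals around estimates such as `p_hat`.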
A frailty model for (interval) censored family survival data, applied to the age at onset of non-physical problems
Family survival data can be used to estimate the degree of genetic and environmental contributions to the age at onset of a disease or of a specific event in life. The data can be modeled with a correlated frailty model in which the frailty variable accounts for the degree of kinship within the family. The heritability (degree of heredity) of the age at a specific event in life (or the onset of a disease) is usually defined as the proportion of variance of the survival age that is associated with genetic effects. If the survival age is (interval) censored, heritability as usually defined cannot be estimated. Instead, it is defined as the proportion of variance of the frailty associated with genetic effects. In this paper we describe a correlated frailty model to estimate the heritability and the degree of environmental effects on the age at which individuals contact a social worker for the first time and to test whether there is a difference between the survival functions of this age for twins and non-twins.
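In symbols, the censoring-compatible heritability described above is the genetic share of the frailty variance. The two-component decomposition below is illustrative notation; the paper's correlated frailty model may partition the frailty variance further.

```latex
% Heritability for (interval) censored family data:
% the share of the frailty variance attributable to genetic effects
h^2 = \frac{\sigma^2_{g}}{\sigma^2_{g} + \sigma^2_{e}}
```

Here $\sigma^2_{g}$ and $\sigma^2_{e}$ denote the genetic and environmental components of the frailty variance, replacing the usual decomposition of the variance of the (unobservable, censored) survival age.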
Estimation of Recurrence of Colorectal Adenomas with Dependent Censoring Using Weighted Logistic Regression
In colorectal polyp prevention trials, estimation of the rate of recurrence of adenomas at the end of the trial may be complicated by dependent censoring, that is, time to follow-up colonoscopy and dropout may be dependent on time to recurrence. Assuming that the auxiliary variables capture the dependence between recurrence and censoring times, we propose to fit two working models with the auxiliary variables as covariates to define risk groups and then extend an existing weighted logistic regression method for independent censoring to each risk group to accommodate potential dependent censoring. In a simulation study, we show that the proposed method results in both a gain in efficiency and reduction in bias for estimating the recurrence rate. We illustrate the methodology by analyzing a recurrent adenoma dataset from a colorectal polyp prevention trial
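A stripped-down numerical illustration of why forming risk groups and weighting corrects dependent censoring. It assumes a single binary auxiliary variable that defines two risk groups and drives both recurrence and follow-up; the paper fits weighted logistic regressions within risk groups, whereas here the intercept-only case reduces to within-group averaging.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
aux = rng.binomial(1, 0.5, n)              # auxiliary variable -> two risk groups
recur = rng.binomial(1, np.where(aux == 1, 0.55, 0.30))   # recurrence by trial end
# follow-up colonoscopy depends on the same auxiliary variable, so censoring is
# dependent overall but ignorable within each risk group
p_obs = np.where(aux == 1, 0.6, 0.9)
observed = rng.binomial(1, p_obs).astype(bool)

naive = recur[observed].mean()             # biased: over-represents group 0
est = 0.0
for g in (0, 1):                           # weight each risk group by its size
    in_g = aux == g
    est += in_g.mean() * recur[in_g & observed].mean()
print(f"true ~0.425  naive {naive:.3f}  weighted {est:.3f}")
```

The naive estimate is pulled toward the low-risk group (whose members are observed more often), while the risk-group weighted estimate recovers the overall recurrence rate, mirroring the bias reduction reported in the simulation study.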
Estimation of conditional laws given an extreme component
Let $(X, Y)$ be a bivariate random vector. The estimation of a probability of
the form $P(Y \le y \mid X > t)$ is challenging when $t$ is large, and a
fruitful approach consists in studying, if it exists, the limiting conditional
distribution of the random vector $(X, Y)$, suitably normalized, given that
$X$ is large. There already exists a wide literature on bivariate models for
which this limiting distribution exists. In this paper, a statistical analysis
of this problem is carried out. Estimators of the limiting distribution (which
is assumed to exist) and of the normalizing functions are provided, as well as
an estimator of the conditional quantile function when the conditioning event
is extreme. Consistency of the estimators is proved, and a functional central
limit theorem for the estimator of the limiting distribution is obtained. The
small-sample behavior of the estimator of the conditional quantile function is
illustrated through simulations. Comment: 32 pages, 5 figures
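The most naive estimator of such a conditional quantile simply restricts the sample to observations whose first component exceeds a high threshold. The sketch below uses invented toy data; the paper's estimators refine this idea via the normalized limiting distribution and come with a proven functional CLT.

```python
import numpy as np

def cond_quantile_extreme(x, y, u, q=0.95):
    """Naive empirical u-quantile of Y given that X exceeds its q-th sample
    quantile: a baseline estimator of the conditional quantile function under
    an extreme conditioning event."""
    t = np.quantile(x, q)          # high threshold on the conditioning component
    return np.quantile(y[x > t], u)

rng = np.random.default_rng(4)
x = rng.exponential(1.0, 100_000)
y = 0.5 * x + rng.normal(0.0, 1.0, 100_000)   # Y grows with the extreme component
q_med = cond_quantile_extreme(x, y, 0.5)      # median of Y given X large
print(q_med)
```

The practical difficulty motivating the paper is visible here: as the conditioning event becomes more extreme (q closer to 1), the subsample shrinks, so an asymptotic theory for properly normalized estimators is needed.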
Limiting distributions for explosive PAR(1) time series with strongly mixing innovation
This work deals with the limiting distribution of the least squares
estimators of the coefficients $a_r$ of an explosive periodic autoregressive
time series of order 1 (PAR(1)), $X_r = a_r X_{r-1} + u_r$, when the
innovation $\{u_r\}$ is strongly mixing. More precisely, $\{a_r\}$ is a
periodic sequence of real numbers with period $P > 0$ such that
$\prod_{r=1}^{P} |a_r| > 1$. The time series $\{u_r\}$ is periodically
distributed with the same period $P$ and satisfies the strong mixing property,
so the random variables $u_r$ can be correlated
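A short simulation shows the phase-wise least squares estimators at work. The i.i.d. Gaussian innovations below are a simplifying stand-in for the strongly mixing sequence of the paper, and the coefficients are chosen so that the product condition for explosiveness holds.

```python
import numpy as np

rng = np.random.default_rng(5)
P, ncycles = 2, 60
a = np.array([1.3, 0.9])              # periodic coefficients; |a_1 * a_2| > 1
n = P * ncycles
u = rng.normal(0.0, 1.0, n)           # i.i.d. stand-in for mixing innovations

x = np.empty(n + 1)
x[0] = 1.0
for t in range(n):
    x[t + 1] = a[t % P] * x[t] + u[t]     # X_t = a_t X_{t-1} + u_t

# phase-wise least squares: for each phase r, regress X_{t+1} on X_t over the
# times t whose coefficient is a_r
a_hat = np.empty(P)
for r in range(P):
    idx = np.arange(r, n, P)
    a_hat[r] = np.sum(x[idx + 1] * x[idx]) / np.sum(x[idx] ** 2)
print(a, a_hat)
```

Because the product of the coefficients exceeds one, the series explodes and the denominators grow geometrically, which is why the least squares estimators converge very fast here; the paper's contribution is the exact limiting distribution under strong mixing.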
Additive and multiplicative hazards modeling for recurrent event data analysis
Background: Sequentially ordered multivariate failure time or recurrent event duration data are commonly observed in biomedical longitudinal studies. In general, standard hazard regression methods cannot be applied because of correlation between recurrent failure times within a subject and induced dependent censoring. Multiplicative and additive hazards models provide the two principal frameworks for studying the association between risk factors and recurrent event durations in the analysis of multivariate failure time data. Methods: Using emergency department visit data, we illustrated and compared the additive and multiplicative hazards models for the analysis of recurrent event durations under (i) a varying baseline with a common coefficient effect and (ii) a varying baseline with an order-specific coefficient effect. Results: The analysis showed that both the additive and multiplicative hazards models, with varying baselines and common coefficient effects, gave similar results with regard to the covariates selected to remain in the model for our real dataset. The confidence intervals of the multiplicative hazards model were wider than those of the additive hazards model for each of the recurrent events. In addition, in both models, the confidence intervals widened as the revisit order increased, because the risk set shrank with the order of visit. Conclusions: Given the frequency of multiple failure time or recurrent event duration data in clinical and epidemiologic studies, the multiplicative and additive hazards models are widely applicable and present different information. Hence, it seems desirable to use them not as alternatives to each other but together, as complementary methods, to provide a more comprehensive understanding of the data.
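The two model classes compared above can be written side by side. The notation is illustrative: $k$ indexes the event order, so a "varying baseline" means $\lambda_{0k}$ depends on $k$, and the common vs. order-specific coefficient settings correspond to $\beta$ vs. $\beta_k$.

```latex
% Multiplicative (Cox-type) hazards model for the k-th recurrent event:
\lambda_k(t \mid Z) = \lambda_{0k}(t)\,\exp\!\bigl(\beta^{\top} Z\bigr)
% Additive (Aalen-type) hazards model for the k-th recurrent event:
\lambda_k(t \mid Z) = \lambda_{0k}(t) + \beta^{\top} Z
```

The multiplicative form measures relative (ratio-scale) covariate effects on the hazard, while the additive form measures absolute risk differences, which is why the abstract argues the two are complementary rather than interchangeable.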
Infinitesimally Robust Estimation in General Smoothly Parametrized Models
We describe the shrinking neighborhood approach of Robust Statistics, which
applies to general smoothly parametrized models, especially exponential
families. Equal generality is achieved by an object-oriented implementation of
the optimally robust estimators. We evaluate the estimators on real datasets
from the literature by means of our R packages ROptEst and RobLox
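The flavor of such infinitesimally robust estimators can be conveyed by a location M-estimator with a bounded (Huber-type) influence function, computed by iteratively reweighted averaging. This is a minimal Python analogue of the kind of estimator the R packages provide, not their actual optimally robust procedure; the clipping constant 1.345 is the usual 95%-efficiency choice at the normal model.

```python
import numpy as np

def huber_location(x, c=1.345, tol=1e-8, max_iter=100):
    """Location M-estimator with a bounded (Huber) influence function,
    computed by iteratively reweighted averaging."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)                        # robust starting point
    s = np.median(np.abs(x - mu)) / 0.6745   # MAD-based scale estimate
    for _ in range(max_iter):
        r = (x - mu) / s
        # Huber weights: 1 inside [-c, c], downweighted linearly outside
        w = np.clip(c / np.maximum(np.abs(r), 1e-12), None, 1.0)
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(6)
clean = rng.normal(10.0, 1.0, 1000)
contaminated = np.concatenate([clean, np.full(50, 100.0)])  # 5% gross outliers
m_mean = np.mean(contaminated)
m_huber = huber_location(contaminated)
print(m_mean, m_huber)        # the mean is dragged away; the M-estimate is not
```

Bounding the influence function caps the effect any single observation can have on the estimate, which is the defining property of the infinitesimally robust estimators evaluated in the paper.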