Controlled stratification for quantile estimation
In this paper we propose and discuss variance reduction techniques for the
estimation of quantiles of the output of a complex model with random input
parameters. These techniques are based on the use of a reduced model, such as a
metamodel or a response surface. The reduced model can be used as a control
variate; or a rejection method can be implemented to sample the realizations of
the input parameters in prescribed relevant strata; or the reduced model can be
used to determine a good biased distribution of the input parameters for the
implementation of an importance sampling strategy. The different strategies are
analyzed and the asymptotic variances are computed, which shows the benefit of
an adaptive controlled stratification method. This method is finally applied to
a real example (computation of the peak cladding temperature during a
large-break loss of coolant accident in a nuclear reactor). Comment: Published at http://dx.doi.org/10.1214/08-AOAS186 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
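As a rough illustration of the controlled-stratification idea described above, the sketch below uses a cheap reduced model to define strata, rejection-samples inputs into each stratum, oversamples the critical (tail) strata, and inverts the weighted empirical CDF to estimate a quantile. The models, stratum edges and allocations are hypothetical stand-ins, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: an "expensive" complex model and a cheap reduced model
# (metamodel) that is strongly correlated with it.
def model(x):
    return x + 0.1 * np.sin(5.0 * x)

def metamodel(x):
    return x

alpha = 0.95                                   # quantile level to estimate

# Strata defined from cheap metamodel evaluations on a large pilot sample.
edges = np.quantile(metamodel(rng.normal(size=100_000)),
                    [0.0, 0.90, 0.99, 1.0])
edges[0], edges[-1] = -np.inf, np.inf
probs = np.array([0.90, 0.09, 0.01])           # known stratum probabilities
alloc = np.array([200, 400, 400])              # oversample the critical strata

ys, ws = [], []
for (lo, hi), p, n in zip(zip(edges[:-1], edges[1:]), probs, alloc):
    xs = np.empty(0)
    while xs.size < n:                         # rejection sampling into stratum
        cand = rng.normal(size=8 * n)
        m = metamodel(cand)
        xs = np.concatenate([xs, cand[(m >= lo) & (m < hi)]])
    xs = xs[:n]
    ys.append(model(xs))                       # expensive-model evaluations
    ws.append(np.full(n, p / n))               # per-sample stratum weight

ys, ws = np.concatenate(ys), np.concatenate(ws)
order = np.argsort(ys)
cdf = np.cumsum(ws[order])                     # weighted empirical CDF
q_hat = ys[order][np.searchsorted(cdf, alpha)] # invert CDF at level alpha
```

The adaptive variant analyzed in the paper would additionally refine the per-stratum allocation from a pilot stage rather than fixing it in advance, as done here.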
Standard Regression Versus Multilevel Modeling of Multistage Complex Survey Data
Complex surveys based on multistage designs are commonly used to collect large population data sets. Stratification, clustering and unequal probabilities of selecting individuals are the complexities of such designs. Statistical techniques such as the multilevel modeling – scaled weights technique and the standard regression – robust variance estimation technique are used to analyze complex survey data. Both techniques take the complexities of the survey design into account, but they do so in different ways.
This thesis compares the performance of the multilevel modeling – scaled weights and the standard regression – robust variance estimation techniques, based on analyses of cross-sectional and longitudinal complex survey data. The performance of the two techniques was also examined by a Monte Carlo simulation based on a cross-sectional complex survey design.
A stratified, multistage probability sample design was used to select samples for the cross-sectional Canadian Heart Health Surveys (CHHS) conducted in ten Canadian provinces and for the longitudinal National Population Health Survey (NPHS).
Both statistical techniques (the multilevel modeling – scaled weights and the standard regression – robust variance estimation technique) were utilized to analyze CHHS and NPHS data sets. The outcome of interest was based on the question “Do you have any of the following long-term conditions that have been diagnosed by a health professional? – Diabetes”.
For the cross-sectional CHHS, the results obtained from the two statistical techniques were not consistent. However, the results based on analysis of the longitudinal NPHS data indicated that the standard regression – robust variance estimation technique might perform better than the multilevel modeling – scaled weights technique for analyzing longitudinal complex survey data. Finally, in order to arrive at a more definitive conclusion, a Monte Carlo simulation was used to compare the performance of the two techniques. In the simulation study, data were generated randomly based on the Canadian Heart Health Survey data for the province of Saskatchewan. A total of 100 and 1,000 simulated data sets were generated, and the sample size for each simulated data set was 1,731. The results of this simulation study indicated that the performance of the multilevel modeling – scaled weights technique and the standard regression – robust variance estimation technique was comparable for analyzing cross-sectional complex survey data.
To conclude, both statistical techniques yield similar results when used to analyze cross-sectional complex survey data; however, the standard regression – robust variance estimation technique might be preferred because it fully accounts for stratification, clustering and unequal probability of selection.
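The robust (Taylor-linearization) variance idea behind the second technique can be sketched for a design-weighted prevalence estimate. The strata, clusters, weights and outcome below are entirely simulated stand-ins, not CHHS or NPHS data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stratified two-stage sample: strata -> clusters (PSUs) -> persons.
n = 1_731
strata   = rng.integers(0, 4, size=n)            # stratum id per respondent
clusters = strata * 10 + rng.integers(0, 10, n)  # PSU id, nested in stratum
weights  = rng.uniform(0.5, 3.0, size=n)         # unequal selection weights
y        = (rng.random(n) < 0.06).astype(float)  # e.g. a diabetes indicator

# Design-weighted prevalence estimate (a Horvitz-Thompson-type ratio).
p_hat = np.sum(weights * y) / np.sum(weights)

# Taylor-linearized ("robust") variance: between-PSU variation within strata.
z = weights * (y - p_hat) / np.sum(weights)      # linearized score per person
var = 0.0
for h in np.unique(strata):
    psus = np.unique(clusters[strata == h])
    n_h = psus.size
    totals = np.array([z[clusters == c].sum() for c in psus])
    var += n_h / (n_h - 1) * np.sum((totals - totals.mean()) ** 2)

se = np.sqrt(var)                                # design-based standard error
```

Clustering enters through the PSU totals and stratification through the per-stratum sums, which is how the variance estimator reflects the design complexities the abstract mentions.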
Number of Repetitions in Re-randomization Tests
In covariate-adaptive or response-adaptive randomization, the treatment
assignment and outcome can be correlated. Under this situation,
re-randomization tests are a straightforward and attractive method to provide
valid statistical inference. In this paper, we investigate the number of
repetitions in the re-randomization tests. This is motivated by the group
sequential design in clinical trials, where the nominal significance bound can
be very small at an interim analysis. Accordingly, re-randomization tests lead
to a very large number of required repetitions, which may be computationally
intractable. To reduce the number of repetitions, we propose an adaptive
procedure and compare it with multiple approaches under pre-defined criteria.
Monte Carlo simulations are conducted to show the performance of different
approaches in a limited sample size. We also suggest strategies to reduce total
computation time and provide practical guidance in preparing, executing and
reporting before and after data are unblinded at an interim analysis, so one
can complete the computation within a reasonable time frame.
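A minimal sketch of a re-randomization test and of why small interim significance bounds force many repetitions is below. For simplicity it re-draws assignments by complete randomization; in the paper's setting, assignments would be re-drawn from the same covariate- or response-adaptive procedure. All numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy two-arm trial with a (hypothetical) treatment effect of 0.3.
n = 100
treat = rng.permutation(np.repeat([0, 1], n // 2))
y = 0.3 * treat + rng.normal(size=n)

obs = y[treat == 1].mean() - y[treat == 0].mean()   # observed test statistic

B = 10_000                                          # number of repetitions
null = np.empty(B)
for b in range(B):
    t = rng.permutation(treat)                      # re-randomize assignment
    null[b] = y[t == 1].mean() - y[t == 0].mean()

# Monte Carlo p-value with the +1 correction (guarantees validity).
p = (1 + np.sum(np.abs(null) >= abs(obs))) / (B + 1)
```

Since the smallest attainable p-value is 1/(B + 1), certifying p < alpha at an interim bound as small as alpha = 0.001 requires B to exceed 1/alpha - 1 = 999 even in the best case, which is what makes reducing the number of repetitions worthwhile.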
Estimating regional income indicators under transformations and access to limited population auxiliary information
Spatially disaggregated income indicators are typically estimated by using model-based methods that assume access to auxiliary information from population micro-data. In many countries, like Germany and the UK, population micro-data are not publicly available. In this work we propose small area methodology for the case when only aggregate population-level auxiliary information is available. We use data-driven transformations of the response to satisfy the parametric assumptions of the models used. In the absence of population micro-data, appropriate bias corrections for small area prediction are needed. Under the approach we propose in this paper, aggregate statistics (means and covariances) and kernel density estimation are used to resolve the issue of not having access to population micro-data. We further explore the estimation of the mean squared error using the parametric bootstrap. Extensive model-based and design-based simulations are used to compare the proposed method to alternative methods. Finally, the proposed methodology is applied to the 2011 Socio-Economic Panel and aggregate census information from the same year to estimate the average income for 96 regional planning regions in Germany.
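One ingredient mentioned above, the need for a bias correction when predicting on the original scale after transforming the response, can be illustrated with the simplest case, a log transformation. The data below are simulated, and this is only the textbook lognormal correction, not the paper's small area estimator:

```python
import numpy as np

rng = np.random.default_rng(3)

# If log(Y) ~ N(mu, s2), then E[Y] = exp(mu + s2 / 2), not exp(mu):
# naively back-transforming the log-scale mean underestimates the income mean.
incomes = rng.lognormal(mean=10.0, sigma=0.5, size=5_000)

mu_hat = np.log(incomes).mean()
s2_hat = np.log(incomes).var(ddof=1)

naive     = np.exp(mu_hat)                # biased low
corrected = np.exp(mu_hat + s2_hat / 2)   # bias-corrected mean predictor
```

For data-driven transformations without a closed-form back-transform, the paper resorts to aggregate statistics and kernel density estimation instead of an analytic correction like this one.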
Six Statistical Senses
This article proposes a set of categories, each one representing a particular
distillation of important statistical ideas. Each category is labeled a "sense"
because we think of these as essential in helping every statistical mind
connect in constructive and insightful ways with statistical theory,
methodologies, and computation, toward the ultimate goal of building
statistical phronesis. The illustration of each sense with statistical
principles and methods provides a sensical tour of the conceptual landscape of
statistics, as a leading discipline in the data science ecosystem.
Covariate Adjustment in Bayesian Adaptive Clinical Trials
In conventional randomized controlled trials, adjustment for baseline values
of covariates known to be at least moderately associated with the outcome
increases the power of the trial. Recent work has shown particular benefit for
more flexible frequentist designs, such as information adaptive and adaptive
multi-arm designs. However, covariate adjustment has not been characterized
within the more flexible Bayesian adaptive designs, despite their growing
popularity. We focus on a subclass of these which allow for early stopping at
an interim analysis given evidence of treatment superiority. We consider both
collapsible and non-collapsible estimands, and show how to obtain posterior
samples of marginal estimands from adjusted analyses. We describe several
estimands for three common outcome types. We perform a simulation study to
assess the impact of covariate adjustment using a variety of adjustment models
in several different scenarios. This is followed by a real world application of
the compared approaches to a COVID-19 trial with a binary endpoint. For all
scenarios, it is shown that covariate adjustment increases power and the
probability of stopping the trials early, and decreases the expected sample
sizes as compared to unadjusted analyses. Comment: 17 pages, 5 tables, 4 figures
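The mechanism by which adjustment buys power can be sketched in a frequentist toy example (not the paper's Bayesian machinery): conditioning on a prognostic baseline covariate shrinks the standard error of the treatment effect, which is what drives earlier stopping in adaptive designs. The effect sizes and covariate strength below are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy trial: continuous outcome, strong baseline covariate, effect 0.4.
n = 400
x = rng.normal(size=n)                         # prognostic baseline covariate
t = rng.permutation(np.repeat([0, 1], n // 2)) # 1:1 randomized assignment
y = 0.4 * t + 1.5 * x + rng.normal(size=n)

# Unadjusted: standard error of the difference in arm means.
unadj_se = np.sqrt(y[t == 1].var(ddof=1) / (n // 2) +
                   y[t == 0].var(ddof=1) / (n // 2))

# Adjusted: OLS of y on [1, t, x]; standard error of the t coefficient.
X = np.column_stack([np.ones(n), t, x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - 3)                   # residual variance
cov = s2 * np.linalg.inv(X.T @ X)
adj_se = np.sqrt(cov[1, 1])                    # smaller than unadj_se
```

With a linear model and a collapsible estimand the adjusted and unadjusted analyses target the same quantity; the non-collapsible case, where marginal and conditional estimands differ, is exactly what the posterior-standardization step in the paper addresses.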