Asymptotic Validity of the Bayes-Inspired Indifference Zone Procedure: The Non-Normal Known Variance Case
We consider the indifference-zone (IZ) formulation of the ranking and
selection problem in which the goal is to choose an alternative with the
largest mean with guaranteed probability, as long as the difference between
this mean and the second largest exceeds a threshold. Conservatism leads
classical IZ procedures to take too many samples in problems with many
alternatives. The Bayes-inspired Indifference Zone (BIZ) procedure, proposed in
Frazier (2014), is less conservative than previous procedures, but its proof of
validity requires strong assumptions, specifically that samples are normal, and
variances are known with an integer multiple structure. In this paper, we show
asymptotic validity of a slight modification of the original BIZ procedure as
the difference between the best alternative and the second best goes to
zero, when the variances are known and finite, and samples are independent and
identically distributed, but not necessarily normal.
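The indifference-zone guarantee above can be illustrated with a small Monte Carlo sketch. This is not the BIZ procedure itself: it uses a simple fixed-sample rule ("pick the largest sample mean") under the slippage configuration, and all names and parameter values (`k`, `n`, `delta`, `sigma`) are illustrative assumptions.

```python
import random
import statistics

def prob_correct_selection(k=10, n=50, delta=0.5, sigma=1.0, reps=2000, seed=0):
    """Monte Carlo estimate of P(correct selection) for the fixed-sample
    rule "pick the alternative with the largest sample mean", under the
    slippage configuration: one mean at delta, the remaining k-1 at 0."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(reps):
        # Alternative 0 is the true best, ahead of the others by delta.
        means = [statistics.fmean(rng.gauss(delta if i == 0 else 0.0, sigma)
                                  for _ in range(n))
                 for i in range(k)]
        if max(range(k), key=means.__getitem__) == 0:
            correct += 1
    return correct / reps
```

As the gap `delta` shrinks relative to the standard error `sigma / sqrt(n)`, the estimated probability of correct selection falls toward `1/k`, which is why IZ procedures only promise the guarantee when the gap exceeds the threshold.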
A robust procedure for comparing multiple means under heteroscedasticity in unbalanced designs.
Investigating differences between the means of more than two groups or experimental conditions is a routine research question in biology. To assess such differences statistically, multiple comparison procedures are applied. The most prominent procedures of this type, the Dunnett and Tukey-Kramer tests, control the probability of reporting at least one false positive result when the data are normally distributed and when the sample sizes and variances do not differ between groups. All three assumptions are unrealistic in biological research, and any violation leads to an increased number of reported false positive results. Based on a general statistical framework for simultaneous inference and robust covariance estimators, we propose a new multiple comparison procedure for assessing multiple means. In contrast to the Dunnett or Tukey-Kramer tests, no assumptions regarding the distribution, sample sizes or variance homogeneity are necessary. The performance of the new procedure is assessed by means of its familywise error rate and power under different distributions. The practical merits are demonstrated by a reanalysis of fatty acid phenotypes of the bacterium Bacillus simplex from the "Evolution Canyons" I and II in Israel. The simulation results show that even under severely varying variances, the procedure controls the number of false positive findings very well. Thus, the procedure presented here works well under biologically realistic scenarios of unbalanced group sizes, non-normality and heteroscedasticity.
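A minimal sketch of heteroscedasticity-robust pairwise comparisons, assuming Welch-type standard errors, a large-sample normal approximation, and a Bonferroni correction. This is a rough stand-in, not the authors' max-t procedure with robust covariance estimators; the function name and defaults are illustrative.

```python
import math
from itertools import combinations
from statistics import NormalDist, fmean, variance

def welch_pairwise(groups):
    """All pairwise Welch-type t statistics with a large-sample normal
    approximation and Bonferroni-adjusted p-values.  Unequal variances
    and unequal group sizes are allowed, as in the abstract's setting."""
    m = len(groups) * (len(groups) - 1) // 2   # number of comparisons
    results = []
    for i, j in combinations(range(len(groups)), 2):
        xi, xj = groups[i], groups[j]
        # Welch standard error: per-group variances, no pooling.
        se = math.sqrt(variance(xi) / len(xi) + variance(xj) / len(xj))
        t = (fmean(xi) - fmean(xj)) / se
        p = 2 * (1 - NormalDist().cdf(abs(t)))        # z approximation
        results.append((i, j, t, min(1.0, p * m)))    # Bonferroni adjustment
    return results
```

Unlike a pooled-variance Tukey-Kramer analysis, each comparison here uses its own pair of group variances, so severe variance heterogeneity does not inflate the per-pair statistic.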
Covariate-assisted ranking and screening for large-scale two-sample inference
Two-sample multiple testing has a wide range of applications. The conventional practice first reduces the original observations to a vector of p-values and then chooses a cutoff to adjust for multiplicity. However, this data reduction step could cause significant loss of information and thus lead to suboptimal testing procedures. We introduce a new framework for two-sample multiple testing by incorporating a carefully constructed auxiliary variable in inference to improve the power. A data-driven multiple-testing procedure is developed by employing a covariate-assisted ranking and screening (CARS) approach that optimally combines the information from both the primary and the auxiliary variables. The proposed CARS procedure is shown to be asymptotically valid and optimal for false discovery rate control. The procedure is implemented in the R package CARS. Numerical results confirm the effectiveness of CARS in false discovery rate control and show that it achieves substantial power gain over existing methods. CARS is also illustrated through an application to the analysis of a satellite imaging data set for supernova detection.
Optimal Tests of Treatment Effects for the Overall Population and Two Subpopulations in Randomized Trials, using Sparse Linear Programming
We propose new, optimal methods for analyzing randomized trials, when it is
suspected that treatment effects may differ in two predefined subpopulations.
Such sub-populations could be defined by a biomarker or risk factor measured at
baseline. The goal is to simultaneously learn which subpopulations benefit from
an experimental treatment, while providing strong control of the familywise
Type I error rate. We formalize this as a multiple testing problem and show it
is computationally infeasible to solve using existing techniques. Our solution
involves a novel approach, in which we first transform the original multiple
testing problem into a large, sparse linear program. We then solve this problem
using advanced optimization techniques. This general method can solve a variety
of multiple testing problems and decision theory problems related to optimal
trial design, for which no solution was previously available. In particular, we
construct new multiple testing procedures that satisfy minimax and Bayes
optimality criteria. For a given optimality criterion, our new approach yields
the optimal tradeoff between power to detect an effect in the overall
population versus power to detect effects in subpopulations. We demonstrate our
approach in examples motivated by two randomized trials of new treatments for
HIV
Selection Procedures for Order Statistics in Empirical Economic Studies
In a presentation to the American Economics Association, McCloskey (1998) argued that "statistical significance is bankrupt" and that economists' time would be "better spent on finding out How Big Is Big". This brief survey is devoted to methods of determining "How Big Is Big". It is concerned with a rich body of literature called selection procedures, which are statistical methods that allow inference on order statistics and which enable empiricists to attach confidence levels to statements about the relative magnitudes of population parameters (i.e. How Big Is Big). Despite their prolonged existence and common use in other fields, selection procedures have gone relatively unnoticed in the field of economics, and, perhaps, their use is long overdue. The purpose of this paper is to provide a brief survey of selection procedures as an introduction to economists and econometricians and to illustrate their use in economics by discussing a few potential applications. Both simulated and empirical examples are provided.
Keywords: Ranking and selection, multiple comparisons, hypothesis testing
Decision theory results for one-sided multiple comparison procedures
A resurgence of interest in multiple hypothesis testing has occurred in the
last decade. Motivated by studies in genomics, microarrays, DNA sequencing,
drug screening, clinical trials, bioassays, education and psychology,
statisticians have been devoting considerable research energy in an effort to
properly analyze multiple endpoint data. In response to new applications, new
criteria and new methodology, many ad hoc procedures have emerged. The
classical requirement has been to use procedures which control the strong
familywise error rate (FWE) at some predetermined level \alpha. That is, the
probability of any false rejection of a true null hypothesis should be less
than or equal to \alpha. Finding desirable and powerful multiple test
procedures is difficult under this requirement. One of the more recent ideas is
concerned with controlling the false discovery rate (FDR), that is, the
expected proportion of rejected hypotheses which are, in fact, true. Many
multiple test procedures do control the FDR. A much earlier approach to
multiple testing was formulated by Lehmann [Ann. Math. Statist. 23 (1952)
541-552 and 28 (1957) 1-25]. Lehmann's approach is decision theoretic and he
treats the multiple endpoints problem as a 2^k finite action problem when there
are k endpoints. This approach is appealing since unlike the FWE and FDR
criteria, the finite action approach pays attention to false acceptances as
well as false rejections.
Comment: Published at http://dx.doi.org/10.1214/009053604000000968 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
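The FWE and FDR criteria contrasted in the abstract above can be made concrete with two standard rules: Bonferroni (FWE control) and the Benjamini-Hochberg step-up (FDR control). This is a generic textbook sketch, not Lehmann's decision-theoretic procedure; function names are illustrative.

```python
def bonferroni_rejections(pvals, alpha=0.05):
    """FWE control: reject H_i iff p_i <= alpha / m."""
    m = len(pvals)
    return [i for i, p in enumerate(pvals) if p <= alpha / m]

def bh_rejections(pvals, alpha=0.05):
    """FDR control (Benjamini-Hochberg step-up): sort the p-values,
    find the largest rank r with p_(r) <= r * alpha / m, and reject
    the hypotheses with the r smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=pvals.__getitem__)
    r = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * alpha / m:
            r = rank
    return sorted(order[:r])
```

On the same p-value vector BH typically rejects at least as many hypotheses as Bonferroni, which is the sense in which FDR control is the less stringent criterion; neither rule accounts for false acceptances, the gap the finite-action approach addresses.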
Shrinkage Estimation in Multilevel Normal Models
This review traces the evolution of theory that started when Charles Stein in
1955 [In Proc. 3rd Berkeley Sympos. Math. Statist. Probab. I (1956) 197--206,
Univ. California Press] showed that using each separate sample mean from
k \ge 3 Normal populations to estimate its own population mean \mu_i can be
improved upon uniformly for every possible \mu. The
dominating estimators, referred to here as being "Model-I minimax," can be
found by shrinking the sample means toward any constant vector. Admissible
minimax shrinkage estimators were derived by Stein and others as posterior
means based on a random effects model, "Model-II" here, wherein the \mu_i
values have their own distributions. Section 2 centers on Figure 2, which
organizes a wide class of priors on the unknown Level-II hyperparameters that
have been proved to yield admissible Model-I minimax shrinkage estimators in
the "equal variance case." Putting a flat prior on the Level-II variance is
unique in this class for its scale-invariance and for its conjugacy, and it
induces Stein's harmonic prior (SHP) on \mu.
Comment: Published at http://dx.doi.org/10.1214/11-STS363 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
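A minimal sketch of the shrinkage idea in the abstract above: the positive-part James-Stein estimator shrinks the sample means toward a constant vector and, for k \ge 3 with equal known variances, dominates the raw means under total squared error. This is the classical equal-variance estimator, not the Model-II posterior-mean machinery of the review; the simulation setup (all true means 0, `sigma2 = 1`) is an illustrative assumption chosen to make the gain visible.

```python
import random

def james_stein(x, sigma2=1.0, c=0.0):
    """Positive-part James-Stein: shrink each sample mean x_i toward
    the constant c, with known common variance sigma2 and the
    classical (k - 2) shrinkage factor."""
    k = len(x)
    s = sum((xi - c) ** 2 for xi in x)
    shrink = max(0.0, 1.0 - (k - 2) * sigma2 / s)   # positive-part variant
    return [c + shrink * (xi - c) for xi in x]

# Small simulation: total squared error of the raw means versus the
# shrinkage estimator when all true means are 0 (the maximal-gain case).
rng = random.Random(42)
k, reps = 20, 200
mle_err = js_err = 0.0
for _ in range(reps):
    x = [rng.gauss(0.0, 1.0) for _ in range(k)]
    est = james_stein(x)
    mle_err += sum(xi ** 2 for xi in x)      # raw means vs. truth 0
    js_err += sum(e ** 2 for e in est)       # shrunken means vs. truth 0
```

The choice of `c` is free: shrinking toward any constant vector yields a Model-I minimax estimator in the sense of the review, though the gain is largest when the true means cluster near `c`.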