Garbled Elections
Majority rules are frequently used to decide whether or not a public good should be provided, but will typically fail to achieve an efficient provision. We provide a worst-case analysis of the majority rule with an optimally chosen majority threshold, assuming that voters have independent private valuations and are ex ante symmetric (provision cost shares are included in the valuations). We show that if the population is large it can happen that the optimal majority rule is essentially no better than a random provision of the public good. But the optimal majority rule is worst-case asymptotically efficient in the large-population limit if (i) the voters’ expected valuation is bounded away from 0, and (ii) an absolute bound for valuations is known.
Asymptotic Bayes-optimality under sparsity of some multiple testing procedures
Within a Bayesian decision theoretic framework we investigate some asymptotic
optimality properties of a large class of multiple testing rules. A parametric
setup is considered, in which observations come from a normal scale mixture
model and the total loss is assumed to be the sum of losses for individual
tests. Our model can be used for testing point null hypotheses, as well as to
distinguish large signals from a multitude of very small effects. A rule is
defined to be asymptotically Bayes optimal under sparsity (ABOS), if within our
chosen asymptotic framework the ratio of its Bayes risk and that of the Bayes
oracle (a rule which minimizes the Bayes risk) converges to one. Our main
interest is in the asymptotic scheme where the proportion p of "true"
alternatives converges to zero. We fully characterize the class of fixed
threshold multiple testing rules which are ABOS, and hence derive conditions
for the asymptotic optimality of rules controlling the Bayesian False Discovery
Rate (BFDR). We finally provide conditions under which the popular
Benjamini-Hochberg (BH) and Bonferroni procedures are ABOS and show that for a
wide class of sparsity levels, the threshold of the former can be approximated
by a nonrandom threshold.
Comment: Published at http://dx.doi.org/10.1214/10-AOS869 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
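For readers unfamiliar with the two procedures named in this abstract, the following is a minimal sketch of the textbook Bonferroni and Benjamini-Hochberg rules applied to a list of p-values. It is not taken from the paper; the function names, the level `alpha`, and the example p-values are illustrative assumptions.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (fixed, nonrandom threshold)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up rule: find the largest rank k with p_(k) <= k * alpha / m
    and reject the k hypotheses with the smallest p-values."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k = rank  # keep the largest rank that passes
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Hypothetical p-values, for illustration only.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042,
                   0.060, 0.074, 0.205, 0.212, 0.216]
bonferroni_rejections = bonferroni(example_pvalues)
bh_rejections = benjamini_hochberg(example_pvalues)
```

The Benjamini-Hochberg cutoff depends on the observed p-values, so it is a random threshold, whereas the Bonferroni cutoff `alpha / m` is fixed in advance.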
Weighted False Discovery Rate Control in Large-Scale Multiple Testing
The use of weights provides an effective strategy to incorporate prior domain
knowledge in large-scale inference. This paper studies weighted multiple
testing in a decision-theoretic framework. We develop oracle and data-driven
procedures that aim to maximize the expected number of true positives subject
to a constraint on the weighted false discovery rate. The asymptotic validity
and optimality of the proposed methods are established. The results demonstrate
that incorporating informative domain knowledge enhances the interpretability
of results and precision of inference. Simulation studies show that the
proposed method controls the error rate at the nominal level, and the gain in
power over existing methods is substantial in many settings. An application to
genome-wide association study is discussed.
Comment: Revise
A Linear Programming Approach to Sequential Hypothesis Testing
Under some mild Markov assumptions it is shown that the problem of designing
optimal sequential tests for two simple hypotheses can be formulated as a
linear program. The result is derived by investigating the Lagrangian dual of
the sequential testing problem, which is an unconstrained optimal stopping
problem, depending on two unknown Lagrangian multipliers. It is shown that the
derivative of the optimal cost function with respect to these multipliers
coincides with the error probabilities of the corresponding sequential test.
This property is used to formulate an optimization problem that is jointly
linear in the cost function and the Lagrangian multipliers and can be solved for
both with off-the-shelf algorithms. To illustrate the procedure, optimal
sequential tests for Gaussian random sequences with different dependency
structures are derived, including the Gaussian AR(1) process.
Comment: 25 pages, 4 figures, accepted for publication in Sequential Analysi
A comparative study of nonparametric methods for pattern recognition
The applied research discussed in this report determines and compares the correct classification percentage of the nonparametric sign test, Wilcoxon's signed rank test, and K-class classifier with the performance of the Bayes classifier. The performance is determined for data which have Gaussian, Laplacian and Rayleigh probability density functions. The correct classification percentage is shown graphically for differences in modes and/or means of the probability density functions for four, eight and sixteen samples. The K-class classifier performed very well with respect to the other classifiers used. Since the K-class classifier is a nonparametric technique, it usually performed better than the Bayes classifier, which assumes the data to be Gaussian even though they may not be. The K-class classifier has the advantage over the Bayes classifier in that it works well with non-Gaussian data without having to determine the probability density function of the data. It should be noted that the data in this experiment were always unimodal.
Adaptive Threshold Sampling and Estimation
Sampling is a fundamental problem in both computer science and statistics. A
number of issues arise when designing a method based on sampling. These include
statistical considerations such as constructing a good sampling design and
ensuring there are good, tractable estimators for the quantities of interest as
well as computational considerations such as designing fast algorithms for
streaming data and ensuring the sample fits within memory constraints.
Unfortunately, existing sampling methods are only able to address all of these
issues in limited scenarios.
We develop a framework that can be used to address these issues in a broad
range of scenarios. In particular, it addresses the problem of drawing and
using samples under some memory budget constraint. This problem can be
challenging since the memory budget forces samples to be drawn
non-independently and consequently, makes computation of resulting estimators
difficult.
At the core of the framework is the notion of a data adaptive thresholding
scheme where the threshold effectively allows one to treat the non-independent
sample as if it were drawn independently. We provide sufficient conditions for
a thresholding scheme to allow this and provide ways to build and compose such
schemes.
Furthermore, we provide fast algorithms to efficiently sample under these
thresholding schemes.
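One well-known instance of a data-adaptive threshold under a memory budget is bottom-k sampling, sketched below. This is an illustrative example and not the paper's framework or algorithms: each item gets a uniform random priority, only the k + 1 smallest priorities are held in memory, and the (k + 1)-th smallest priority acts as the threshold. The function names and the count estimator are assumptions chosen for the sketch.

```python
import heapq
import random


def bottom_k_sample(stream, k, rng):
    """Return (sample, threshold) using at most k + 1 stored items.

    Items whose uniform priority falls below the threshold are kept;
    the threshold is the (k + 1)-th smallest priority seen.
    """
    heap = []  # max-heap on priority, stored as (-priority, index, item)
    for index, item in enumerate(stream):
        priority = rng.random()
        if len(heap) < k + 1:
            heapq.heappush(heap, (-priority, index, item))
        elif priority < -heap[0][0]:
            # New priority beats the current largest stored priority.
            heapq.heapreplace(heap, (-priority, index, item))
    if len(heap) <= k:
        # Stream fits in the budget: every item is kept with probability 1.
        return [item for _, _, item in heap], 1.0
    threshold = -heap[0][0]
    sample = [item for neg, _, item in heap if -neg < threshold]
    return sample, threshold


def estimate_count(sample, threshold):
    """Horvitz-Thompson style estimate of the stream length, treating each
    sampled item as if it were included independently with probability
    equal to the threshold."""
    return len(sample) / threshold


sample, threshold = bottom_k_sample(range(1000), k=50, rng=random.Random(0))
estimated_length = estimate_count(sample, threshold)
```

The sample is not independent, since the threshold depends on all priorities, yet the estimator is computed exactly as it would be for independent sampling at a fixed rate.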