Double Bootstrap Confidence Intervals in the Two-Stage DEA Approach
Contextual factors usually play an important role in determining firms' productive efficiencies. Nevertheless, identifying them in a regression framework can be complicated. The problem arises because efficiencies estimated by Data Envelopment Analysis are correlated with each other, which renders standard inference methods invalid. Simar and Wilson (2007) suggest the use of bootstrap algorithms that allow for valid statistical inference in this context. This article extends their work by proposing a double bootstrap algorithm for obtaining confidence intervals with improved coverage probabilities. Moreover, acknowledging the computational burden associated with iterated bootstrap procedures, we provide an algorithm based on deterministic stopping rules, which is less computationally demanding. Monte Carlo evidence shows considerable improvement in the coverage probabilities after iterating the bootstrap procedure. The results also suggest that percentile confidence intervals perform better than their basic counterparts.
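The abstract above contrasts percentile and basic bootstrap confidence intervals. As a rough illustration of that distinction only, the sketch below computes both from a single-level bootstrap of a sample mean. It is not the two-stage DEA procedure or the double bootstrap proposed in the article; the function names `quantile` and `bootstrap_intervals`, the choice of statistic, and all default parameters are assumptions made for this example.

```python
import random
import statistics


def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def bootstrap_intervals(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Return (estimate, percentile interval, basic interval).

    Hypothetical helper for illustration: a plain single-level bootstrap,
    not the iterated algorithm described in the abstract.
    """
    rng = random.Random(seed)
    theta_hat = stat(data)
    # Resample with replacement and recompute the statistic each time.
    reps = sorted(stat(rng.choices(data, k=len(data))) for _ in range(n_boot))
    q_lo = quantile(reps, alpha / 2)
    q_hi = quantile(reps, 1 - alpha / 2)
    # Percentile interval: quantiles of the bootstrap replicates themselves.
    percentile = (q_lo, q_hi)
    # Basic interval: the same quantiles reflected around the point estimate.
    basic = (2 * theta_hat - q_hi, 2 * theta_hat - q_lo)
    return theta_hat, percentile, basic
```

Both intervals have the same width by construction; they differ only when the bootstrap distribution is asymmetric around the estimate, which is where a comparison of their coverage becomes interesting.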
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
This paper introduces an open-ended sequential algorithm for computing the
p-value of a test using Monte Carlo simulation. It guarantees that the
resampling risk, the probability of a different decision than the one based on
the theoretical p-value, is uniformly bounded by an arbitrarily small constant.
Previously suggested sequential or non-sequential algorithms, using a bounded
sample size, do not have this property. Although the algorithm is open-ended,
the expected number of steps is finite, except when the p-value is on the
threshold between rejecting and not rejecting. The algorithm is suitable as a
standard for implementing tests that require (re-)sampling. It can also be used
in other situations: to check whether a test is conservative, iteratively to
implement double bootstrap tests, and to determine the sample size required for
a certain power.
Comment: Major Revision, 15 pages, 4 figures
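To make the idea of a bounded resampling risk concrete, here is a minimal sketch of one way to stop a Monte Carlo test sequentially. It is not the algorithm of the paper above: it swaps in a simple Hoeffding confidence sequence with a union bound over steps, which keeps the probability of ever excluding the true p-value below `eps`. The function name `sequential_mc_decision`, the callback `draw_exceeds`, and the practical cap `max_steps` are all assumptions for illustration.

```python
import math


def sequential_mc_decision(draw_exceeds, alpha=0.05, eps=1e-3, max_steps=1_000_000):
    """Decide whether the Monte Carlo p-value lies below or above alpha.

    draw_exceeds() is a hypothetical callback that simulates one test
    statistic under the null and returns True when it is at least as
    extreme as the observed one.
    """
    hits = 0
    for n in range(1, max_steps + 1):
        hits += bool(draw_exceeds())
        p_hat = hits / n
        # Hoeffding half-width with error budget eps / (n * (n + 1)) at step n;
        # these budgets sum to eps over all n, so the bound holds uniformly.
        half_width = math.sqrt(math.log(2 * n * (n + 1) / eps) / (2 * n))
        if p_hat - half_width > alpha:
            return "do not reject", n, p_hat
        if p_hat + half_width < alpha:
            return "reject", n, p_hat
    # The cap is a practical assumption; an open-ended procedure has none.
    return "undecided", max_steps, hits / max_steps
```

As in the abstract, the number of steps grows without bound as the true p-value approaches the threshold `alpha`, which is why this sketch includes a cap and an explicit "undecided" outcome.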
On the Inversion of High Energy Proton
Inversion of the K-fold stochastic autoconvolution integral equation is an
elementary nonlinear problem, yet there are no de facto methods to solve it
with finite statistics. To fix this problem, we introduce a novel inverse
algorithm based on a combination of minimization of relative entropy, the Fast
Fourier Transform and a recursive version of Efron's bootstrap. This gives us
power to obtain new perspectives on non-perturbative high energy QCD, such as
probing the ab initio principles underlying the approximately negative binomial
distributions of observed charged particle final state multiplicities, related
to multiparton interactions, the fluctuating structure and profile of proton
and diffraction. As a proof-of-concept, we apply the algorithm to ALICE
proton-proton charged particle multiplicity measurements done at different
center-of-mass energies and fiducial pseudorapidity intervals at the LHC,
available on HEPData. A strong double peak structure emerges from the
inversion, barely visible without it.
Comment: 29 pages, 10 figures, v2: extended analysis (re-projection ratios, 2D
A subsampled double bootstrap for massive data
The bootstrap is a popular and powerful method for assessing precision of
estimators and inferential methods. However, for massive datasets, which are
increasingly prevalent, the bootstrap becomes prohibitively costly in
computation and its feasibility is questionable even with modern parallel
computing platforms. Recently Kleiner, Talwalkar, Sarkar, and Jordan (2014)
proposed a method called BLB (Bag of Little Bootstraps) for massive data which
is more computationally scalable with little sacrifice of statistical accuracy.
Building on BLB and the idea of fast double bootstrap, we propose a new
resampling method, the subsampled double bootstrap, for both independent data
and time series data. We establish consistency of the subsampled double
bootstrap under mild conditions for both independent and dependent cases.
Methodologically, the subsampled double bootstrap is superior to BLB in terms
of running time, sample coverage, and automatic implementation with fewer
tuning parameters for a given time budget. Its advantage relative to BLB and
the bootstrap is also demonstrated in numerical simulations and a data
illustration.
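The following is a simplified sketch of the kind of scheme the abstract above describes, for independent data only: repeatedly draw a small random subset, generate a single full-size resample from it, and collect the difference between the resample statistic and the subset statistic. The details here are assumptions rather than the authors' specification, including the function name `sdb_interval`, the default subset size of roughly n^0.6, the use of the mean, and the basic-type interval built at the end.

```python
import random
import statistics


def sdb_interval(data, stat=statistics.mean, subset_size=None,
                 n_iter=500, alpha=0.05, seed=0):
    """Confidence interval from a subsampled-double-bootstrap-style loop.

    Hypothetical helper for illustration; assumes independent observations.
    """
    rng = random.Random(seed)
    n = len(data)
    b = subset_size or max(2, int(n ** 0.6))  # assumed default subset size
    roots = []
    for _ in range(n_iter):
        # First level: a small random subset drawn without replacement.
        subset = rng.sample(data, b)
        # Second level: ONE resample of full size n from that subset.
        # (A real implementation would use multinomial weights rather than
        # materializing n values.)
        resample = rng.choices(subset, k=n)
        roots.append(stat(resample) - stat(subset))
    roots.sort()
    lo_root = roots[int((alpha / 2) * (n_iter - 1))]
    hi_root = roots[int((1 - alpha / 2) * (n_iter - 1))]
    theta_hat = stat(data)
    # Basic-type interval: reflect the root quantiles around the estimate.
    return theta_hat - hi_root, theta_hat - lo_root
```

Because each iteration touches only `b` distinct observations and draws a fresh subset every time, the loop can simply run until a time budget is exhausted, with `n_iter` standing in for that budget here.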