Unbiased and Consistent Nested Sampling via Sequential Monte Carlo
We introduce a new class of sequential Monte Carlo methods called Nested
Sampling via Sequential Monte Carlo (NS-SMC), which reframes the Nested
Sampling method of Skilling (2006) in terms of sequential Monte Carlo
techniques. This new framework allows convergence results to be obtained in the
setting where Markov chain Monte Carlo (MCMC) is used to produce new samples. An
additional benefit is that marginal likelihood estimates are unbiased. In
contrast to NS, the analysis of NS-SMC does not require the (unrealistic)
assumption that the simulated samples be independent. As the original NS
algorithm is a special case of NS-SMC, this provides insights as to why NS
seems to produce accurate estimates despite a typical violation of its
assumptions. For applications of NS-SMC, we give advice on tuning MCMC kernels
in an automated manner via a preliminary pilot run, and present a new method
for appropriately choosing the number of MCMC repeats at each iteration.
Finally, a numerical study is conducted where the performance of NS-SMC and
temperature-annealed SMC is compared on several challenging and realistic
problems. MATLAB code for our experiments is made available at
https://github.com/LeahPrice/SMC-NS.
Comment: 45 pages, some minor typographical errors fixed since last version
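For orientation, the classic nested sampling recursion that this abstract builds on can be sketched in a few lines. This is not the NS-SMC algorithm and not the authors' MATLAB code. It is a minimal sketch on a hypothetical one-dimensional toy problem (a Uniform(0, 1) prior and a likelihood that decreases in theta), chosen so that the constrained prior can be sampled exactly without MCMC. `SIGMA`, `likelihood`, `nested_sampling_evidence`, and `exact_evidence` are illustrative names.

```python
import math
import random

SIGMA = 0.1  # hypothetical likelihood width for the toy problem


def likelihood(theta):
    """Toy likelihood, strictly decreasing in theta on [0, 1]."""
    return math.exp(-theta ** 2 / (2 * SIGMA ** 2))


def nested_sampling_evidence(n_live=500, n_iter=5000, seed=0):
    """Estimate Z = integral of likelihood(theta) under a Uniform(0, 1) prior."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]  # initial draws from the prior
    z, x_prev = 0.0, 1.0
    for t in range(1, n_iter + 1):
        # Lowest-likelihood live point; here that is simply the largest theta.
        worst = max(range(n_live), key=live.__getitem__)
        x_t = math.exp(-t / n_live)  # deterministic prior-volume approximation
        z += likelihood(live[worst]) * (x_prev - x_t)
        # Exact draw from the prior constrained to higher likelihood:
        # uniform on [0, theta_worst). Real problems need MCMC at this step.
        live[worst] = rng.random() * live[worst]
        x_prev = x_t
    # Contribution of the live points that remain at termination.
    z += x_prev * sum(likelihood(th) for th in live) / n_live
    return z


def exact_evidence():
    """Closed-form value of the toy integral, for comparison."""
    return SIGMA * math.sqrt(math.pi / 2) * math.erf(1 / (SIGMA * math.sqrt(2)))
```

Comparing `nested_sampling_evidence()` with `exact_evidence()` gives a rough sense of the estimator's accuracy on this toy case. The independence of the replacement draws holds here only because the constraint can be sampled exactly, which is the assumption the abstract describes as unrealistic in practice.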
On Improvement in Estimating Population Parameter(s) Using Auxiliary Information
The purpose of writing this book is to suggest some improved estimators using
auxiliary information in sampling schemes like simple random sampling and
systematic sampling.
This volume is a collection of five papers. The following problems have been
discussed in the book:
In chapter one, an estimator in systematic sampling using auxiliary
information is studied in the presence of non-response. In the second chapter,
some improved estimators are suggested using auxiliary information. In the third
chapter, some improved ratio-type estimators are suggested and their properties
are studied under the second order of approximation.
In chapters four and five, some estimators are proposed for estimating unknown
population parameter(s) and their properties are studied.
This book will be helpful for researchers and students who are working in
the field of finite population estimation.
Comment: 63 pages, 8 tables. Educational Publishing & Journal of Matter
Regularity (Beijing
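As background for the ratio-type estimators mentioned above, the textbook ratio estimator of a population mean under simple random sampling can be sketched as follows. This is the classical estimator, not one of the improved estimators proposed in the book. The function name and arguments are illustrative, and the population mean of the auxiliary variable is assumed to be known.

```python
from statistics import fmean


def ratio_estimate(y_sample, x_sample, x_population_mean):
    """Classical ratio estimator: y_bar * (X_bar / x_bar).

    y_sample: study-variable values observed in the sample.
    x_sample: auxiliary-variable values for the same sampled units.
    x_population_mean: known population mean of the auxiliary variable.
    """
    if len(y_sample) != len(x_sample):
        raise ValueError("y_sample and x_sample must have the same length")
    x_bar = fmean(x_sample)
    if x_bar == 0:
        raise ValueError("sample mean of the auxiliary variable is zero")
    return fmean(y_sample) * x_population_mean / x_bar
```

The estimator scales the sample mean of the study variable by how far the sample mean of the auxiliary variable falls from its known population value, which helps when the two variables are strongly positively correlated.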
Shrinkage Estimators in Online Experiments
We develop and analyze empirical Bayes Stein-type estimators for use in the
estimation of causal effects in large-scale online experiments. While online
experiments are generally thought to be distinguished by their large sample
size, we focus on the multiplicity of treatment groups. The typical analysis
practice is to use simple differences-in-means (perhaps with covariate
adjustment) as if all treatment arms were independent. In this work we develop
consistent, small-bias shrinkage estimators for this setting. In addition to
achieving lower mean squared error, these estimators retain important
frequentist properties such as coverage under most reasonable scenarios. Modern
sequential methods of experimentation and optimization such as multi-armed
bandit optimization (where treatment allocations adapt over time to prior
responses) benefit from the use of our shrinkage estimators. Exploration under
empirical Bayes focuses more efficiently on near-optimal arms, improving the
resulting decisions made under uncertainty. We demonstrate these properties by
examining seventeen large-scale experiments conducted on Facebook from April to
June 2017.
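The general idea of shrinking per-arm estimates toward a common mean can be sketched with a generic method-of-moments empirical Bayes rule. This is an illustrative stand-in, not the estimator developed in the paper. The function name is hypothetical, and the per-arm sampling variances are assumed to be known.

```python
from statistics import fmean


def eb_shrink(means, variances):
    """Shrink per-arm means toward the grand mean.

    means: observed effect estimate for each treatment arm (at least two arms).
    variances: assumed-known sampling variance of each estimate.
    """
    k = len(means)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two arms and one variance per arm")
    grand = fmean(means)
    # Spread of the observed means across arms.
    total_var = sum((m - grand) ** 2 for m in means) / (k - 1)
    # Estimated between-arm (prior) variance: observed spread minus the
    # average sampling noise, floored at zero.
    tau2 = max(0.0, total_var - fmean(variances))
    shrunk = []
    for m, v in zip(means, variances):
        # Weight on the arm's own estimate; noisier arms are shrunk more.
        w = tau2 / (tau2 + v) if tau2 + v > 0 else 0.0
        shrunk.append(grand + w * (m - grand))
    return shrunk
```

For example, `eb_shrink([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])` pulls each arm halfway toward the grand mean of 2.0. When the sampling noise exceeds the observed spread, every arm collapses to the grand mean.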
2.5K-Graphs: from Sampling to Generation
Understanding network structure and having access to realistic graphs plays a
central role in computer and social networks research. In this paper, we
propose a complete and practical methodology for generating graphs that
resemble a real graph of interest. The metrics of the original topology that we
aim to match are the joint degree distribution (JDD) and the
degree-dependent average clustering coefficient. We start by
developing efficient estimators for these two metrics based on a node sample
collected via either independence sampling or random walks. Then, we process
the output of the estimators to ensure that the target properties are
realizable. Finally, we propose an efficient algorithm for generating
topologies that have the exact target JDD and a degree-dependent average
clustering coefficient close to the target. Extensive simulations using
real-life graphs show that the graphs generated by our methodology are similar
to the original graph with respect to not only the two target metrics but also
a wide range of other topological metrics; furthermore, our generator is orders
of magnitude faster than state-of-the-art techniques.
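The two target metrics named in this abstract can be computed exactly on a fully observed graph as sketched below. This is not the paper's sample-based estimators or its generator. It only illustrates what the JDD and the degree-dependent average clustering coefficient measure, for an undirected simple graph. Function names are illustrative, and assigning a clustering coefficient of 0 to nodes of degree below 2 is an assumed convention.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list; self-loops are ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def joint_degree_counts(adj):
    """Number of edges joining a degree-k node to a degree-l node, keyed (k, l), k <= l."""
    counts = Counter()
    for u, nbrs in adj.items():
        for v in nbrs:
            k, l = len(nbrs), len(adj[v])
            counts[(min(k, l), max(k, l))] += 1
    # Every undirected edge was visited once from each endpoint.
    return {key: c // 2 for key, c in counts.items()}


def degree_dependent_clustering(adj):
    """Average local clustering coefficient over all nodes of each degree k."""
    by_degree = defaultdict(list)
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            by_degree[k].append(0.0)  # assumed convention for degree < 2
            continue
        # Count neighbour pairs that are themselves connected (triangles at u).
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        by_degree[k].append(2.0 * links / (k * (k - 1)))
    return {k: sum(cs) / len(cs) for k, cs in by_degree.items()}
```

On a triangle with one pendant node, `edges = [(1, 2), (2, 3), (1, 3), (3, 4)]`, the JDD counts one edge between two degree-2 nodes, two edges between degree-2 and degree-3 nodes, and one edge between a degree-1 and a degree-3 node.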
Combining multiple observational data sources to estimate causal effects
The era of big data has witnessed an increasing availability of multiple data
sources for statistical analyses. We consider estimation of causal effects
combining big main data with unmeasured confounders and smaller validation data
with supplementary information on these confounders. Under the unconfoundedness
assumption with completely observed confounders, the smaller validation data
allow for constructing consistent estimators for causal effects, but the big
main data can only give error-prone estimators in general. However, by
leveraging the information in the big main data in a principled way, we can
improve the estimation efficiencies yet preserve the consistencies of the
initial estimators based solely on the validation data. Our framework applies
to asymptotically normal estimators, including the commonly-used regression
imputation, weighting, and matching estimators, and does not require a correct
specification of the model relating the unmeasured confounders to the observed
variables. We also propose appropriate bootstrap procedures, which make our
method straightforward to implement using software routines for existing
estimators.