Decomposition tables for experiments I. A chain of randomizations
One aspect of evaluating the design for an experiment is the discovery of the
relationships between subspaces of the data space. Initially we establish the
notation and methods for evaluating an experiment with a single randomization.
Starting with two structures, or orthogonal decompositions of the data space,
we describe how to combine them to form the overall decomposition for a
single-randomization experiment that is ``structure balanced.'' The
relationships between the two structures are characterized using efficiency
factors. The decomposition is encapsulated in a decomposition table. Then, for
experiments that involve multiple randomizations forming a chain, we take
several structures that pairwise are structure balanced and combine them to
establish the form of the orthogonal decomposition for the experiment. In
particular, it is proven that the properties of the design for such an
experiment are derived in a straightforward manner from those of the individual
designs. We show how to formulate an extended decomposition table giving the
sources of variation, their relationships and their degrees of freedom, so that
competing designs can be evaluated. Comment: Published in the Annals of
Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/09-AOS717
by the Institute of Mathematical Statistics (http://www.imstat.org
Randomized Algorithms for Tracking Distributed Count, Frequencies, and Ranks
We show that randomization can lead to significant improvements for a few
fundamental problems in distributed tracking. Our basis is the {\em
count-tracking} problem, where there are k players, each holding a counter
that gets incremented over time, and the goal is to track an
\eps-approximation of their sum continuously at all times,
using minimum communication. While the deterministic communication complexity
of the problem is \Theta(k/\eps \cdot \log N), where N is the final value
of the sum when the tracking finishes, we show that with randomization, the
communication cost can be reduced to \Theta(\sqrt{k}/\eps \cdot \log N). Our
algorithm is simple and uses only O(1) space at each player, while the lower
bound holds even assuming each player has infinite computing power. Then, we
extend our techniques to two related distributed tracking problems: {\em
frequency-tracking} and {\em rank-tracking}, and obtain similar improvements
over previous deterministic algorithms. Both problems are of central importance
in large data monitoring and analysis, and have been extensively studied in the
literature. Comment: 19 pages, 1 figure