2,458 research outputs found
Anytime Hierarchical Clustering
We propose a new anytime hierarchical clustering method that iteratively
transforms an arbitrary initial hierarchy on the configuration of measurements
along a sequence of trees we prove for a fixed data set must terminate in a
chain of nested partitions that satisfies a natural homogeneity requirement.
Each recursive step re-edits the tree so as to improve a local measure of
cluster homogeneity that is compatible with a number of commonly used (e.g.,
single, average, complete) linkage functions. As an alternative to the standard
batch algorithms, we present numerical evidence to suggest that appropriate
adaptations of this method can yield decentralized, scalable algorithms
suitable for distributed/parallel computation of clustering hierarchies and
online tracking of clustering trees applicable to large, dynamically changing
databases and anomaly detection.Comment: 13 pages, 6 figures, 5 tables, in preparation for submission to a
conferenc
Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning
There is growing interest in estimating and analyzing heterogeneous treatment
effects in experimental and observational studies. We describe a number of
meta-algorithms that can take advantage of any supervised learning or
regression method in machine learning and statistics to estimate the
Conditional Average Treatment Effect (CATE) function. Meta-algorithms build on
base algorithms---such as Random Forests (RF), Bayesian Additive Regression
Trees (BART) or neural networks---to estimate the CATE, a function that the
base algorithms are not designed to estimate directly. We introduce a new
meta-algorithm, the X-learner, that is provably efficient when the number of
units in one treatment group is much larger than in the other, and can exploit
structural properties of the CATE function. For example, if the CATE function
is linear and the response functions in treatment and control are Lipschitz
continuous, the X-learner can still achieve the parametric rate under
regularity conditions. We then introduce versions of the X-learner that use RF
and BART as base learners. In extensive simulation studies, the X-learner
performs favorably, although none of the meta-learners is uniformly the best.
In two persuasion field experiments from political science, we demonstrate how
our new X-learner can be used to target treatment regimes and to shed light on
underlying mechanisms. A software package is provided that implements our
methods
Mean Estimation from One-Bit Measurements
We consider the problem of estimating the mean of a symmetric log-concave
distribution under the constraint that only a single bit per sample from this
distribution is available to the estimator. We study the mean squared error as
a function of the sample size (and hence the number of bits). We consider three
settings: first, a centralized setting, where an encoder may release bits
given a sample of size , and for which there is no asymptotic penalty for
quantization; second, an adaptive setting in which each bit is a function of
the current observation and previously recorded bits, where we show that the
optimal relative efficiency compared to the sample mean is precisely the
efficiency of the median; lastly, we show that in a distributed setting where
each bit is only a function of a local sample, no estimator can achieve optimal
efficiency uniformly over the parameter space. We additionally complement our
results in the adaptive setting by showing that \emph{one} round of adaptivity
is sufficient to achieve optimal mean-square error
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work
Universal lossless source coding with the Burrows Wheeler transform
The Burrows Wheeler transform (1994) is a reversible sequence transformation used in a variety of practical lossless source-coding algorithms. In each, the BWT is followed by a lossless source code that attempts to exploit the natural ordering of the BWT coefficients. BWT-based compression schemes are widely touted as low-complexity algorithms giving lossless coding rates better than those of the Ziv-Lempel codes (commonly known as LZ'77 and LZ'78) and almost as good as those achieved by prediction by partial matching (PPM) algorithms. To date, the coding performance claims have been made primarily on the basis of experimental results. This work gives a theoretical evaluation of BWT-based coding. The main results of this theoretical evaluation include: (1) statistical characterizations of the BWT output on both finite strings and sequences of length n â â, (2) a variety of very simple new techniques for BWT-based lossless source coding, and (3) proofs of the universality and bounds on the rates of convergence of both new and existing BWT-based codes for finite-memory and stationary ergodic sources. The end result is a theoretical justification and validation of the experimentally derived conclusions: BWT-based lossless source codes achieve universal lossless coding performance that converges to the optimal coding performance more quickly than the rate of convergence observed in Ziv-Lempel style codes and, for some BWT-based codes, within a constant factor of the optimal rate of convergence for finite-memory source
- âŠ