WARP: Wavelets with adaptive recursive partitioning for multi-dimensional data
Effective identification of asymmetric and local features in images and other
data observed on multi-dimensional grids plays a critical role in a wide range
of applications including biomedical and natural image processing. Moreover,
the ever increasing amount of image data, in terms of both the resolution per
image and the number of images processed per application, requires algorithms
and methods for such applications to be computationally efficient. We develop a
new probabilistic framework for multi-dimensional data to overcome these
challenges by incorporating data adaptivity into discrete wavelet
transforms, thereby allowing them to adapt to the geometric structure of the
data while maintaining the linear computational scalability. By exploiting a
connection between the local directionality of wavelet transforms and recursive
dyadic partitioning on the grid points of the observation, we obtain the
desired adaptivity by adding to the traditional Bayesian wavelet
regression framework an additional layer of Bayesian modeling on the space of
recursive partitions over the grid points. We derive the corresponding
inference recipe in the form of a recursive representation of the exact
posterior, and develop a class of efficient recursive message passing
algorithms for achieving exact Bayesian inference with a computational
complexity linear in the resolution and sample size of the images. While our
framework is applicable to a range of problems including multi-dimensional
signal processing, compression, and structural learning, we illustrate its use
and evaluate its performance in the context of 2D and 3D image reconstruction
using real images from the ImageNet database. We also apply the framework to
analyze a data set from retinal optical coherence tomography.
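The recursive dyadic structure that this framework builds on can be sketched with a plain, non-adaptive Haar wavelet transform. This is only an illustration of the linear-time recursive decomposition; WARP's data-adaptive partitioning and Bayesian layer are not reproduced, and the function name is ours.

```python
import numpy as np

def haar_transform(x):
    """Orthonormal 1D Haar wavelet transform via recursive dyadic halving.

    Input length must be a power of two. Each level pairs adjacent points,
    keeping scaled averages (coarse signal) and scaled differences (details),
    so the total work is linear in the number of samples.
    """
    x = np.asarray(x, dtype=float)
    coeffs = []
    while len(x) > 1:
        pairs = x.reshape(-1, 2)
        avg = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)   # scaling coefficients
        diff = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)  # detail coefficients
        coeffs.append(diff)
        x = avg                                          # recurse on the coarse level
    coeffs.append(x)                                     # final scaling coefficient
    return coeffs
```

Because each level is an orthonormal rotation of adjacent pairs, the transform preserves the signal's energy, which is what makes level-by-level Bayesian shrinkage of the detail coefficients well behaved.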
Fast Multidimensional Entropy Estimation by k-d Partitioning
Recursive Partitioning for Heterogeneous Causal Effects
In this paper we study the problems of estimating heterogeneity in causal
effects in experimental or observational studies and conducting inference about
the magnitude of the differences in treatment effects across subsets of the
population. In applications, our method provides a data-driven approach to
determine which subpopulations have large or small treatment effects and to
test hypotheses about the differences in these effects. For experiments, our
method allows researchers to identify heterogeneity in treatment effects that
was not specified in a pre-analysis plan, without concern about invalidating
inference due to multiple testing. In most of the literature on supervised
machine learning (e.g. regression trees, random forests, LASSO, etc.), the goal
is to build a model of the relationship between a unit's attributes and an
observed outcome. A prominent role in these methods is played by
cross-validation which compares predictions to actual outcomes in test samples,
in order to select the level of complexity of the model that provides the best
predictive power. Our method is closely related, but it differs in that it is
tailored for predicting causal effects of a treatment rather than a unit's
outcome. The challenge is that the "ground truth" for a causal effect is not
observed for any individual unit: we observe the unit with the treatment, or
without the treatment, but not both at the same time. Thus, it is not obvious
how to use cross-validation to determine whether a causal effect has been
accurately predicted. We propose several novel cross-validation criteria for
this problem and demonstrate through simulations the conditions under which
they perform better than standard methods for the problem of causal effects. We
then apply the method to a large-scale field experiment re-ranking results on a
search engine.
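The split-sample ("honest") idea behind this approach can be sketched on toy randomized data. Everything below is an assumption for illustration (the synthetic effect structure, the single-covariate split, the difference-in-means estimator); the paper's actual criteria and tree construction are richer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy randomized experiment: true treatment effect is +2 when x >= 0, else 0.
n = 4000
x = rng.normal(size=n)                   # a single covariate
w = rng.integers(0, 2, size=n)           # randomized treatment assignment
y = 2.0 * w * (x >= 0) + rng.normal(size=n)

# Honest estimation: one half of the data chooses the split,
# the other half estimates the effects within the resulting leaves.
train, est = np.arange(n) < n // 2, np.arange(n) >= n // 2

def tau(mask):
    """Difference-in-means treatment-effect estimate on a subsample."""
    return y[mask & (w == 1)].mean() - y[mask & (w == 0)].mean()

# Pick the split that maximizes effect heterogeneity on the training half.
candidates = np.quantile(x[train], np.linspace(0.1, 0.9, 17))
best = max(candidates,
           key=lambda c: (tau(train & (x < c)) - tau(train & (x >= c))) ** 2)

# Leaf effects come from the held-out half only, so the usual
# confidence intervals within each leaf remain valid.
tau_left, tau_right = tau(est & (x < best)), tau(est & (x >= best))
```

The key point the abstract makes is visible here: because "ground truth" effects are never observed per unit, model selection has to score candidate partitions through estimated effects rather than by comparing predictions to outcomes.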
Power Allocation for Distributed BLUE Estimation with Full and Limited Feedback of CSI
This paper investigates the problem of adaptive power allocation for
distributed best linear unbiased estimation (BLUE) of a random parameter at the
fusion center (FC) of a wireless sensor network (WSN). An optimal
power-allocation scheme is proposed that minimizes the ℓ∞-norm of the vector
of local transmit powers, given a maximum variance for the BLUE estimator. This
scheme results in the increased lifetime of the WSN compared to similar
approaches that are based on the minimization of the sum of the local transmit
powers. The limitation of the proposed optimal power-allocation scheme is that
it requires the feedback of the instantaneous channel state information (CSI)
from the FC to local sensors, which is not practical in most applications of
large-scale WSNs. In this paper, a limited-feedback strategy is proposed that
eliminates this requirement by designing an optimal codebook for the FC using
the generalized Lloyd algorithm with modified distortion metrics. Each sensor
amplifies its analog noisy observation using a quantized version of its optimal
amplification gain, which is received by the FC and used to estimate the
unknown parameter. Comment: 6 pages, 3 figures, to appear at the IEEE Military Communications
Conference (MILCOM) 201
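A minimal sketch of BLUE at the fusion center, under assumed toy values (scalar parameter, known channel gains, unit amplification); the paper's power allocation and limited-feedback codebook design are not shown.

```python
import numpy as np

rng = np.random.default_rng(1)

theta = 1.5                          # unknown parameter (toy value)
k = 20                               # number of sensors
sigma_obs = 0.5 * np.ones(k)         # local observation noise std per sensor
g = rng.uniform(0.5, 1.5, k)         # channel gains, assumed known at the FC
a = np.ones(k)                       # amplification gains (power allocation would tune these)
sigma_ch = 0.1                       # channel noise std

x = theta + sigma_obs * rng.normal(size=k)     # amplify-and-forward: local observations
r = g * a * x + sigma_ch * rng.normal(size=k)  # signals received at the FC

# BLUE: weight each received signal by the inverse of its end-to-end variance.
h = g * a                                      # effective gain of sensor i
var = (h * sigma_obs) ** 2 + sigma_ch ** 2     # variance of r[i] around h[i] * theta
theta_hat = np.sum(h * r / var) / np.sum(h ** 2 / var)
```

The BLUE variance is 1 / sum(h**2 / var), which is what the power-allocation problem in the paper constrains while minimizing the transmit-power vector's norm.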
Proceedings of the 2011 New York Workshop on Computer, Earth and Space Science
The purpose of the New York Workshop on Computer, Earth and Space Sciences is
to bring together the New York area's finest astronomers, statisticians,
computer scientists, and space and Earth scientists to explore potential synergies
between their respective fields. The 2011 edition (CESS2011) was a great
success, and we would like to thank all of the presenters and participants for
attending. This year was also special as it included authors from the upcoming
book titled "Advances in Machine Learning and Data Mining for Astronomy". Over
two days, the latest advanced techniques used to analyze the vast amounts of
information now available for the understanding of our universe and our planet
were presented. These proceedings attempt to provide a small window into the
current state of research in this vast interdisciplinary field, and we'd
like to thank the speakers who took the time to contribute to this volume. Comment: Author lists modified. 82 pages. Workshop Proceedings from CESS 2011
in New York City, Goddard Institute for Space Studie
Numerical Fitting-based Likelihood Calculation to Speed up the Particle Filter
The likelihood calculation of a vast number of particles is the computational
bottleneck for the particle filter in applications where the observation
information is rich. To compute the likelihood of particles quickly, a
numerical fitting approach is proposed to construct the Likelihood Probability
Density Function (Li-PDF) by using a comparably small number of so-called
fulcrums. The likelihood of particles is thereby analytically inferred,
explicitly or implicitly, from the Li-PDF rather than computed directly from
the observation, which significantly reduces the computation and enables
real-time filtering. The proposed approach preserves the estimation
quality when an appropriate fitting function and properly distributed fulcrums
are used. The construction of the fitting function and the choice of fulcrums
are each addressed in detail. In particular, to handle multivariate
fitting, a nonparametric kernel density estimator is presented, which is
flexible and convenient for implicit Li-PDF implementation. Simulation
comparisons with a variety of existing approaches on a benchmark one-dimensional
model and on multi-dimensional robot localization and visual tracking demonstrate
the validity of our approach. Comment: 42 pages, 17 figures, 4 tables and 1 appendix. This paper is a
draft/preprint of one paper submitted to the IEEE Transaction
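The fulcrum idea can be sketched in one dimension with a Gaussian stand-in for an expensive likelihood. The Li-PDF here is a simple linear interpolation over a handful of fulcrum points, not the paper's fitting functions or kernel estimator; all names and values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def exact_loglik(x, z):
    """Stand-in for an expensive exact log-likelihood log p(z | x)."""
    return -0.5 * (z - x) ** 2

z = 0.3                                     # current observation
particles = rng.normal(0.0, 1.0, 10_000)    # prior particle cloud, N(0, 1)

# Li-PDF idea: evaluate the exact likelihood only at a few fulcrum points ...
fulcrums = np.linspace(particles.min(), particles.max(), 15)
ll_at_fulcrums = exact_loglik(fulcrums, z)

# ... then interpolate to obtain approximate log-likelihoods for every particle,
# replacing 10,000 exact evaluations with 15 plus cheap interpolation.
ll = np.interp(particles, fulcrums, ll_at_fulcrums)

w = np.exp(ll - ll.max())
w /= w.sum()                                # normalized importance weights
estimate = np.sum(w * particles)            # weighted posterior-mean estimate
```

With an N(0, 1) prior and an N(x, 1) observation model, the exact posterior mean is z / 2 = 0.15, so the interpolated-likelihood weights should land the estimate near that value.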