Bayes and maximum likelihood for L1-Wasserstein deconvolution of Laplace mixtures
We consider the problem of recovering a distribution function on the real
line from observations additively contaminated with errors following the
standard Laplace distribution. Assuming that the latent distribution is
completely unknown leads to a nonparametric deconvolution problem. We begin by
studying the rates of convergence relative to the L1-norm and the Hellinger
metric for the direct problem of estimating the sampling density, which is a
mixture of Laplace densities with a possibly unbounded set of locations: the
rate of convergence for the Bayes' density estimator corresponding to a
Dirichlet process prior over the space of all mixing distributions on the real
line matches, up to a logarithmic factor, the rate for the maximum likelihood
estimator. Then, appealing to an inversion inequality translating the L1-norm
and the Hellinger distance between general kernel mixtures, with a kernel
density having polynomially decaying Fourier transform, into any Lp-Wasserstein
distance, p >= 1, between the corresponding mixing distributions, provided
their Laplace transforms are finite in some neighborhood of zero, we derive the
rates of convergence in the L1-Wasserstein metric for the Bayes' and maximum
likelihood estimators of the mixing distribution. Merging in the L1-Wasserstein
distance between Bayes and maximum likelihood follows as a by-product, along
with an assessment of the stochastic order of the discrepancy between the two
estimation procedures.
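A minimal sketch of the metric this abstract works in: on the real line, the 1-Wasserstein distance between two empirical distributions with the same number of atoms is computed by matching sorted samples (the optimal coupling in 1-D). The toy below, with assumed distributions not taken from the paper, shows that an estimator which ignores the Laplace noise sits measurably farther from the latent mixing distribution in W1 than a fresh latent sample does.

```python
import numpy as np

def wasserstein_p(x, y, p=1):
    """p-Wasserstein distance between two 1-D empirical distributions
    with equally many atoms: pair sorted samples (the optimal coupling
    on the real line) and average |x_i - y_i|^p."""
    x, y = np.sort(x), np.sort(y)
    return (np.mean(np.abs(x - y) ** p)) ** (1.0 / p)

# Deconvolution toy (illustrative choices, not the paper's setup):
# latent draws from a mixing distribution, observed with Laplace noise,
# so the sampling density is a Laplace mixture.
rng = np.random.default_rng(0)
latent = rng.normal(0.0, 1.0, 5000)              # mixing-distribution sample
observed = latent + rng.laplace(0.0, 1.0, 5000)  # Laplace-contaminated sample

# Ignoring the noise is costly in W1: the contaminated sample is much
# farther from the latent one than an independent latent sample is.
w_obs = wasserstein_p(observed, latent, p=1)
w_ref = wasserstein_p(rng.normal(0.0, 1.0, 5000), latent, p=1)
```

The sorting trick is specific to one dimension, where the optimal transport plan is monotone; in higher dimensions computing Wasserstein distances requires solving a transport problem.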
Fast performance estimation of block codes
Importance sampling is used in this paper to address the classical yet important problem of performance estimation of block codes. Simulation distributions that comprise discrete- and continuous-mixture probability densities are motivated and used for this application. These mixtures are employed in concert with the so-called g-method, a conditional importance sampling technique that more effectively exploits knowledge of the underlying input distributions. For performance estimation, the emphasis is on bit-by-bit maximum a posteriori probability decoding, but message-passing algorithms for certain codes have also been investigated. Considered here are single parity check codes, multidimensional product codes, and, briefly, low-density parity-check codes. Several error rate results are presented for these various codes, together with performances of the simulation techniques.
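Why importance sampling is the natural tool here: code error rates are rare-event probabilities, and plain Monte Carlo needs on the order of 1/p samples just to see one error. The sketch below is not the paper's g-method, only a minimal mean-shift example estimating a Gaussian tail probability with a shifted proposal and likelihood-ratio weights.

```python
import math
import numpy as np

# Rare-event probability p = P(Z > 4), Z ~ N(0, 1); exact value via erfc.
exact = 0.5 * math.erfc(4 / math.sqrt(2))   # about 3.17e-5

rng = np.random.default_rng(1)
n = 100_000

# Plain Monte Carlo expects only ~3 hits at this sample size, so its
# relative error is enormous.
naive = np.mean(rng.normal(0.0, 1.0, n) > 4)

# Importance sampling: draw from the shifted proposal N(4, 1) and
# reweight each sample by the likelihood ratio
#   phi(x) / phi(x - 4) = exp(8 - 4x).
x = rng.normal(4.0, 1.0, n)
est = np.mean((x > 4) * np.exp(8.0 - 4.0 * x))
```

Centering the proposal on the rare region makes roughly half the draws hit the event, and the weights keep the estimator unbiased; the same biasing-plus-reweighting idea underlies the mixture simulation densities the abstract describes.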
Network Tomography: Identifiability and Fourier Domain Estimation
The statistical problem for network tomography is to infer the distribution
of X, a vector with mutually independent components, from a measurement model
Y = AX, where A is a given binary matrix representing the routing topology of
a network under consideration. The challenge is that the dimension of X is
much larger than that of Y, and thus the problem is often called ill-posed.
This paper studies some statistical aspects of network tomography. We first
address the identifiability issue and prove that the distribution of X is
identifiable up to a shift parameter under mild conditions. We then use a
mixture model of characteristic functions to derive a fast algorithm for
estimating the distribution of X based on the generalized method of moments.
Through extensive model simulation and real Internet trace driven simulation,
the proposed approach is shown to compare favorably with previous methods
that use simple discretization for inferring link delays in a heterogeneous
network.
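A toy second-moment version of the Y = AX idea, not the paper's characteristic-function algorithm: on a two-leaf tree the routing matrix is wide (three links, two paths), yet path covariances still identify every link's delay variance, because independence puts the shared link alone in the cross-covariance.

```python
import numpy as np

# Routing matrix A for a two-leaf tree: path 1 = link0 + link1,
# path 2 = link0 + link2.  X holds independent link delays,
# Y = X A^T holds the observable end-to-end path delays.
A = np.array([[1, 1, 0],
              [1, 0, 1]])

rng = np.random.default_rng(2)
true_var = np.array([1.0, 2.0, 3.0])                  # per-link delay variances
n = 200_000
X = rng.exponential(np.sqrt(true_var), size=(n, 3))   # exp(scale) has var scale^2
Y = X @ A.T

# Independence of the X components gives
#   Cov(Y1, Y2) = Var(X0),  Var(Yi) - Cov(Y1, Y2) = Var(X_leaf),
# so second moments recover all three link variances although A is wide.
C = np.cov(Y, rowvar=False)
v0 = C[0, 1]
v1 = C[0, 0] - v0
v2 = C[1, 1] - v0
```

This is the moment-level intuition behind the abstract's ill-posedness remark: means alone give two equations in three unknowns, but higher-order structure (here covariances, in the paper characteristic functions) restores identifiability up to the stated shift ambiguity.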
Approximate Profile Maximum Likelihood
We propose an efficient algorithm for approximate computation of the profile
maximum likelihood (PML), a variant of maximum likelihood maximizing the
probability of observing a sufficient statistic rather than the empirical
sample. The PML has appealing theoretical properties, but is difficult to
compute exactly. Inspired by observations gleaned from exactly solvable cases,
we look for an approximate PML solution, which, intuitively, clumps comparably
frequent symbols into one symbol. This amounts to lower-bounding a certain
matrix permanent by summing over a subgroup of the symmetric group rather than
the whole group during the computation. We extensively experiment with the
approximate solution, and find the empirical performance of our approach is
competitive and sometimes significantly better than state-of-the-art
performance for various estimation problems.
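The "sufficient statistic" the abstract refers to is the profile (also called the fingerprint or pattern): the multiset of symbol multiplicities, forgetting which symbol had which count. A short sketch, with example samples chosen here for illustration:

```python
from collections import Counter

def profile(sample):
    """Multiset of nonzero symbol multiplicities, as a sorted tuple.
    PML maximizes, over candidate distributions, the probability of
    observing this profile rather than the labeled sample itself, so
    any two samples with the same profile share the same PML estimate."""
    return tuple(sorted(Counter(sample).values()))

# Two samples over different alphabets but with the same shape:
# one symbol seen 3x, one seen 2x, and two seen once each.
s1 = list("aaabbcd")
s2 = [7, 7, 7, 9, 9, 1, 4]
```

Computing the PML itself is the hard part: the profile probability is a sum over symbol relabelings (a matrix permanent), which is exactly the quantity the paper lower-bounds by summing over a subgroup of the symmetric group.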
A Method of Moments for Mixture Models and Hidden Markov Models
Mixture models are a fundamental tool in applied statistics and machine
learning for treating data taken from multiple subpopulations. The current
practice for estimating the parameters of such models relies on local search
heuristics (e.g., the EM algorithm) which are prone to failure, and existing
consistent methods are unfavorable due to their high computational and sample
complexity, which typically scales exponentially with the number of mixture
components. This work develops an efficient method of moments approach to
parameter estimation for a broad class of high-dimensional mixture models with
many components, including multi-view mixtures of Gaussians (such as mixtures
of axis-aligned Gaussians) and hidden Markov models. The new method leads to
rigorous unsupervised learning results for mixture models that were not
achieved by previous works; and, because of its simplicity, it offers a viable
alternative to EM for practical deployment.
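A one-dimensional toy version of the method-of-moments idea, much simpler than the paper's multi-view spectral approach: for a symmetric two-component Gaussian mixture with unit component variance, the second sample moment alone pins down the component means, with no EM-style local search.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = 2.0
n = 100_000

# Symmetric two-component mixture: 0.5*N(-mu, 1) + 0.5*N(+mu, 1)
# (toy model chosen for illustration, not the paper's setting).
signs = rng.choice([-1.0, 1.0], size=n)
x = signs * mu + rng.normal(0.0, 1.0, n)

# Moment matching: E[X] = 0 and E[X^2] = 1 + mu^2, so solving the
# second-moment equation recovers mu in closed form.
m2 = np.mean(x ** 2)
mu_hat = np.sqrt(max(m2 - 1.0, 0.0))
```

The contrast with EM is the point: this estimator is a closed-form function of sample moments, so it cannot get trapped in a bad local optimum; the paper's contribution is making the same closed-form strategy work for high-dimensional mixtures and HMMs with many components.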