2,311 research outputs found
Learning Arbitrary Statistical Mixtures of Discrete Distributions
We study the problem of learning from unlabeled samples very general
statistical mixture models on large finite sets. Specifically, the model to be
learned, $\vartheta$, is a probability distribution over probability
distributions $p$, where each such $p$ is a probability distribution over
$[n] = \{1, 2, \dots, n\}$. When we sample from $\vartheta$, we do not observe
$p$ directly, but only indirectly and in very noisy fashion, by sampling from
$[n]$ repeatedly, independently $K$ times from the distribution $p$. The
problem is to infer $\vartheta$ to high accuracy in transportation
(earthmover) distance.
We give the first efficient algorithms for learning this mixture model
without making any restricting assumptions on the structure of the distribution
$\vartheta$. We bound the quality of the solution as a function of the size of
the samples $K$ and the number of samples used. Our model and results have
applications to a variety of unsupervised learning scenarios, including
learning topic models and collaborative filtering.
Comment: 23 pages. Preliminary version in the Proceedings of the 47th ACM
Symposium on the Theory of Computing (STOC 2015).
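To make the sampling model concrete, here is a minimal, hypothetical sketch of
the generative process (the function names and the finitely supported
$\vartheta$ are illustrative assumptions, not from the paper): draw a
distribution $p$ from the mixing measure $\vartheta$, then observe $K$ i.i.d.
draws from $p$.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_observation(theta_weights, component_dists, K):
    """One noisy observation: draw p ~ theta, then K i.i.d. samples from p.

    theta_weights: mixing probabilities (theta, assumed finitely supported
        here purely for illustration; the paper allows arbitrary theta).
    component_dists: array of shape (m, n); row j is a distribution p_j on [n].
    K: number of i.i.d. draws from the hidden p.
    """
    j = rng.choice(len(theta_weights), p=theta_weights)  # p ~ theta (unobserved)
    p = component_dists[j]
    return rng.choice(len(p), size=K, p=p)               # K noisy draws from p

# Toy example: theta supported on two distributions over [n] = {0, 1, 2}.
theta = np.array([0.7, 0.3])
ps = np.array([[0.8, 0.1, 0.1],
               [0.1, 0.1, 0.8]])
print(sample_observation(theta, ps, K=5))  # five draws from one hidden p
```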
An invitation to quantum tomography (II)
The quantum state of a light beam can be represented as an infinite
dimensional density matrix or equivalently as a density on the plane called the
Wigner function. We describe quantum tomography as an inverse statistical
problem in which the state is the unknown parameter and the data is given by
results of measurements performed on identical quantum systems. We present
consistency results for Pattern Function Projection Estimators as well as for
Sieve Maximum Likelihood Estimators for both the density matrix of the quantum
state and its Wigner function. Finally we illustrate via simulated data the
performance of the estimators. An EM algorithm is proposed for practical
implementation. There remain many open problems, e.g. rates of convergence,
adaptation, studying other estimators, etc., and a main purpose of the paper is
to bring these to the attention of the statistical community.
Comment: An earlier version of this paper, with more mathematical background
but less applied statistical content, can be found on arXiv as
quant-ph/0303020. An electronic version of the paper with high-resolution
figures (postscript instead of bitmaps) is available from the authors. v2:
added cross-validation results and a reference.
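For readers unfamiliar with the object being estimated, the standard
definition of the Wigner function of a density matrix $\rho$ (in units with
$\hbar = 1$) is
$$ W_\rho(q, p) = \frac{1}{\pi} \int_{-\infty}^{\infty}
   \langle q + x \,|\, \rho \,|\, q - x \rangle \, e^{-2ipx} \, dx. $$
It integrates to one but may take negative values, so it is a density on the
plane only in the quasi-probability sense.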
On the use of reproducing kernel Hilbert spaces in functional classification
The H\'ajek-Feldman dichotomy establishes that two Gaussian measures are
either mutually absolutely continuous with respect to each other (and hence
there is a Radon-Nikodym density for each measure with respect to the other
one) or mutually singular. Unlike the case of finite dimensional Gaussian
measures, there are non-trivial examples of both situations when dealing with
Gaussian stochastic processes. This paper provides:
(a) Explicit expressions for the optimal (Bayes) rule and the minimal
classification error probability in several relevant problems of supervised
binary classification of mutually absolutely continuous Gaussian processes. The
approach relies on some classical results in the theory of Reproducing Kernel
Hilbert Spaces (RKHS).
(b) An interpretation, in terms of mutual singularity, for the "near perfect
classification" phenomenon described by Delaigle and Hall (2012). We show that
the asymptotically optimal rule proposed by these authors can be identified
with the sequence of optimal rules for an approximating sequence of
classification problems in the absolutely continuous case.
(c) A new model-based method for variable selection in binary classification
problems, which arises in a very natural way from the explicit knowledge of the
RN-derivatives and the underlying RKHS structure. Different classifiers might
then be built from the selected variables. In particular, the classical,
linear finite-dimensional Fisher rule turns out to be consistent under some
standard conditions on the underlying functional model.
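As an illustration of (c), the following is a minimal, hypothetical sketch,
not the authors' procedure: discretize the curves on a grid, keep a small set
of selected points (here chosen naively, whereas the paper derives the
selection from the RN-derivatives and the RKHS structure), and apply the
finite-dimensional Fisher linear rule to the selected variables.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Toy functional data: 100 curves observed on a grid of 100 points.
t = np.linspace(0.0, 1.0, 100)
X0 = 0.1 * rng.normal(size=(50, 100)).cumsum(axis=1)      # class 0: Brownian-like
X1 = 0.1 * rng.normal(size=(50, 100)).cumsum(axis=1) + t  # class 1: shifted mean
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 50)

# Hypothetical variable selection: a handful of evenly spaced grid points.
selected = np.linspace(0, 99, 5, dtype=int)

clf = LinearDiscriminantAnalysis().fit(X[:, selected], y)  # Fisher's linear rule
print("training accuracy:", clf.score(X[:, selected], y))
```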
Structural Variability from Noisy Tomographic Projections
In cryo-electron microscopy, the 3D electric potentials of an ensemble of
molecules are projected along arbitrary viewing directions to yield noisy 2D
images. The volume maps representing these potentials typically exhibit a great
deal of structural variability, which is described by their 3D covariance
matrix. Typically, this covariance matrix is approximately low-rank and can be
used to cluster the volumes or estimate the intrinsic geometry of the
conformation space. We formulate the estimation of this covariance matrix as a
linear inverse problem, yielding a consistent least-squares estimator. For
$n$ images of size $N$-by-$N$ pixels, we propose an algorithm for calculating
this covariance estimator with computational complexity
$O(nN^4 + \sqrt{\kappa}\, N^6 \log N)$, where the condition number $\kappa$ is
empirically in the range $1$--$200$. Its efficiency relies on the
observation that the normal equations are equivalent to a deconvolution problem
in 6D. This is then solved by the conjugate gradient method with an appropriate
circulant preconditioner. The result is the first computationally efficient
algorithm for consistent estimation of 3D covariance from noisy projections. It
also compares favorably in runtime with respect to previously proposed
non-consistent estimators. Motivated by the recent success of eigenvalue
shrinkage procedures for high-dimensional covariance matrices, we introduce a
shrinkage procedure that improves accuracy at lower signal-to-noise ratios. We
evaluate our methods on simulated datasets and achieve classification results
comparable to state-of-the-art methods in shorter running time. We also present
results on clustering volumes in an experimental dataset, illustrating the
power of the proposed algorithm for practical determination of structural
variability.
Comment: 52 pages, 11 figures.
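The key computational idea, solving convolutional normal equations by
conjugate gradient with a circulant preconditioner applied through the FFT,
can be illustrated in one dimension. The sketch below is an analogue under
that simplification, not the paper's 6D solver; all names are hypothetical.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
n = 256

# Circular convolution operator A x = k * x, diagonalized by the FFT.
k = np.exp(-0.5 * (np.arange(n) - n / 2) ** 2)    # narrow Gaussian kernel
k /= k.sum()
k_hat = np.real(np.fft.fft(np.fft.ifftshift(k)))  # real, positive eigenvalues

def conv(x):
    return np.real(np.fft.ifft(k_hat * np.fft.fft(x)))

A = LinearOperator((n, n), matvec=conv)

# Circulant preconditioner: (regularized) inverse of A in Fourier space.
def precond(x):
    return np.real(np.fft.ifft(np.fft.fft(x) / (k_hat + 1e-6)))

M = LinearOperator((n, n), matvec=precond)

b = conv(rng.normal(size=n))

def solve(Mop=None):
    count = {"iters": 0}
    def cb(xk):
        count["iters"] += 1
    x, info = cg(A, b, M=Mop, maxiter=1000, callback=cb)
    return count["iters"]

print("CG iterations without preconditioner:", solve())
print("CG iterations with circulant preconditioner:", solve(M))
```

Here the condition number of $A$ is roughly $10^2$, so plain CG needs tens of
iterations while the preconditioned solve converges almost immediately; the
same mechanism is what keeps the 6D deconvolution tractable.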