Optimized Large-Scale CMB Likelihood And Quadratic Maximum Likelihood Power Spectrum Estimation
We revisit the problem of exact CMB likelihood and power spectrum estimation
with the goal of minimizing computational cost through linear compression. This
idea was originally proposed for CMB purposes by Tegmark et al. (1997), and
here we develop it into a fully working computational framework for large-scale
polarization analysis, adopting WMAP as a worked example. We compare five
different linear bases (pixel space, harmonic space, noise covariance
eigenvectors, signal-to-noise covariance eigenvectors and signal-plus-noise
covariance eigenvectors) in terms of compression efficiency, and find that the
computationally most efficient basis is the signal-to-noise eigenvector basis,
which is closely related to the Karhunen-Loeve and Principal Component
transforms, in agreement with previous suggestions. For this basis, the
information in 6836 unmasked WMAP sky map pixels can be compressed into a
smaller set of 3102 modes, with a maximum error increase for any single
multipole of 3.8\%, and a maximum shift of 0.006 in the mean values of a
joint amplitude--tilt distribution. This
compression reduces the computational cost of a single likelihood evaluation by
a factor of 5, from 38 to 7.5 CPU seconds, and it also results in a more robust
likelihood by implicitly regularizing nearly degenerate modes. Finally, we use
the same compression framework to formulate a numerically stable and
computationally efficient variation of the Quadratic Maximum Likelihood
implementation that requires less than 3 GB of memory and 2 CPU minutes per
iteration, rendering low-$\ell$ QML CMB power spectrum analysis fully
tractable on a standard laptop.
Comment: 13 pages, 13 figures, accepted by ApJ
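The signal-to-noise (Karhunen-Loeve) compression described in this abstract can be pictured with a short sketch: the generalized eigenvectors of the signal covariance with respect to the noise covariance define the modes, the data vector is projected onto the highest signal-to-noise modes, and the Gaussian likelihood is then evaluated in the compressed space. The Python snippet below is a minimal illustration of that idea, not the paper's pipeline; the function names, the toy covariances, and the fixed fiducial compression basis are assumptions for demonstration.

```python
import numpy as np
from scipy.linalg import eigh

def sn_compress(d, S, N, n_keep):
    """Project data d onto the n_keep highest signal-to-noise eigenmodes.

    Solves the generalized eigenproblem S v = lambda N v; its eigenvectors
    form the signal-to-noise (Karhunen-Loeve) basis discussed above.
    """
    lam, V = eigh(S, N)                   # eigenvalues in ascending order
    V = V[:, ::-1][:, :n_keep]            # keep the highest-S/N modes
    return V.T @ d, V                     # compressed data and the basis

def compressed_loglike(d_c, V, S_model, N):
    """Gaussian log-likelihood evaluated in the compressed mode space."""
    C = V.T @ (S_model + N) @ V           # compressed signal+noise covariance
    _, logdet = np.linalg.slogdet(C)
    chi2 = d_c @ np.linalg.solve(C, d_c)
    return -0.5 * (chi2 + logdet)

# Toy example (illustrative only): 200 "pixels" compressed to 80 modes.
rng = np.random.default_rng(0)
npix, nmode = 200, 80
A = rng.normal(size=(npix, npix))
S = A @ A.T / npix                        # toy fiducial signal covariance
N = np.diag(rng.uniform(0.5, 2.0, npix)) # toy diagonal noise covariance
d = rng.multivariate_normal(np.zeros(npix), S + N)
d_c, V = sn_compress(d, S, N, nmode)
print(compressed_loglike(d_c, V, S, N))
```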
Sketching for Large-Scale Learning of Mixture Models
Learning parameters from voluminous data can be prohibitive in terms of
memory and computational requirements. We propose a "compressive learning"
framework where we estimate model parameters from a sketch of the training
data. This sketch is a collection of generalized moments of the underlying
probability distribution of the data. It can be computed in a single pass on
the training set, and is easily computable on streams or distributed datasets.
The proposed framework shares similarities with compressive sensing, which aims
at drastically reducing the dimension of high-dimensional signals while
preserving the ability to reconstruct them. To perform the estimation task, we
derive an iterative algorithm analogous to sparse reconstruction algorithms in
the context of linear inverse problems. We exemplify our framework with the
compressive estimation of a Gaussian Mixture Model (GMM), providing heuristics
on the choice of the sketching procedure and theoretical guarantees of
reconstruction. We experimentally show on synthetic data that the proposed
algorithm yields results comparable to the classical Expectation-Maximization
(EM) technique while requiring significantly less memory and fewer computations
when the number of database elements is large. We further demonstrate the
potential of the approach on real large-scale data (over $10^8$ training samples)
for the task of model-based speaker verification. Finally, we draw some
connections between the proposed framework and approximate Hilbert space
embedding of probability distributions using random features. We show that the
proposed sketching operator can be seen as an innovative method to design
translation-invariant kernels adapted to the analysis of GMMs. We also use this
theoretical framework to derive information preservation guarantees, in the
spirit of infinite-dimensional compressive sensing.
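As a concrete illustration of the sketching operator, the snippet below builds an empirical sketch from random Fourier moments (averages of complex exponentials over the samples, computable in a single pass) and the corresponding analytic sketch of a candidate GMM, so the two can be compared. This is a minimal sketch under simplifying assumptions: the isotropic Gaussian frequency distribution, the function names, and the toy data are illustrative, and the paper's actual parameter-recovery algorithm (analogous to sparse reconstruction) is not shown.

```python
import numpy as np

def draw_frequencies(m, d, scale=1.0, rng=None):
    """Draw m random frequency vectors for the sketching operator.

    The paper adapts the frequency distribution to the data scale; a simple
    isotropic Gaussian is used here as a stand-in (an assumption).
    """
    rng = rng or np.random.default_rng()
    return rng.normal(scale=scale, size=(m, d))

def sketch(X, Omega):
    """Empirical sketch: average of complex exponentials over the samples."""
    # X: (n, d) data, Omega: (m, d) frequencies -> (m,) complex sketch
    return np.exp(1j * X @ Omega.T).mean(axis=0)

def gmm_sketch(weights, means, covs, Omega):
    """Sketch of a GMM: its characteristic function at the same frequencies."""
    z = np.zeros(Omega.shape[0], dtype=complex)
    for w, mu, C in zip(weights, means, covs):
        quad = np.einsum('md,de,me->m', Omega, C, Omega)   # omega^T C omega
        z += w * np.exp(1j * Omega @ mu - 0.5 * quad)
    return z

# Toy data from a 2-component GMM in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([-2.0, 0.0], 0.5, size=(500, 2)),
               rng.normal([+2.0, 0.0], 0.5, size=(500, 2))])
Omega = draw_frequencies(m=64, d=2, rng=rng)
z_data = sketch(X, Omega)
z_model = gmm_sketch([0.5, 0.5],
                     [np.array([-2.0, 0.0]), np.array([2.0, 0.0])],
                     [0.25 * np.eye(2)] * 2, Omega)
print(np.linalg.norm(z_data - z_model))   # small residual for the true model
```

In the compressive-learning setting, parameter recovery would then amount to searching for GMM parameters whose analytic sketch best matches the empirical one, in analogy with sparse reconstruction in linear inverse problems.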
Statistical framework for video decoding complexity modeling and prediction
Video decoding complexity modeling and prediction is an increasingly important issue for efficient resource utilization in a variety of applications, including task scheduling, receiver-driven complexity shaping, and adaptive dynamic voltage scaling. In this paper we present a novel view of this problem from a statistical framework perspective. We explore the statistical structure (clustering) of the execution time required by each video decoder module (entropy decoding, motion compensation, etc.) in conjunction with complexity features that are easily extractable at encoding time (representing the properties of each module's input source data). For this purpose, we employ Gaussian mixture models (GMMs) and an expectation-maximization algorithm to estimate the joint execution-time/feature probability density function (PDF). A training set of typical video sequences is used for this purpose in an offline estimation process. The obtained GMM representation is used in conjunction with the complexity features of new video sequences to predict the execution time required for the decoding of these sequences. Several prediction approaches are discussed and compared. The potential mismatch between the training set and new video content is addressed by adaptive online joint-PDF re-estimation. An experimental comparison is performed to evaluate the different approaches and to compare the proposed prediction scheme with related resource-prediction schemes from the literature. The usefulness of the proposed complexity-prediction approaches is demonstrated in an application of rate-distortion-complexity optimized decoding.
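To make the prediction step concrete, the sketch below fits a GMM by EM to joint (feature, execution-time) samples and predicts the decoding time of new content as the conditional mean of the fitted joint PDF. This is a generic GMM-regression illustration, not the paper's exact predictor or feature set; the scikit-learn usage, the function names, and the single synthetic complexity feature are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(features, times, n_components=4, seed=0):
    """Fit a GMM (via EM) to the joint feature/execution-time density."""
    Z = np.column_stack([features, times])
    return GaussianMixture(n_components=n_components,
                           covariance_type='full',
                           random_state=seed).fit(Z)

def predict_time(gmm, x):
    """Predict execution time as the GMM conditional mean E[t | features=x].

    Each component contributes its conditional Gaussian mean, weighted by the
    posterior responsibility of that component given the features x.
    """
    d = x.size                            # feature dimension; last axis is time
    mus, covs, w = gmm.means_, gmm.covariances_, gmm.weights_
    resp, cond_means = [], []
    for k in range(gmm.n_components):
        mu_x, mu_t = mus[k, :d], mus[k, d]
        Sxx, Stx = covs[k, :d, :d], covs[k, d, :d]
        diff = x - mu_x
        # responsibility ~ w_k * N(x; mu_x, Sxx), up to a common constant
        quad = diff @ np.linalg.solve(Sxx, diff)
        resp.append(w[k] * np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(Sxx)))
        cond_means.append(mu_t + Stx @ np.linalg.solve(Sxx, diff))
    resp = np.array(resp) / np.sum(resp)
    return float(resp @ np.array(cond_means))

# Toy example: one complexity feature (e.g. coded bits), noisy linear timing.
rng = np.random.default_rng(2)
feat = rng.uniform(0, 10, 2000)
time = 3.0 * feat + rng.normal(0, 1.0, 2000)
gmm = fit_joint_gmm(feat[:, None], time)
print(predict_time(gmm, np.array([5.0])))   # roughly 15
```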