Dynamic Sampling from a Discrete Probability Distribution with a Known Distribution of Rates
In this paper, we consider a number of efficient data structures for the
problem of sampling from a dynamically changing discrete probability
distribution, where some prior information on the distribution of the rates is
known, in particular the maximum and minimum rate, and where the number of
possible outcomes N is large.
We consider three basic data structures, the Acceptance-Rejection method, the
Complete Binary Tree and the Alias Method. These can be used as building blocks
in a multi-level data structure, where at each of the levels, one of the basic
data structures can be used.
Depending on assumptions on the distribution of the rates of outcomes,
different combinations of the basic structures can be used. We prove that for
particular data structures the expected time of sampling and update is
constant when the rates follow a non-decreasing distribution, a log-uniform
distribution or an inverse polynomial distribution, and show that for any
distribution, an expected time of sampling and update of
$O(\log\log(r_{\max}/r_{\min}))$ is possible, where $r_{\max}$ is the maximum
rate and $r_{\min}$ the minimum rate.
We also present an experimental verification, highlighting the limits imposed
by the constraints of a real-life setting.
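As a concrete illustration of one of these building blocks, the Python sketch below implements the Complete Binary Tree (a sum tree over the rates), which supports sampling and updates in O(log N) time without any assumption on the rate distribution. The class name SumTree and its interface are choices of this sketch, not the paper's code; the multi-level structures described above combine such blocks to reach the stated expected times.

```python
import random

class SumTree:
    """Complete binary tree over N outcome rates.

    One of the basic structures named in the abstract: internal nodes
    store subtree sums, so both rate updates and proportional sampling
    cost O(log N).
    """

    def __init__(self, rates):
        self.n = len(rates)
        self.tree = [0.0] * (2 * self.n)      # leaves live at n .. 2n-1
        for i, r in enumerate(rates):
            self.tree[self.n + i] = r
        for i in range(self.n - 1, 0, -1):    # build internal sums bottom-up
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def update(self, i, rate):
        """Change the rate of outcome i and repair sums up to the root."""
        i += self.n
        self.tree[i] = rate
        i //= 2
        while i >= 1:
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
            i //= 2

    def sample(self):
        """Draw an outcome with probability rate_i / total rate."""
        u = random.random() * self.tree[1]    # tree[1] holds the total rate
        i = 1
        while i < self.n:                     # descend towards a leaf
            i *= 2
            if u > self.tree[i]:              # go right, discount left mass
                u -= self.tree[i]
                i += 1
        return i - self.n

# usage: three outcomes with rates 0.5, 2.0, 1.0; raise the rate of outcome 2
t = SumTree([0.5, 2.0, 1.0])
t.update(2, 4.0)
print(t.sample())   # outcome 2 now drawn with probability 4.0 / 6.5
```

In a multi-level variant, a leaf of such a tree can itself hold one of the other basic structures (Acceptance-Rejection or Alias) over a bucket of outcomes with similar rates.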
k-MLE: A fast algorithm for learning statistical mixture models
We describe k-MLE, a fast and efficient local search algorithm for learning
finite statistical mixtures of exponential families such as Gaussian mixture
models. Mixture models are traditionally learned using the
expectation-maximization (EM) soft clustering technique that monotonically
increases the incomplete (expected complete) likelihood. Given prescribed
mixture weights, the hard clustering k-MLE algorithm iteratively assigns data
to the most likely weighted component and updates the component models using
Maximum Likelihood Estimators (MLEs). Using the duality between exponential
families and Bregman divergences, we prove that the local convergence of the
complete likelihood of k-MLE follows directly from the convergence of a dual
additively weighted Bregman hard clustering. The inner loop of k-MLE can be
implemented using any k-means heuristic, like the celebrated Lloyd's batched
or Hartigan's greedy swap updates. We then show how to update the mixture
weights by minimizing a cross-entropy criterion, which amounts to setting each
weight to the relative proportion of points in its cluster, and reiterate the mixture
parameter update and mixture weight update processes until convergence. Hard EM
is interpreted as a special case of k-MLE when both the component update and
the weight update are performed successively in the inner loop. To initialize
k-MLE, we propose k-MLE++, a careful initialization of k-MLE that
probabilistically guarantees a global bound on the best possible complete
likelihood.
Comment: 31 pages; extends a preliminary paper presented at IEEE ICASSP 201
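To make the alternating structure concrete, here is a minimal Python sketch of hard-assignment k-MLE for a Gaussian mixture with full covariances. It is a generic illustration, not the paper's implementation: initialization is plain random seeding rather than k-MLE++, and names such as k_mle and n_iters are assumptions of this sketch.

```python
import numpy as np

def k_mle(X, k, n_iters=50, seed=0):
    """Hard-assignment k-MLE sketch for a Gaussian mixture.

    Alternates (1) assigning each point to its most likely weighted
    component, (2) MLE updates of the component parameters, and
    (3) weight updates as the relative proportions of cluster points.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # crude initialization: random data points as means (k-MLE++ would be better)
    mu = X[rng.choice(n, k, replace=False)].copy()
    cov = np.stack([np.cov(X.T) + 1e-6 * np.eye(d)] * k)
    w = np.full(k, 1.0 / k)

    for _ in range(n_iters):
        # (1) hard assignment: argmax_j  log w_j + log N(x | mu_j, cov_j)
        ll = np.empty((n, k))
        for j in range(k):
            diff = X - mu[j]
            prec = np.linalg.inv(cov[j])
            maha = np.einsum('ni,ij,nj->n', diff, prec, diff)
            _, logdet = np.linalg.slogdet(cov[j])
            ll[:, j] = np.log(w[j]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
        z = ll.argmax(axis=1)
        # (2) per-cluster MLE and (3) weight update
        for j in range(k):
            pts = X[z == j]
            if len(pts) <= d:        # guard against empty or tiny clusters
                continue
            mu[j] = pts.mean(axis=0)
            cov[j] = np.cov(pts.T, bias=True) + 1e-6 * np.eye(d)
            w[j] = len(pts) / n
        w /= w.sum()
    return w, mu, cov, z
```

Running the weight update inside the inner loop, right after each component update, recovers the hard EM special case mentioned above.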
Hybrid PDE solver for data-driven problems and modern branching
The numerical solution of large-scale PDEs, such as those occurring in
data-driven applications, unavoidably requires powerful parallel computers and
tailored parallel algorithms to make the best possible use of them. In fact,
considerations about the parallelization and scalability of realistic problems
are often critical enough to warrant acknowledgement in the modelling phase.
The purpose of this paper is to spread awareness of the Probabilistic Domain
Decomposition (PDD) method, a fresh approach to the parallelization of PDEs
with excellent scalability properties. The idea exploits the stochastic
representation of the PDE and its approximation via Monte Carlo in combination
with deterministic high-performance PDE solvers. We describe the ingredients of
PDD and its applicability in the scope of data science. In particular, we
highlight recent advances in stochastic representations for nonlinear PDEs
using branching diffusions, which have significantly broadened the scope of
PDD.
We envision this work as a dictionary giving large-scale PDE practitioners
references on the very latest algorithms and techniques of a non-standard, yet
highly parallelizable, methodology at the interface of deterministic and
probabilistic numerical methods. We close this work with an invitation to the
fully nonlinear case and open research questions.
Comment: 23 pages, 7 figures; final SMUR version; to appear in the European
Journal of Applied Mathematics (EJAM).
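As a toy illustration of the PDD idea, the Python sketch below solves the linear two-point problem u'' = f on (0,1): a Feynman-Kac Monte Carlo estimate supplies the solution value at the single interface x = 0.5, after which the two subdomain problems are solved deterministically and independently of each other (hence in parallel, in a realistic setting). All function names are hypothetical, the Euler-Maruyama exit simulation is deliberately crude, and the branching-diffusion representations for nonlinear PDEs mentioned above are not shown.

```python
import numpy as np

def mc_interface_value(x0, f, g0, g1, n_paths=10000, dt=1e-3, seed=0):
    """Estimate u(x0) for u'' = f on (0,1), u(0)=g0, u(1)=g1, via the
    Feynman-Kac formula u(x) = E[g(B_tau)] - E[int_0^tau f(B_s)/2 ds],
    with B a Brownian motion started at x0 and tau its exit time."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_paths):
        x, run = x0, 0.0
        while 0.0 < x < 1.0:
            run += 0.5 * f(x) * dt                  # source-term integral
            x += np.sqrt(dt) * rng.standard_normal()
        total += (g0 if x <= 0.0 else g1) - run
    return total / n_paths                          # crude MC estimate

def fd_subdomain(a, b, ua, ub, f, m=200):
    """Deterministic finite-difference solve of u'' = f on (a,b), using
    the Monte Carlo interface values ua, ub as Dirichlet data."""
    x = np.linspace(a, b, m)
    h = x[1] - x[0]
    A = (np.diag(-2.0 * np.ones(m - 2)) + np.diag(np.ones(m - 3), 1)
         + np.diag(np.ones(m - 3), -1)) / h**2
    rhs = f(x[1:-1])
    rhs[0] -= ua / h**2
    rhs[-1] -= ub / h**2
    return x, np.concatenate([[ua], np.linalg.solve(A, rhs), [ub]])

def f(x):
    return 2.0 * np.ones_like(np.asarray(x, dtype=float))  # u'' = 2, so u = x^2

# one stochastic estimate at the interface, then two independent solves
u_mid = mc_interface_value(0.5, f, 0.0, 1.0)
xs, us = fd_subdomain(0.0, 0.5, 0.0, u_mid, f)
print("interface estimate:", u_mid, "(exact value 0.25)")
print("max error on left subdomain:", np.abs(us - xs**2).max())
```

The key scalability property is visible even in this toy: once the interface values are estimated, the subdomain solves require no communication at all.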
An Information Theoretic Characterization of Channel Shortening Receivers
Optimal detection of data transmitted over a linear channel can always
be implemented through the Viterbi algorithm (VA). However, in many cases of
interest the memory of the channel prohibits application of the VA. A popular
and conceptually simple method in this case, studied since the early 70s, is to
first filter the received signal in order to shorten the memory of the channel,
and then to apply a VA that operates with the shorter memory. We shall refer to
this as a channel shortening (CS) receiver. Although such receivers have been
studied for almost four decades, an information theoretic understanding of what
this simple receiver solution is actually doing has been lacking.
In this paper we will show that an optimized CS receiver is implementing the
chain rule of mutual information, but only up to the shortened memory that the
receiver is operating with. Further, we will show that the tools for analyzing
the ensuing achievable rates from an optimized CS receiver are precisely the
same as those used for analyzing the achievable rates of a minimum mean square
error (MMSE) receiver.
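To illustrate only the structure of a CS receiver, the Python sketch below detects BPSK symbols sent over an ISI channel of memory L = 3 with a Viterbi detector that operates with a shortened target response of memory 1. The shortening here is naive truncation of the channel with no optimized prefilter; the optimized CS receiver analyzed in the paper chooses the filter and target quite differently. All names are illustrative.

```python
import numpy as np

def viterbi_bpsk(y, g):
    """Viterbi detection of BPSK symbols x[k] in {-1,+1} from
    y[k] = sum_i g[i] * x[k-i] + noise, using a (possibly shortened)
    target response g of memory nu = len(g) - 1."""
    nu = len(g) - 1
    n_states = 2 ** nu
    syms = (-1.0, 1.0)
    cost = np.zeros(n_states)
    back = []
    # a state packs the nu most recent symbols, newest symbol in bit 0
    for yk in y:
        new_cost = np.full(n_states, np.inf)
        choice = np.zeros(n_states, dtype=int)
        for s in range(n_states):
            past = [syms[(s >> i) & 1] for i in range(nu)]  # x[k-1..k-nu]
            for b in (0, 1):
                pred = g[0] * syms[b] + sum(g[i + 1] * past[i] for i in range(nu))
                ns = ((s << 1) | b) & (n_states - 1)
                m = cost[s] + (yk - pred) ** 2
                if m < new_cost[ns]:
                    new_cost[ns] = m
                    choice[ns] = (s << 1) | b   # pack predecessor state and bit
        back.append(choice)
        cost = new_cost
    s = int(np.argmin(cost))                    # traceback from best end state
    bits = []
    for choice in reversed(back):
        packed = choice[s]
        bits.append(packed & 1)
        s = packed >> 1
    return np.array([syms[b] for b in reversed(bits)])

# toy run: full channel memory L = 3, but the detector only uses memory 1
rng = np.random.default_rng(1)
h = np.array([0.8, 0.5, 0.3, 0.1])    # full ISI response
x = rng.choice([-1.0, 1.0], size=2000)
y = np.convolve(h, x)[: len(x)] + 0.1 * rng.standard_normal(len(x))
g = h[:2]    # naive shortening by truncation (no optimized prefilter here)
x_hat = viterbi_bpsk(y, g)
print("symbol error rate:", np.mean(x_hat != x))
```

The unmodeled taps h[2], h[3] act as extra noise at the detector, which is exactly the loss an optimized CS design tries to minimize.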
Accelerating delayed-acceptance Markov chain Monte Carlo algorithms
Delayed-acceptance Markov chain Monte Carlo (DA-MCMC) samples from a
probability distribution via a two-stage version of the Metropolis-Hastings
algorithm, combining the target distribution with a "surrogate" (i.e., an
approximate and computationally cheaper version) of said distribution. DA-MCMC
accelerates MCMC sampling in complex applications, while still targeting the
exact distribution. We design a computationally faster, albeit approximate,
DA-MCMC algorithm. We consider parameter inference in a Bayesian setting where
a surrogate likelihood function is introduced in the delayed-acceptance scheme.
When the evaluation of the likelihood function is computationally intensive,
our scheme produces a 2-4 times speed-up, compared to standard DA-MCMC.
However, the acceleration is highly problem dependent. Inference results for
the standard delayed-acceptance algorithm and our approximated version are
similar, indicating that our algorithm can return reliable Bayesian inference.
As a computationally intensive case study, we introduce a novel stochastic
differential equation model for protein folding data.
Comment: 40 pages, 21 figures, 10 tables
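The two-stage mechanism is compact enough to sketch in a few lines of Python. The code below follows the standard delayed-acceptance construction with a symmetric random-walk proposal: a cheap surrogate log-density screens proposals, and a second-stage ratio corrects for the surrogate error so the exact target is preserved. The function da_mh and the toy Gaussian surrogate are assumptions of this sketch, not the paper's code or its protein-folding model.

```python
import numpy as np

def da_mh(log_post, log_surr, theta0, n_iters=5000, step=0.5, seed=0):
    """Delayed-acceptance Metropolis-Hastings (symmetric random walk).

    Stage 1 screens each proposal with the cheap surrogate log_surr;
    only survivors pay for the expensive log_post, and the stage-2
    ratio corrects the surrogate error so the exact target is kept."""
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    lp, ls = log_post(theta), log_surr(theta)
    chain, n_expensive = [theta.copy()], 1
    for _ in range(n_iters):
        prop = theta + step * rng.standard_normal(theta.shape)
        ls_prop = log_surr(prop)
        # stage 1: ordinary MH test against the surrogate only
        if np.log(rng.random()) < ls_prop - ls:
            lp_prop = log_post(prop)          # expensive evaluation
            n_expensive += 1
            # stage 2: correct for the surrogate/target mismatch
            if np.log(rng.random()) < (lp_prop - lp) - (ls_prop - ls):
                theta, lp, ls = prop, lp_prop, ls_prop
        chain.append(theta.copy())
    return np.array(chain), n_expensive

# toy check: exact target N(0,1); an offset, inflated Gaussian stands in
# for a cheap approximation of an expensive likelihood
chain, n_exp = da_mh(lambda t: -0.5 * (t @ t),
                     lambda t: -0.5 * ((t - 0.2) @ (t - 0.2)) / 1.2,
                     theta0=[0.0], n_iters=20000)
print(chain.mean(), chain.std(), n_exp)   # mean ~ 0, sd ~ 1, n_exp << 20001
```

The speed-up comes from n_expensive being much smaller than the chain length; how much smaller depends, as the abstract notes, on how well the surrogate tracks the target.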