Bayesian adaptation
Given the need for low-assumption inferential methods in infinite-dimensional
settings, Bayesian adaptive estimation via a prior distribution that depends
neither on the regularity of the function to be estimated nor on the sample size
is valuable. We elucidate the relationships among the main approaches to
designing priors for minimax-optimal rate-adaptive estimation, while shedding
light on the underlying ideas.
Comment: 20 pages, Propositions 3 and 5 added
A semi-parametric model for circular data based on mixtures of beta distributions
This paper introduces a new, semi-parametric model for circular data, based on mixtures of
shifted, scaled, beta (SSB) densities. This model is more general than the Bernstein polynomial
density model, which is well known to provide good approximations to any density with finite
support, and it is shown that, as for the Bernstein polynomial model, the trigonometric moments of
the SSB mixture model can all be derived.
Two methods of fitting the SSB mixture model are considered. Firstly, a classical, maximum
likelihood approach for fitting mixtures of a given number of SSB components is introduced. The
Bayesian information criterion is then used for model selection. Secondly, a Bayesian approach
using Gibbs sampling is considered. In this case, the number of mixture components is selected
via an appropriate deviance information criterion.
Both approaches are illustrated with real data sets and the results are compared with those
obtained using Bernstein polynomials and mixtures of von Mises distributions.
Keywords: circular data; shifted, scaled, beta distribution; mixture models; Bernstein polynomials
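As a rough illustration of the SSB construction described above: each component is a Beta(a, b) density scaled to the circle's circumference and shifted by an angle mu, and a mixture of such components gives a flexible circular density. The sketch below is an assumption-laden demo, not the authors' fitting procedure; all parameter values are invented.

```python
import math
import numpy as np

def ssb_pdf(theta, mu, a, b):
    """One shifted, scaled beta (SSB) density on the circle: a Beta(a, b)
    density scaled to [0, 2*pi) and shifted by mu (a, b >= 1 assumed here)."""
    u = np.mod(np.asarray(theta, dtype=float) - mu, 2 * np.pi) / (2 * np.pi)
    norm = math.gamma(a) * math.gamma(b) / math.gamma(a + b)  # Beta function B(a, b)
    return u ** (a - 1) * (1 - u) ** (b - 1) / norm / (2 * np.pi)

def ssb_mixture_pdf(theta, weights, mus, as_, bs):
    """Mixture of SSB components; the weights must sum to one."""
    return sum(w * ssb_pdf(theta, mu, a, b)
               for w, mu, a, b in zip(weights, mus, as_, bs))

# Two-component example: the mixture is a valid density on the circle.
grid = np.linspace(0.0, 2 * np.pi, 20001)
f = ssb_mixture_pdf(grid, [0.6, 0.4], [0.5, 3.0], [2.0, 5.0], [2.0, 3.0])
integral = np.mean(f) * 2 * np.pi  # Riemann approximation of the total mass
print(integral)  # ~1
```

The 1/(2*pi) factor is the Jacobian of scaling [0, 1] to the circumference, which is what makes each component integrate to one over the circle.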
How Many Subpopulations is Too Many? Exponential Lower Bounds for Inferring Population Histories
Reconstruction of population histories is a central problem in population
genetics. Existing coalescent-based methods, like the seminal work of Li and
Durbin (Nature, 2011), attempt to solve this problem using sequence data but
have no rigorous guarantees. Determining the amount of data needed to correctly
reconstruct population histories is a major challenge. Using a variety of tools
from information theory, the theory of extremal polynomials, and approximation
theory, we prove new sharp information-theoretic lower bounds on the problem of
reconstructing population structure -- the history of multiple subpopulations
that merge, split and change sizes over time. Our lower bounds are exponential
in the number of subpopulations, even when reconstructing recent histories. We
demonstrate the sharpness of our lower bounds by providing algorithms for
distinguishing and learning population histories with matching dependence on
the number of subpopulations. Along the way and of independent interest, we
essentially determine the optimal number of samples needed to learn an
exponential mixture distribution information-theoretically, proving the upper
bound by analyzing natural (and efficient) algorithms for this problem.
Comment: 38 pages, appeared in RECOMB 201
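The learning problem mentioned at the end, fitting an exponential mixture distribution from samples, can be pictured with a generic EM fit. This is a plain EM heuristic on simulated data, not the algorithm analyzed in the paper, and the rates and weights below are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a two-component exponential mixture (all values are demo assumptions).
w_true, rates_true = np.array([0.7, 0.3]), np.array([1.0, 10.0])
comp = rng.choice(2, size=20000, p=w_true)
x = rng.exponential(1.0 / rates_true[comp])

def em_exp_mixture(x, k=2, iters=300):
    """Plain EM for a k-component exponential mixture."""
    w = np.full(k, 1.0 / k)
    lam = 1.0 / np.quantile(x, np.linspace(0.2, 0.8, k))  # crude spread-out init
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        dens = w * lam * np.exp(-np.outer(x, lam))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: reweight, then refit each exponential rate
        w = resp.mean(axis=0)
        lam = resp.sum(axis=0) / (resp * x[:, None]).sum(axis=0)
    return w, lam

w_hat, lam_hat = em_exp_mixture(x)
order = np.argsort(lam_hat)
print(w_hat[order], lam_hat[order])  # roughly [0.7, 0.3] and [1.0, 10.0]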
Nonparametric estimation of the mixing density using polynomials
We consider the problem of estimating the mixing density from i.i.d.
observations distributed according to a mixture density with unknown mixing
distribution. In contrast with finite mixture models, here the distribution of
the hidden variable is not confined to a finite set but is spread out over a
given interval. We propose an approach to construct an orthogonal series
estimator of the mixing density involving Legendre polynomials. The
construction of the orthonormal sequence varies from one mixture model to
another. Minimax upper and lower bounds of the mean integrated squared error
are provided which apply in various contexts. In the specific case of
exponential mixtures, it is shown that the estimator is adaptive over a
collection of specific smoothness classes; more precisely, there exists a
constant A > 0 such that, when the order of the projection
estimator satisfies a condition depending on A, the estimator achieves the minimax rate
over this collection. Other cases are investigated such as Gamma shape mixtures
and scale mixtures of compactly supported densities including Beta mixtures.
Finally, a consistent estimator of the support of the mixing density is
provided.
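The projection idea behind the estimator can be sketched in the simplest setting: expand a density on [-1, 1] in the orthonormal Legendre basis and estimate each coefficient by an empirical mean. This direct-observation sketch only illustrates the orthogonal-series mechanism; the paper's estimator applies it to the mixing density of indirect (mixed) observations, with a basis that varies by mixture model. The target density and sample size below are invented.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def legendre_series_estimator(sample, order):
    """Projection density estimator on [-1, 1] in the orthonormal Legendre
    basis phi_j(x) = sqrt((2j+1)/2) * P_j(x)."""
    coeffs = []
    for j in range(order + 1):
        # c_j = (1/n) sum phi_j(X_i) is an unbiased estimate of <f, phi_j>
        phi_j = legval(sample, [0.0] * j + [np.sqrt((2 * j + 1) / 2.0)])
        coeffs.append(phi_j.mean())
    def f_hat(x):
        x = np.asarray(x, dtype=float)
        out = np.zeros_like(x)
        for j, c in enumerate(coeffs):
            out += c * legval(x, [0.0] * j + [np.sqrt((2 * j + 1) / 2.0)])
        return out
    return f_hat

# Demo: recover the quadratic density f(x) = 3x^2/2 on [-1, 1] from samples,
# drawn by inverting the CDF F(x) = (x^3 + 1)/2.
rng = np.random.default_rng(1)
u = rng.uniform(-1, 1, 50000)
sample = np.sign(u) * np.abs(u) ** (1 / 3)
f_hat = legendre_series_estimator(sample, order=4)
grid = np.linspace(-0.9, 0.9, 5)
print(f_hat(grid))  # close to 1.5 * grid**2
```

The order of the projection plays the role of the smoothing parameter, which is exactly the quantity the adaptivity result above is about.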
Learning the Number of Autoregressive Mixtures in Time Series Using the Gap Statistics
Using a proper model to characterize a time series is crucial in making
accurate predictions. In this work we use a time-varying autoregressive (TVAR)
process to describe a non-stationary time series, modelling it as a mixture of
multiple stable autoregressive (AR) processes. We introduce a new model
selection technique based on Gap statistics to learn the appropriate number of
AR filters needed to model a time series. We define a new distance measure
between stable AR filters and draw a reference curve that is used to measure
how much adding a new AR filter improves the performance of the model, and then
choose the number of AR filters that has the maximum gap with the reference
curve. To that end, we propose a new method to generate uniform random
stable AR filters in the root domain. Numerical results are provided demonstrating
the performance of the proposed approach.
Comment: This paper has been accepted by the 2015 IEEE International Conference on Data Mining
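The "root domain" generation step can be pictured concretely: a stable AR filter is one whose poles all lie strictly inside the unit circle, so sampling poles uniformly in the open unit disk (with complex poles paired with their conjugates to keep the coefficients real) yields a random stable filter. This is a plausible reading of the construction, not a verified reimplementation of the authors' method.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_stable_ar(p):
    """Draw a random stable AR(p) filter by sampling poles uniformly in the
    open unit disk; complex poles are added with their conjugates so the
    polynomial coefficients come out real."""
    roots = []
    while len(roots) < p:
        # r = sqrt(U) makes the point uniform in area over the disk
        r, phi = np.sqrt(rng.uniform()), rng.uniform(0, 2 * np.pi)
        z = r * np.exp(1j * phi)
        if p - len(roots) >= 2 and abs(z.imag) > 1e-12:
            roots += [z, z.conj()]  # conjugate pair keeps coefficients real
        else:
            roots.append(np.sign(z.real) * abs(z))  # fall back to a real pole
    a = np.real(np.poly(roots))  # coefficients of z^p + a1 z^(p-1) + ... + ap
    return a, np.array(roots)

a, roots = random_stable_ar(4)
print(np.max(np.abs(roots)))  # < 1: all poles strictly inside the unit circle
```

Working in the root domain makes the stability constraint trivial to enforce, whereas sampling AR coefficients directly would require an explicit stability check for each draw.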
A Method of Moments for Mixture Models and Hidden Markov Models
Mixture models are a fundamental tool in applied statistics and machine
learning for treating data taken from multiple subpopulations. The current
practice for estimating the parameters of such models relies on local search
heuristics (e.g., the EM algorithm) which are prone to failure, and existing
consistent methods are unfavorable due to their high computational and sample
complexity which typically scale exponentially with the number of mixture
components. This work develops an efficient method of moments approach to
parameter estimation for a broad class of high-dimensional mixture models with
many components, including multi-view mixtures of Gaussians (such as mixtures
of axis-aligned Gaussians) and hidden Markov models. The new method leads to
rigorous unsupervised learning results for mixture models that were not
achieved by previous works; and, because of its simplicity, it offers a viable
alternative to EM for practical deployment.
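The method-of-moments idea can be seen in miniature with a classic toy case: a mixture of two coins, each sample giving three conditionally independent flips (three "views"). The first three cross-moments determine the parameters in closed form via a Prony-style linear system, with no local search. This is a hand-rolled illustration of the general moment-matching principle, not the paper's multi-view spectral algorithm; all numbers below are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Ground truth for the demo: two coins with biases 0.2 and 0.8, mixed 60/40.
w_true, p_true = 0.6, np.array([0.2, 0.8])

# Each draw: pick a coin, flip it 3 times -> three conditionally independent views.
n = 200_000
comp = rng.choice(2, size=n, p=[w_true, 1 - w_true])
flips = (rng.random((n, 3)) < p_true[comp, None]).astype(float)

# Empirical moments m_k = E[product of k distinct views] = w p1^k + (1-w) p2^k.
m1 = flips[:, 0].mean()
m2 = (flips[:, 0] * flips[:, 1]).mean()
m3 = (flips[:, 0] * flips[:, 1] * flips[:, 2]).mean()

# Prony-style step: p1, p2 satisfy the two-term recurrence
# m_{k+2} = alpha * m_{k+1} + beta * m_k, so they are roots of x^2 - alpha*x - beta.
alpha, beta_ = np.linalg.solve([[m1, 1.0], [m2, m1]], [m2, m3])
p_hat = np.sort(np.roots([1.0, -alpha, -beta_]))
w_hat = (p_hat[1] - m1) / (p_hat[1] - p_hat[0])  # from m1 = w p1 + (1-w) p2

print(p_hat, w_hat)  # close to [0.2, 0.8] and 0.6
```

Everything here is a closed-form function of empirical moments, which is the property that gives such methods their consistency guarantees, in contrast to the local-search behaviour of EM noted in the abstract.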