    Bayesian adaptation

    Given the need for low-assumption inferential methods in infinite-dimensional settings, Bayesian adaptive estimation via a prior distribution that depends neither on the regularity of the function to be estimated nor on the sample size is valuable. We elucidate the relationships among the main approaches followed to design priors for minimax-optimal, rate-adaptive estimation, while shedding light on the underlying ideas.
    Comment: 20 pages, Propositions 3 and 5 added
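
    As background for the kind of construction the survey treats (a standard textbook example, not a claim about which approaches the paper compares), one classical way to obtain a prior that depends neither on the regularity nor on the sample size is a hierarchical sieve prior: a hyperprior on the dimension of a finite-dimensional model, and a prior on each model,

        \pi \;=\; \sum_{k \ge 1} \lambda(k)\, \pi_k,
        \qquad \lambda(k) \propto e^{-ck},
        \qquad \pi_k \ \text{a prior on the $k$-dimensional sieve } \Theta_k.

    Under standard prior-mass and testing conditions, the resulting posterior contracts at the minimax rate simultaneously over a range of smoothness levels, which is the rate-adaptivity the abstract refers to.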

    A semi-parametric model for circular data based on mixtures of beta distributions

    This paper introduces a new semi-parametric model for circular data, based on mixtures of shifted, scaled beta (SSB) densities. This model is more general than the Bernstein polynomial density model, which is well known to provide good approximations to any density with finite support, and it is shown that, as for the Bernstein polynomial model, the trigonometric moments of the SSB mixture model can all be derived. Two methods of fitting the SSB mixture model are considered. Firstly, a classical maximum likelihood approach for fitting mixtures of a given number of SSB components is introduced; the Bayesian information criterion is then used for model selection. Secondly, a Bayesian approach using Gibbs sampling is considered, in which the number of mixture components is selected via an appropriate deviance information criterion. Both approaches are illustrated with real data sets, and the results are compared with those obtained using Bernstein polynomials and mixtures of von Mises distributions.
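
    As a concrete reading of the model (a minimal sketch assuming the natural parameterization, a Beta(a, b) density rescaled to [0, 2*pi) and shifted by mu modulo 2*pi; the function and parameter names are illustrative, not taken from the paper):

        import numpy as np
        from scipy.stats import beta

        def ssb_density(theta, mu, a, b):
            # Shifted, scaled beta density on the circle [0, 2*pi):
            # shift by mu (mod 2*pi), rescale to [0, 1], apply Beta(a, b).
            u = np.mod(theta - mu, 2 * np.pi) / (2 * np.pi)
            return beta.pdf(u, a, b) / (2 * np.pi)

        def ssb_mixture_density(theta, weights, mus, shape_a, shape_b):
            # Convex combination of SSB components.
            return sum(w * ssb_density(theta, m, a, b)
                       for w, m, a, b in zip(weights, mus, shape_a, shape_b))

        def bic(log_lik, n_params, n_obs):
            # Bayesian information criterion, used to choose the number
            # of components in the classical (maximum likelihood) fit.
            return -2.0 * log_lik + n_params * np.log(n_obs)

    A classical fit would maximize the mixture log-likelihood for each candidate number of components and keep the model with the smallest BIC, mirroring the first of the two fitting strategies above.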

    How Many Subpopulations is Too Many? Exponential Lower Bounds for Inferring Population Histories

    Reconstruction of population histories is a central problem in population genetics. Existing coalescent-based methods, like the seminal work of Li and Durbin (Nature, 2011), attempt to solve this problem using sequence data but come with no rigorous guarantees. Determining the amount of data needed to correctly reconstruct population histories is a major challenge. Using a variety of tools from information theory, the theory of extremal polynomials, and approximation theory, we prove new, sharp information-theoretic lower bounds on the problem of reconstructing population structure: the history of multiple subpopulations that merge, split, and change sizes over time. Our lower bounds are exponential in the number of subpopulations, even when reconstructing recent histories. We demonstrate the sharpness of our lower bounds by providing algorithms for distinguishing and learning population histories with matching dependence on the number of subpopulations. Along the way, and of independent interest, we essentially determine the optimal number of samples needed to learn an exponential mixture distribution information-theoretically, proving the upper bound by analyzing natural (and efficient) algorithms for this problem.
    Comment: 38 pages, Appeared in RECOMB 2019
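
    The abstract refers to natural and efficient algorithms for learning exponential mixtures; purely as a point of reference for the distribution class in question (a generic EM baseline, not the authors' procedure), a minimal iteration looks like this:

        import numpy as np

        def em_exponential_mixture(x, k, n_iter=200, seed=0):
            # Generic EM for a k-component exponential mixture
            # p(x) = sum_j w_j * lam_j * exp(-lam_j * x),  x >= 0.
            rng = np.random.default_rng(seed)
            lam = 1.0 / rng.uniform(0.5 * x.mean(), 1.5 * x.mean(), size=k)
            w = np.full(k, 1.0 / k)
            for _ in range(n_iter):
                # E-step: posterior responsibility of each component.
                dens = w * lam * np.exp(-np.outer(x, lam))  # shape (n, k)
                resp = dens / dens.sum(axis=1, keepdims=True)
                # M-step: reweight components and refit the rates.
                nk = resp.sum(axis=0)
                w = nk / len(x)
                lam = nk / (resp * x[:, None]).sum(axis=0)
            return w, lam

    The lower bounds in the paper concern how many samples any such estimator needs, not the mechanics of the iteration itself.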

    Nonparametric estimation of the mixing density using polynomials

    We consider the problem of estimating the mixing density f from n i.i.d. observations distributed according to a mixture density with unknown mixing distribution. In contrast with finite mixture models, here the distribution of the hidden variable is not confined to a finite set but is spread out over a given interval. We propose an approach to constructing an orthogonal series estimator of the mixing density f involving Legendre polynomials. The construction of the orthonormal sequence varies from one mixture model to another. Minimax upper and lower bounds on the mean integrated squared error are provided which apply in various contexts. In the specific case of exponential mixtures, it is shown that the estimator is adaptive over a collection of specific smoothness classes: more precisely, there exists a constant A > 0 such that, when the order m of the projection estimator satisfies m ~ A log(n), the estimator achieves the minimax rate over this collection. Other cases are investigated, such as Gamma shape mixtures and scale mixtures of compactly supported densities, including Beta mixtures. Finally, a consistent estimator of the support of the mixing density f is provided.
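
    The orthonormal sequence in the paper is tailored to each mixture kernel, which the abstract does not spell out; the sketch below shows only the generic Legendre projection step on [0, 1] for directly observed data, with the logarithmic truncation m ~ A log(n) quoted above.

        import numpy as np
        from numpy.polynomial.legendre import legval

        def legendre_projection_estimator(x, A=1.0):
            # Orthogonal-series density estimator on [0, 1] with an
            # orthonormal Legendre basis, truncated at m ~ A*log(n).
            n = len(x)
            m = max(1, int(A * np.log(n)))

            def phi(k, u):
                # k-th orthonormal Legendre polynomial on [0, 1]:
                # sqrt(2k + 1) * P_k(2u - 1).
                c = np.zeros(k + 1)
                c[k] = 1.0
                return np.sqrt(2 * k + 1) * legval(2.0 * np.asarray(u) - 1.0, c)

            # Empirical Fourier coefficients c_k = (1/n) * sum_i phi_k(x_i).
            coefs = [phi(k, x).mean() for k in range(m + 1)]
            # Projection estimator f_hat(u) = sum_k c_k * phi_k(u).
            return lambda u: sum(ck * phi(k, u) for k, ck in enumerate(coefs))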

    Learning the Number of Autoregressive Mixtures in Time Series Using the Gap Statistics

    Using a proper model to characterize a time series is crucial for making accurate predictions. In this work we use a time-varying autoregressive (TVAR) process to describe a non-stationary time series, modeling it as a mixture of multiple stable autoregressive (AR) processes. We introduce a new model selection technique based on gap statistics to learn the appropriate number of AR filters needed to model a time series. We define a new distance measure between stable AR filters and draw a reference curve that is used to measure how much adding a new AR filter improves the performance of the model, and then choose the number of AR filters that has the maximum gap with the reference curve. To that end, we propose a new method for generating uniformly random stable AR filters in the root domain. Numerical results are provided demonstrating the performance of the proposed approach.
    Comment: This paper has been accepted by the 2015 IEEE International Conference on Data Mining
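
    The abstract does not spell out the sampling scheme, so the following is one natural reading of "uniformly random stable AR filters in the root domain": draw the roots of the characteristic polynomial uniformly in the unit disk, pairing complex roots with their conjugates (an assumption here) so that the coefficients come out real.

        import numpy as np

        def random_stable_ar(p, rng=None):
            # Stability of an AR(p) filter <=> all roots of its monic
            # characteristic polynomial lie strictly inside the unit disk.
            rng = np.random.default_rng() if rng is None else rng
            roots = []
            n_pairs, n_real = divmod(p, 2)
            for _ in range(n_pairs):
                r = np.sqrt(rng.uniform())    # sqrt => uniform over the disk's area
                phi = rng.uniform(0.0, np.pi)
                z = r * np.exp(1j * phi)
                roots += [z, np.conj(z)]      # conjugate pair => real coefficients
            if n_real:
                roots.append(rng.uniform(-1.0, 1.0))  # leftover real root
            # np.poly maps roots to the monic coefficients of
            # z^p - a_1 z^(p-1) - ... - a_p.
            return -np.poly(roots).real[1:]   # AR coefficients a_1, ..., a_p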

    A Method of Moments for Mixture Models and Hidden Markov Models

    Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm), which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity, which typically scale exponentially with the number of mixture components. This work develops an efficient method-of-moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works; and, because of its simplicity, it offers a viable alternative to EM for practical deployment.
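
    To make the moment idea concrete, here is the standard observable-operator construction for multi-view mixtures (given as an illustration of the technique, not as the paper's exact estimator): with three views that are independent given the hidden component and component-mean matrices of full column rank, the cross moments satisfy M3 @ pinv(M2) = A1 diag(A3^T eta) pinv(A1), so a single eigendecomposition recovers the view-1 means up to scale and permutation.

        import numpy as np

        def spectral_multiview_means(X1, X2, X3, k, seed=0):
            # Views x1, x2, x3 (rows of X1, X2, X3) are independent given
            # the hidden label h, with E[x_v | h = j] = A_v[:, j].
            #   M2 = E[x1 x2^T]            = A1 diag(w) A2^T
            #   M3 = E[<eta, x3> x1 x2^T]  = A1 diag(w) diag(A3^T eta) A2^T
            # hence M3 @ pinv(M2) = A1 diag(A3^T eta) pinv(A1).
            n = len(X1)
            eta = np.random.default_rng(seed).normal(size=X3.shape[1])
            M2 = X1.T @ X2 / n
            M3 = (X1 * (X3 @ eta)[:, None]).T @ X2 / n
            eigvals, eigvecs = np.linalg.eig(M3 @ np.linalg.pinv(M2))
            # Keep the k eigenvectors with the largest |eigenvalue|; the
            # remaining directions are numerical noise when the ambient
            # dimension exceeds the number of components.
            order = np.argsort(-np.abs(eigvals))[:k]
            return np.real(eigvecs[:, order])  # columns ~ E[x1 | h], up to scale

    A random eta makes the k informative eigenvalues distinct with probability one (when the component means are distinct), which is what lets a single eigendecomposition separate the components.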