Dynamic density estimation with diffusive Dirichlet mixtures
We introduce a new class of nonparametric prior distributions on the space of
continuously varying densities, induced by Dirichlet process mixtures which
diffuse in time. These select time-indexed random functions without jumps,
whose sections are continuous or discrete distributions depending on the choice
of kernel. The construction exploits the widely used stick-breaking
representation of the Dirichlet process and induces the time dependence by
replacing the stick-breaking components with one-dimensional Wright-Fisher
diffusions. These features combine appealing properties of the model, inherited
from the Wright-Fisher diffusions and the Dirichlet mixture structure, with
great flexibility and tractability for posterior computation. The construction
can be easily extended to multi-parameter GEM marginal states, which include,
for example, the Pitman--Yor process. A full inferential strategy is detailed
and illustrated on simulated and real data.

Comment: Published at http://dx.doi.org/10.3150/14-BEJ681 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
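The stick-breaking representation the abstract builds on can be sketched as follows. This is a minimal static sketch of truncated stick-breaking for a Dirichlet process; the paper's time dependence, obtained by replacing the Beta draws with Wright-Fisher diffusions, is not implemented here.

```python
import numpy as np

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Each weight is v_k * prod_{j<k} (1 - v_j) with v_k ~ Beta(1, alpha).
    The paper replaces these static Beta draws with one-dimensional
    Wright-Fisher diffusions to make the weights evolve in time.
    """
    v = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return v * remaining

rng = np.random.default_rng(0)
w = stick_breaking(alpha=2.0, n_atoms=50, rng=rng)
# w is a probability vector up to truncation error: all entries positive,
# summing to slightly less than 1.
```

Pairing these weights with kernel evaluations at random atoms then gives the mixture density at each time point.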
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.

Comment: 61 pages.
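A basic example of why directional data need their own methodology: the arithmetic mean of angles near 0 and near 2π is badly wrong, so exploratory summaries average the unit vectors instead. The function below is an illustrative sketch of these standard circular summaries, not code from the review.

```python
import numpy as np

def circular_mean_and_resultant(theta):
    """Mean direction and mean resultant length for angles in radians.

    Averages the unit vectors (cos t, sin t) rather than the raw angles,
    which is the standard exploratory summary for circular data.
    """
    C, S = np.mean(np.cos(theta)), np.mean(np.sin(theta))
    mean_dir = np.arctan2(S, C)   # mean direction in (-pi, pi]
    R = np.hypot(C, S)            # mean resultant length in [0, 1]
    return mean_dir, R

# Angles clustered around 0, with one value just below 2*pi.
angles = np.array([0.1, 6.2, 0.05, 0.3])
mu, R = circular_mean_and_resultant(angles)
# mu is near 0, as it should be; the naive np.mean(angles) is not.
```

The resultant length R plays the role of a concentration measure: values near 1 indicate tightly clustered directions.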
Mixed Marginal Copula Modeling
This article extends the literature on copulas with discrete or continuous
marginals to the case where some of the marginals are a mixture of discrete and
continuous components. We do so by carefully defining the likelihood as the
density of the observations with respect to a mixed measure. The treatment is
quite general, although we focus on mixtures of Gaussian and Archimedean
copulas. The inference is Bayesian with the estimation carried out by Markov
chain Monte Carlo. We illustrate the methodology and algorithms by applying
them to estimate a multivariate income dynamics model.

Comment: 46 pages, 8 tables and 4 figures.
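The kind of marginal the abstract describes can be illustrated with a toy Gaussian copula draw in which one coordinate is continuous and the other mixes a point mass with a continuous component. The marginals, parameter values, and function names here are hypothetical illustrations, not the paper's model or its Bayesian estimation procedure.

```python
import numpy as np
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def sample_mixed_copula(n, rho, p0, rng):
    """Toy Gaussian copula with one continuous and one mixed marginal.

    X1 is Exponential(1). X2 puts a point mass p0 at zero and is
    Exponential(1) beyond it, so its distribution mixes a discrete and
    a continuous component, as in the paper's setting.
    """
    z1 = rng.standard_normal(n)
    z2 = rho * z1 + sqrt(1.0 - rho**2) * rng.standard_normal(n)
    u1 = np.array([phi(z) for z in z1])     # uniform via probability transform
    u2 = np.array([phi(z) for z in z2])
    x1 = -np.log(1.0 - u1)                  # Exp(1) quantile transform
    cont = -np.log(1.0 - np.clip((u2 - p0) / (1.0 - p0), 0.0, 1.0 - 1e-12))
    x2 = np.where(u2 < p0, 0.0, cont)       # atom at 0 with probability p0
    return x1, x2

rng = np.random.default_rng(0)
x1, x2 = sample_mixed_copula(5000, rho=0.5, p0=0.3, rng=rng)
```

Defining the likelihood for such data requires a density with respect to a mixed (counting plus Lebesgue) measure, which is the technical point the article addresses.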
Identifiability of parameters in latent structure models with many observed variables
While hidden class models of various types arise in many statistical
applications, it is often difficult to establish the identifiability of their
parameters. Focusing on models in which there is some structure of independence
of some of the observed variables conditioned on hidden ones, we demonstrate a
general approach for establishing identifiability utilizing algebraic
arguments. A theorem of J. Kruskal for a simple latent-class model with finite
state space lies at the core of our results, though we apply it to a diverse
set of models. These include mixtures of both finite and nonparametric product
distributions, hidden Markov models and random graph mixture models, and lead
to a number of new results and improvements to old ones. In the parametric
setting, this approach indicates that for such models, the classical definition
of identifiability is typically too strong. Instead generic identifiability
holds, which implies that the set of nonidentifiable parameters has measure
zero, so that parameter inference is still meaningful. In particular, this
sheds light on the properties of finite mixtures of Bernoulli products, which
have been used for decades despite being known to have nonidentifiable
parameters. In the nonparametric setting, we again obtain identifiability only
when certain restrictions are placed on the distributions that are mixed, but
we explicitly describe the conditions.

Comment: Published at http://dx.doi.org/10.1214/09-AOS689 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
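The simplest way to see why the classical definition of identifiability fails for mixture models like these is label swapping: permuting the components gives a different parameter vector with exactly the same observed distribution. The snippet below verifies this for a small, hypothetical mixture of Bernoulli products (generic identifiability, as used in the abstract, is understood up to such permutations).

```python
import numpy as np
from itertools import product

def bernoulli_mixture_pmf(x, pi, theta):
    """p(x) = sum_k pi_k * prod_j theta_kj^x_j * (1 - theta_kj)^(1 - x_j)."""
    x = np.asarray(x)
    comp = np.prod(theta**x * (1.0 - theta)**(1 - x), axis=1)
    return float(pi @ comp)

# Hypothetical parameters: 2 latent classes, 3 binary observed variables.
pi = np.array([0.4, 0.6])
theta = np.array([[0.2, 0.7, 0.9],
                  [0.8, 0.3, 0.1]])

# Reversing the component order is a distinct parameter vector that
# induces exactly the same distribution on the observations.
for x in product([0, 1], repeat=3):
    p_orig = bernoulli_mixture_pmf(x, pi, theta)
    p_swap = bernoulli_mixture_pmf(x, pi[::-1], theta[::-1])
    assert abs(p_orig - p_swap) < 1e-12
```

The paper's contribution concerns the harder question of when nonidentifiability goes beyond this trivial kind, i.e. when the truly nonidentifiable parameter sets have measure zero.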
Kernel Belief Propagation
We propose a nonparametric generalization of belief propagation, Kernel
Belief Propagation (KBP), for pairwise Markov random fields. Messages are
represented as functions in a reproducing kernel Hilbert space (RKHS), and
message updates are simple linear operations in the RKHS. KBP makes none of the
assumptions commonly required in classical BP algorithms: the variables need
not arise from a finite domain or a Gaussian distribution, nor must their
relations take any particular parametric form. Rather, the relations between
variables are represented implicitly, and are learned nonparametrically from
training data. KBP has the advantage that it may be used on any domain where
kernels are defined (Rd, strings, groups), even where explicit parametric
models are not known, or closed form expressions for the BP updates do not
exist. The computational cost of message updates in KBP is polynomial in the
training data size. We also propose a constant time approximate message update
procedure by representing messages using a small number of basis functions. In
experiments, we apply KBP to image denoising, depth prediction from still
images, and protein configuration prediction: KBP is faster than competing
classical and nonparametric approaches (by orders of magnitude, in some cases),
while providing significantly more accurate results.
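The structure of a kernel message update can be sketched schematically: each message is an RKHS function represented by a weight vector over training samples, incoming messages are evaluated via a Gram matrix, combined pointwise, and pushed through a linear operator. This is a toy illustration of that linear-algebraic shape only; the operator `T_ts` below is a placeholder, whereas KBP learns it nonparametrically from training data.

```python
import numpy as np

def rbf_gram(X, Y, gamma=1.0):
    """RBF kernel Gram matrix between sample sets X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :])**2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_message_update(K_t, incoming_betas, T_ts):
    """Schematic kernel BP update for the message from node t to node s.

    Incoming messages (weight vectors beta) are evaluated at t's training
    samples via K_t @ beta, multiplied pointwise, and mapped through the
    linear operator T_ts. The update is thus a chain of matrix products
    whose cost is polynomial in the number of training samples.
    """
    evals = np.ones(K_t.shape[0])
    for beta in incoming_betas:
        evals *= K_t @ beta          # evaluate each incoming message
    return T_ts @ evals              # new outgoing weight vector

rng = np.random.default_rng(1)
X_t = rng.standard_normal((20, 2))   # hypothetical training samples at node t
K_t = rbf_gram(X_t, X_t)
betas = [rng.standard_normal(20) for _ in range(2)]
T_ts = rng.standard_normal((20, 20)) / 20.0   # placeholder operator
beta_new = kernel_message_update(K_t, betas, T_ts)
```

The constant-time approximation mentioned in the abstract corresponds to replacing these full weight vectors with a small number of basis functions.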
Multivariate type G Matérn stochastic partial differential equation random fields
For many applications with multivariate data, random field models capturing
departures from Gaussianity within realisations are appropriate. For this
reason, we formulate a new class of multivariate non-Gaussian models based on
systems of stochastic partial differential equations with additive type G noise
whose marginal covariance functions are of Matérn type. We consider four
increasingly flexible constructions of the noise, where the first two are
similar to existing copula-based models. In contrast to these, the latter two
constructions can model non-Gaussian spatial data without replicates.
Computationally efficient methods for likelihood-based parameter estimation and
probabilistic prediction are proposed, and the flexibility of the suggested
models is illustrated by numerical examples and two statistical applications.
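For reference, the Matérn covariance family targeted by such SPDE constructions has simple closed forms at half-integer smoothness. The sketch below evaluates those cases only; it is the covariance family itself, under a standard parameterisation, not the paper's estimation or prediction code.

```python
import numpy as np

def matern(h, sigma2=1.0, rho=1.0, nu=1.5):
    """Matérn covariance at distance h for nu in {0.5, 1.5, 2.5}.

    Standard half-integer closed forms: nu = 0.5 is the exponential
    covariance; larger nu gives smoother sample paths.
    """
    h = np.abs(np.asarray(h, dtype=float))
    if nu == 0.5:
        return sigma2 * np.exp(-h / rho)
    if nu == 1.5:
        a = np.sqrt(3.0) * h / rho
        return sigma2 * (1.0 + a) * np.exp(-a)
    if nu == 2.5:
        a = np.sqrt(5.0) * h / rho
        return sigma2 * (1.0 + a + a**2 / 3.0) * np.exp(-a)
    raise ValueError("only half-integer nu = 0.5, 1.5, 2.5 implemented")

# Covariance decays with distance and equals sigma2 at h = 0.
c0 = matern(0.0, nu=1.5)
c1 = matern(1.0, nu=1.5)
```

The type G noise in the paper changes the marginal distributions away from Gaussianity while the SPDE keeps these Matérn marginal covariances.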