22,138 research outputs found

Improving Inference of Gaussian Mixtures Using Auxiliary Variables

Author: Andrea Mercatanti
Angrist
Basford
Boldea
Campbell
Dempster
DeSouza
Dietz
Fabrizia Mealli
Fan Li
Freedman
Hawkins
Huber
Ichino
Imbens
Louis
Marin
Mattei
McLachlan
Mealli
Mercatanti
Newton
Richardson
Ripley
West
Publication date: 07/11/2014

Expanding a lower-dimensional problem to a higher-dimensional space and then projecting back is often beneficial. This article rigorously investigates this perspective in the context of finite mixture models, namely how to improve inference for mixture models by using auxiliary variables. Despite the large literature on mixture models and several empirical examples, no previous work gives a general theoretical justification for including auxiliary variables in mixture models, even for special cases. We provide a theoretical basis for comparing inference for multivariate mixture models with the corresponding inference for marginal univariate mixture models. Analytical results for several special cases are established. We show that the probability of correctly allocating mixture memberships and the information number for the means of the primary outcome in a bivariate model with two Gaussian mixtures are generally larger than those in each univariate model. Simulations under a range of scenarios, including misspecified models, are conducted to examine the improvement. The method is illustrated by two real applications in ecology and causal inference.
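
The allocation claim can be illustrated in a textbook setting that is not necessarily the one analysed in the article: two equally weighted Gaussian components with a common, known covariance. Under those assumptions the Bayes rule allocates an observation correctly with probability Φ(Δ/2), where Δ is the Mahalanobis distance between the component means. The sketch below compares Δ for the primary outcome alone with Δ for the (outcome, auxiliary) pair. All function names and numerical values are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mahalanobis_2d(d1, d2, s1, s2, rho):
    """Mahalanobis distance between two component means in 2-D.

    d1, d2 : mean differences for the primary and auxiliary variable
    s1, s2 : common standard deviations; rho : common correlation
    """
    u, v = d1 / s1, d2 / s2  # standardised mean differences
    q = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho * rho)
    return math.sqrt(q)

def correct_allocation_prob(delta):
    """Bayes correct-allocation probability for two equal-weight,
    equal-covariance Gaussian components (assumed setting)."""
    return normal_cdf(delta / 2.0)

# Hypothetical values: the auxiliary variable has no mean separation
# of its own, but it is correlated with the primary outcome.
d_primary, d_aux, rho = 1.0, 0.0, 0.6

delta_uni = abs(d_primary) / 1.0
delta_biv = mahalanobis_2d(d_primary, d_aux, 1.0, 1.0, rho)

p_uni = correct_allocation_prob(delta_uni)
p_biv = correct_allocation_prob(delta_biv)
print(f"univariate: {p_uni:.4f}, bivariate: {p_biv:.4f}")
```

In this simplified setting the bivariate distance is never smaller than the univariate one, so even an auxiliary variable with identical component means can raise the allocation probability through its correlation with the primary outcome.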

arXiv.org e-Print Archive

Crossref

Catalogo dei prodotti della ricerca

Archivio della ricerca- Università di Roma La Sapienza

Constrained probability distributions of correlation functions

Author: Blinnikov
D. Keitel
Fu
Hammarwall
Hartlap
Okumura
P. Schneider
Sato
Schneider
Seljak
Publication venue: 'EDP Sciences'
Publication date: 01/01/2011

Context: Two-point correlation functions are used throughout cosmology as a measure for the statistics of random fields. When used in Bayesian parameter estimation, their likelihood function is usually replaced by a Gaussian approximation. However, this has been shown to be insufficient. Aims: For the case of Gaussian random fields, we search for an exact probability distribution of correlation functions, which could improve the accuracy of future data analyses. Methods: We use a fully analytic approach, first expanding the random field in its Fourier modes, and then calculating the characteristic function. Finally, we derive the probability distribution function using integration by residues. We use a numerical implementation of the full analytic formula to discuss the behaviour of this function. Results: We derive the univariate and bivariate probability distribution function of the correlation functions of a Gaussian random field, and outline how higher joint distributions could be calculated. We give the results in the form of mode expansions, but in one special case we also find a closed-form expression. We calculate the moments of the distribution and, in the univariate case, we discuss the Edgeworth expansion approximation. We also comment on the difficulties in a fast and exact numerical implementation of our results, and on possible future applications. Comment: 13 pages, 5 figures, updated to match version published in A&A (slightly expanded Sects. 5.3 and 6
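
The route from a mode expansion through the characteristic function to a density by residues can be sketched in a deliberately simplified case, which is an assumption of this example and not the paper's formula. Take X = Σ_k c_k E_k with independent unit-exponential E_k (the squared modulus of a complex Gaussian mode is exponentially distributed) and positive, distinct weights c_k. The characteristic function Π_k (1 − i c_k t)^(−1) then has simple poles at t = −i/c_k, and summing the residues gives a mode expansion of the density. Correlation functions at non-zero lag involve weights of both signs, which this sketch does not handle.

```python
import math

def residue_weights(coeffs):
    """Residue weight of each simple pole for positive, distinct c_k:
    A_k = prod_{j != k} 1 / (1 - c_j / c_k)."""
    weights = []
    for k, ck in enumerate(coeffs):
        w = 1.0
        for j, cj in enumerate(coeffs):
            if j != k:
                w /= (1.0 - cj / ck)
        weights.append(w)
    return weights

def density(x, coeffs):
    """Density of sum_k c_k * E_k as a sum over modes (zero for x < 0)."""
    if x < 0.0:
        return 0.0
    weights = residue_weights(coeffs)
    return sum(w / c * math.exp(-x / c) for w, c in zip(weights, coeffs))

# Hypothetical mode weights, chosen only for illustration.
coeffs = [1.0, 0.5, 0.25]
weights = residue_weights(coeffs)

# Sanity checks: total probability and mean from the expansion.
total_mass = sum(weights)
mean = sum(w * c for w, c in zip(weights, coeffs))
print(total_mass, mean)
```

The individual residue weights alternate in sign and grow quickly when two c_k are close, which hints at why a fast and exact numerical evaluation of such expansions can be delicate.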

arXiv.org e-Print Archive

CiteSeerX

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

MPG.PuRe

Pair-copula constructions of multiple dependence

Author: Aas Kjersti
Bakken Henrik
Czado Claudia
Frigessi Arnoldo
Publication date: 01/01/2006

Building on the work of Bedford, Cooke and Joe, we show how multivariate data, which exhibit complex patterns of dependence in the tails, can be modelled using a cascade of pair-copulae, acting on two variables at a time. We use the pair-copula decomposition of a general multivariate distribution and propose a method to perform inference. The model construction is hierarchical in nature, the various levels corresponding to the incorporation of more variables in the conditioning sets, using pair-copulae as simple building blocks. Pair-copula decomposed models also represent a very flexible way to construct higher-dimensional copulae. We apply the methodology to a financial data set. Our approach represents the first step towards developing an unsupervised algorithm that explores the space of possible pair-copula models and that can also be applied to huge data sets automatically.
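
For orientation, the standard three-variable case of such a decomposition can be written out as follows. The notation is an assumption of this note (the abstract does not fix one): f_i and F_i denote marginal densities and distribution functions, and c_{ij} denotes a bivariate copula density.

```latex
f(x_1, x_2, x_3)
  = f_1(x_1)\, f_2(x_2)\, f_3(x_3)
    \cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)
    \cdot c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr)
    \cdot c_{13|2}\bigl(F(x_1 \mid x_2), F(x_3 \mid x_2)\bigr),
\qquad
F(x_1 \mid x_2)
  = \frac{\partial C_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)}{\partial F_2(x_2)} .
```

The first two copula terms form the first level of the hierarchy; the third acts on conditional distributions, so its conditioning set contains one more variable. In general the conditional copula c_{13|2} may itself depend on x_2.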

Open Access LMU

Non-Gaussian Geostatistical Modeling using (skew) t Processes

Author: Bevilacqua M.
Caamaño C.
Morales-Oñate V.
Arellano-Valle R. B.
Publication date: 19/12/2019

We propose a new model for regression and dependence analysis when addressing spatial data with possibly heavy tails and an asymmetric marginal distribution. We first propose a stationary process with t marginals obtained through scale mixing of a Gaussian process with an inverse square root process with Gamma marginals. We then generalize this construction by considering a skew-Gaussian process, thus obtaining a process with skew-t marginal distributions. For the proposed (skew) t process we study the second-order and geometrical properties and, in the t case, we provide analytic expressions for the bivariate distribution. In an extensive simulation study, we investigate the use of the weighted pairwise likelihood as a method of estimation for the t process. Moreover, we compare the performance of the optimal linear predictor of the t process versus the optimal Gaussian predictor. Finally, the effectiveness of our methodology is illustrated by analyzing a georeferenced dataset on maximum temperatures in Australia.
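
The scale-mixing idea can be checked at the level of a single marginal. This is only a sketch: it draws independent values and so ignores the spatial dependence that the paper's processes carry. Assuming Z is standard normal and G is Gamma distributed with shape ν/2 and mean 1, the ratio Z / sqrt(G) follows a Student t distribution with ν degrees of freedom, whose variance is ν/(ν − 2) for ν > 2. Function and variable names are hypothetical.

```python
import math
import random

def sample_t_scale_mixture(nu, n, seed=0):
    """Draw n independent t_nu variates as Z / sqrt(G), where
    Z ~ N(0, 1) and G ~ Gamma(shape=nu/2, scale=2/nu), so E[G] = 1."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        g = rng.gammavariate(nu / 2.0, 2.0 / nu)  # (shape, scale)
        draws.append(z / math.sqrt(g))
    return draws

nu = 10.0
samples = sample_t_scale_mixture(nu, 100_000, seed=1)

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (len(samples) - 1)
theoretical_var = nu / (nu - 2.0)
print(f"sample variance {sample_var:.3f} vs nu/(nu-2) = {theoretical_var:.3f}")
```

Smaller values of ν give heavier tails; as ν grows, G concentrates around 1 and the marginal approaches the Gaussian case.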

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Multivariate type G Matérn stochastic partial differential equation random fields

Author: Bolin David
Wallin Jonas
Publication venue: 'Wiley'
Publication date: 31/12/2019

For many applications with multivariate data, random field models capturing departures from Gaussianity within realisations are appropriate. For this reason, we formulate a new class of multivariate non-Gaussian models based on systems of stochastic partial differential equations with additive type G noise whose marginal covariance functions are of Matérn type. We consider four increasingly flexible constructions of the noise, where the first two are similar to existing copula-based models. In contrast to these, the latter two constructions can model non-Gaussian spatial data without replicates. Computationally efficient methods for likelihood-based parameter estimation and probabilistic prediction are proposed, and the flexibility of the suggested models is illustrated by numerical examples and two statistical applications.
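
As general background rather than the paper's multivariate system, the Matérn covariance function in one common parameterisation is given below. The symbols are assumptions of this note: σ² is the variance, κ > 0 a scale parameter, ν > 0 the smoothness, h the distance between two locations, and K_ν the modified Bessel function of the second kind.

```latex
C(h) = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)} \, (\kappa h)^{\nu} K_{\nu}(\kappa h),
\qquad h > 0, \quad C(0) = \sigma^2 .
```

Two familiar special cases are ν = 1/2, which gives the exponential covariance σ² exp(−κh), and ν = 3/2, which gives σ² (1 + κh) exp(−κh).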

arXiv.org e-Print Archive

Lund University Publications

Detecting spatial patterns with the cumulant function. Part I: The theory

Author: Bernacchia Alberto
Naveau Philippe
Publication venue: 'Copernicus GmbH'
Publication date: 04/07/2007

In climate studies, detecting spatial patterns that largely deviate from the sample mean still remains a statistical challenge. Although a Principal Component Analysis (PCA), or equivalently an Empirical Orthogonal Function (EOF) decomposition, is often applied for this purpose, it can only provide meaningful results if the underlying multivariate distribution is Gaussian. Indeed, PCA is based on optimizing second-order moment quantities, and the covariance matrix can only capture the full dependence structure for multivariate Gaussian vectors. Whenever the application at hand cannot satisfy this normality hypothesis (e.g. precipitation data), alternatives and/or improvements to PCA have to be developed and studied. To go beyond this second-order statistics constraint that limits the applicability of PCA, we take advantage of the cumulant function, which can produce higher-order moment information. This cumulant function, well known in the statistical literature, allows us to propose a new, simple and fast procedure to identify spatial patterns for non-Gaussian data. Our algorithm consists in maximizing the cumulant function. To illustrate our approach, we implement it for three families of multivariate random vectors for which explicit computations can be obtained. In addition, we show that our algorithm corresponds to selecting the directions along which projected data display the largest spread over the marginal probability density tails. Comment: 9 pages, 3 figures
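
A minimal sketch of the idea, under assumptions that the abstract does not confirm: the objective is taken to be the empirical cumulant function K(θ) = log mean exp(θ · (x − mean)), and the maximisation is a brute-force search over unit directions in two dimensions. The function names and the toy data are hypothetical.

```python
import math

def empirical_cumulant(data, direction):
    """Empirical cumulant function of the centred data, evaluated at
    `direction`: log of the sample mean of exp(<x - mean, direction>)."""
    n = len(data)
    dim = len(direction)
    means = [sum(row[i] for row in data) / n for i in range(dim)]
    total = 0.0
    for row in data:
        proj = sum((row[i] - means[i]) * direction[i] for i in range(dim))
        total += math.exp(proj)
    return math.log(total / n)

def best_direction_2d(data, n_angles=360):
    """Grid search over unit vectors in the plane; returns (value, direction)."""
    best = None
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        direction = (math.cos(theta), math.sin(theta))
        value = empirical_cumulant(data, direction)
        if best is None or value > best[0]:
            best = (value, direction)
    return best

# Toy data: the first coordinate has a long right tail, the second is
# symmetric; the two coordinates are independent by construction.
data = [(x, y) for x in (-1.0, -1.0, -1.0, 3.0) for y in (-1.0, 1.0)]
value, direction = best_direction_2d(data)
print(value, direction)
```

Both coordinates of the toy data have zero mean, yet the search picks the positive first axis, the direction of the long tail. For Gaussian data the same objective reduces to a quadratic form in the covariance matrix, so it would recover the leading PCA direction.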

arXiv.org e-Print Archive