Sparse Linear Identifiable Multivariate Modeling

Author: Ricardo Henao
Ole Winther
Publication date: 01/01/2011

In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully Bayesian hierarchy for sparse models using slab and spike priors (two-component delta-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and benchmarked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs as well as or better than LiNGAM with comparable computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear Identifiable Multivariate modeling), and allowing for correlations between latent variables, called CSLIM (Correlated SLIM), for temporal and/or spatial data. The source code and scripts are available from http://cogsys.imm.dtu.dk/slim/. Comment: 45 pages, 17 figures
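
The delta-function variant of the slab and spike prior mentioned in the abstract can be illustrated with a minimal sketch. This is not the SLIM implementation: the function name, the Gaussian slab, and the parameter values are assumptions chosen only to show how such a prior produces exact zeros.

```python
import random


def sample_spike_and_slab(n, inclusion_prob, slab_std, seed=0):
    """Draw n coefficients from a two-component prior (illustrative only).

    With probability 1 - inclusion_prob a coefficient is exactly zero
    (the delta-function "spike"); otherwise it is drawn from a zero-mean
    Gaussian with standard deviation slab_std (the "slab"). The Gaussian
    slab is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < inclusion_prob:
            draws.append(rng.gauss(0.0, slab_std))  # slab: nonzero effect
        else:
            draws.append(0.0)  # spike: coefficient switched off
    return draws


weights = sample_spike_and_slab(n=1000, inclusion_prob=0.2, slab_std=1.0)
# Fraction of coefficients that are exactly zero; close to 1 - inclusion_prob.
sparsity = sum(w == 0.0 for w in weights) / len(weights)
print(f"fraction of exact zeros: {sparsity:.2f}")
```

Because the spike places mass on exactly zero, the proportion of zeros in a draw tracks the prior inclusion probability, which is what makes the resulting factor loadings or network weights sparse.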

arXiv.org e-Print Archive

CiteSeerX

Understanding predictive uncertainty in hydrologic modeling: The challenge of identifying input and structural errors

Author: Benjamin Renard
Dmitri Kavetski
George Kuczera
Mark Thyer
Stewart W. Franks
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 01/01/2009

Meaningful quantification of data and structural uncertainties in conceptual rainfall-runoff modeling is a major scientific and engineering challenge. This paper focuses on the total predictive uncertainty and its decomposition into input and structural components under different inference scenarios. Several Bayesian inference schemes are investigated, differing in the treatment of rainfall and structural uncertainties, and in the precision of the priors describing rainfall uncertainty. Compared with traditional lumped additive error approaches, the quantification of the total predictive uncertainty in the runoff is improved when rainfall and/or structural errors are characterized explicitly. However, the decomposition of the total uncertainty into individual sources is more challenging. In particular, poor identifiability may arise when the inference scheme represents rainfall and structural errors using separate probabilistic models. The inference becomes ill‐posed unless sufficiently precise prior knowledge of data uncertainty is supplied; this ill‐posedness can often be detected from the behavior of the Monte Carlo sampling algorithm. Moreover, the priors on the data quality must also be sufficiently accurate if the inference is to be reliable and support meaningful uncertainty decomposition. Our findings highlight the inherent limitations of inferring inaccurate hydrologic models using rainfall‐runoff data with large unknown errors. Bayesian total error analysis can overcome these problems using independent prior information. The need for deriving independent descriptions of the uncertainties in the input and output data is clearly demonstrated. Benjamin Renard, Dmitri Kavetski, George Kuczera, Mark Thyer, and Stewart W. Franks

The correlation space of Gaussian latent tree models and model selection without fitting

Author: John A. D. Aston
Nathaniel Shiers
Jim Q. Smith
Piotr Zwiernik
Publication date: 11/04/2016

We provide a complete description of possible covariance matrices consistent with a Gaussian latent tree model for any tree. We then present techniques for utilising these constraints to assess whether observed data are compatible with that Gaussian latent tree model. Our method does not require us first to fit such a tree. We demonstrate the usefulness of the inverse-Wishart distribution for performing preliminary assessments of tree-compatibility using semialgebraic constraints. Using results from Drton et al. (2008) we then provide the appropriate moments required for test statistics for assessing adherence to these equality constraints. These are shown to be effective even for small sample sizes and can be easily adjusted to test either the entire model or only certain macrostructures hypothesized within the tree. We illustrate our exploratory tetrad analysis using a linguistic application and our confirmatory tetrad analysis using a biological application. Comment: 15 pages
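
The equality constraints behind a tetrad analysis can be illustrated for the simplest case, a star tree in which four observed variables share one latent parent. The sketch below is not the authors' test procedure: the helper names, loadings, and noise variances are hypothetical, and it only checks that the tetrad differences vanish for a population covariance matrix of that form.

```python
def star_tree_cov(loadings, noise_vars):
    """Covariance of x_i = loadings[i] * h + noise_i with var(h) = 1.

    Off-diagonal entries are loadings[a] * loadings[b]; the diagonal adds
    the independent noise variance. All values here are hypothetical.
    """
    n = len(loadings)
    return [
        [loadings[a] * loadings[b] + (noise_vars[a] if a == b else 0.0)
         for b in range(n)]
        for a in range(n)
    ]


def tetrads(cov, i, j, k, l):
    """Return the three tetrad differences for the quartet (i, j, k, l)."""
    return (
        cov[i][j] * cov[k][l] - cov[i][k] * cov[j][l],
        cov[i][j] * cov[k][l] - cov[i][l] * cov[j][k],
        cov[i][k] * cov[j][l] - cov[i][l] * cov[j][k],
    )


cov = star_tree_cov([0.9, 0.7, 0.5, 0.3], [0.19, 0.51, 0.75, 0.91])
diffs = tetrads(cov, 0, 1, 2, 3)
# All three differences are zero up to floating-point rounding.
print(all(abs(d) < 1e-12 for d in diffs))
```

With sample covariances the differences are only approximately zero, which is why a test statistic with known moments is needed to judge whether the deviation is compatible with the tree.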

arXiv.org e-Print Archive

Warwick Research Archives Portal Repository