
    Hidden Markov Model Identifiability via Tensors

    The prevalence of hidden Markov models (HMMs) in various applications of statistical signal processing and communications is a testament to the power and flexibility of the model. In this paper, we link the identifiability problem to tensor decomposition, in particular the Canonical Polyadic decomposition. Using recent results on uniqueness conditions for tensor decomposition, we provide a necessary and sufficient condition for the identification of the parameters of discrete-time, finite-alphabet HMMs. This result resolves a long-standing open problem regarding the derivation of a necessary and sufficient condition for uniquely identifying an HMM. We then extend recent preliminary work on the identification of HMMs with multiple observers by deriving necessary and sufficient conditions for identifiability in this setting. Comment: Accepted to ISIT 2013. 5 pages, no figures
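
    A minimal numerical sketch of the link the abstract describes, with made-up parameters and not the paper's construction or notation: for a small stationary HMM, the joint distribution of three consecutive observations, conditioned on the middle hidden state, factors as a canonical polyadic (CP) decomposition whose rank equals the number of hidden states. Uniqueness conditions for that CP decomposition are what translate into identifiability conditions on the HMM parameters.

```python
import numpy as np

# Toy parameters (assumed, not from the paper): K hidden states, M symbols.
rng = np.random.default_rng(0)
K, M = 2, 3

A = rng.dirichlet(np.ones(K), size=K).T   # A[j, i] = P(h_{t+1} = j | h_t = i)
B = rng.dirichlet(np.ones(M), size=K).T   # B[a, i] = P(x_t = a | h_t = i)

# Stationary distribution of the hidden chain: eigenvector of A for eigenvalue 1.
evals, evecs = np.linalg.eig(A)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

# Joint distribution of three consecutive observations, by brute force over hidden paths:
# T[a, b, c] = sum_{i,j,k} pi[i] A[j,i] A[k,j] B[a,i] B[b,j] B[c,k]
T = np.einsum('i,ji,kj,ai,bj,ck->abc', pi, A, A, B, B, B)

# CP structure: conditioned on the middle hidden state h_2, the three observations
# are independent, so T is a sum of K rank-one terms.
w = A @ pi                       # w[j]    = P(h_2 = j)
U = (B * pi) @ A.T / w           # U[a, j] = P(x_1 = a | h_2 = j)
V = B                            # V[b, j] = P(x_2 = b | h_2 = j)
Z = B @ A                        # Z[c, j] = P(x_3 = c | h_2 = j)
T_cp = np.einsum('j,aj,bj,cj->abc', w, U, V, Z)

print(np.allclose(T, T_cp))      # True: a rank-K CP decomposition of T
```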

    Quantifying causal influences

    Many methods for causal inference generate directed acyclic graphs (DAGs) that formalize causal relations between n variables. Given the joint distribution on all these variables, the DAG contains all information about how intervening on one variable changes the distribution of the other n-1 variables. However, quantifying the causal influence of one variable on another remains a nontrivial question. Here we propose a set of natural, intuitive postulates that a measure of causal strength should satisfy. We then introduce a communication scenario in which edges in a DAG play the role of channels that can be locally corrupted by interventions; causal strength is then the relative entropy distance between the old and the new distribution. Many other measures of causal strength have been proposed, including average causal effect, transfer entropy, directed information, and information flow. We explain how they fail to satisfy the postulates on simple DAGs of at most 3 nodes. Finally, we investigate the behavior of our measure on time series, supporting our claims with experiments on simulated data. Comment: Published at http://dx.doi.org/10.1214/13-AOS1145 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
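
    To make the communication picture concrete, here is a toy sketch, with made-up numbers, of the edge-corruption idea for the simplest DAG X -> Y: cutting the edge feeds Y an independent copy of X, and the strength of the edge is the relative entropy (KL divergence) between the original joint distribution and the post-cutting one. The probability tables below are illustrative, not taken from the paper.

```python
import numpy as np

# Toy distribution for X -> Y (made-up numbers, binary variables).
p_x = np.array([0.3, 0.7])                    # P(X)
p_y_given_x = np.array([[0.9, 0.1],           # P(Y | X = 0)
                        [0.2, 0.8]])          # P(Y | X = 1)
p_xy = p_x[:, None] * p_y_given_x             # original joint P(X, Y)

# "Cut" the edge X -> Y: Y's input is replaced by an independent copy of X
# drawn from P(X), so the post-cutting Y follows sum_x' P(x') P(Y | x'),
# independently of the observed X.
p_y_cut = p_x @ p_y_given_x
p_xy_cut = p_x[:, None] * p_y_cut[None, :]

# Causal strength of the edge: relative entropy between the two joints.
causal_strength = np.sum(p_xy * np.log(p_xy / p_xy_cut))
print(causal_strength)                        # > 0 whenever X actually influences Y
```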

    Causal Discovery and Prediction: Methods and Algorithms

    We are not only observers but also actors of reality. Our capability to intervene and alter the course of events in the space and time around us is an essential component of how we build our model of the world. In this doctoral thesis we introduce a generic a priori assessment of each possible intervention, in order to select only the most cost-effective interventions and avoid unnecessary systematic experimentation on the real world. Based on this a priori assessment, we propose an active learning algorithm that identifies the causal relations in any given causal model using a least-cost sequence of interventions. The algorithm introduces several novel aspects. In most scenarios it is able to discard many causal model candidates using relatively inexpensive interventions that test only one value of the intervened variables. Also, the number of interventions performed by the algorithm can be bounded by the number of causal model candidates; hence, fewer initial candidates (or, equivalently, more prior knowledge) lead to fewer interventions for causal discovery. Causality is intimately related to time, as causes appear to precede their effects, and cyclic causal processes are a particularly interesting case of causality in relation to time. In this thesis we also introduce a formal analysis of cyclic causal settings over time by defining a causal analog of the purely observational Dynamic Bayesian Networks, and we provide a sound and complete algorithm for the identification of causal effects in the cyclic setting. We introduce two types of hidden confounder variables in this framework, which affect the identification procedures in substantially different ways, a distinction with no analog in either Dynamic Bayesian Networks or standard causal graphs. Comment: PhD Thesis, 101 pages. arXiv admin note: text overlap with arXiv:1610.0555
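
    As a purely illustrative sketch of the candidate-elimination loop described above (the representation of candidate models and the selection rule are hypothetical, not the thesis's actual algorithm), candidate causal models can be treated as mappings from interventions to predicted outcomes, with the cheapest still-informative intervention run at each step:

```python
def identify(candidates, interventions, cost, run_experiment):
    """Greedy loop: run the cheapest intervention on which the surviving
    candidate models still disagree, and discard every candidate whose
    prediction contradicts the observed outcome."""
    while len(candidates) > 1:
        informative = [iv for iv in interventions
                       if len({m[iv] for m in candidates}) > 1]
        if not informative:
            break                              # remaining candidates are indistinguishable
        iv = min(informative, key=cost)        # cheapest informative intervention
        outcome = run_experiment(iv)
        candidates = [m for m in candidates if m[iv] == outcome]
    return candidates

# Toy usage: two candidate models ("X causes Y" vs "Y causes X"), each
# represented as a mapping from an intervention to its predicted outcome.
x_causes_y = {"do(X=1)": "Y changes", "do(Y=1)": "X unchanged"}
y_causes_x = {"do(X=1)": "Y unchanged", "do(Y=1)": "X changes"}
ground_truth = x_causes_y

surviving = identify([x_causes_y, y_causes_x],
                     interventions=["do(X=1)", "do(Y=1)"],
                     cost=lambda iv: 1.0,
                     run_experiment=lambda iv: ground_truth[iv])
print(surviving == [x_causes_y])               # only the consistent model survives
```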

    L.I.E.S. of omission: complex observation processes in ecology

    Advances in statistics mean that it is now possible to tackle increasingly sophisticated observation processes, and the intricacies and ambitious scale of modern data collection techniques mean that doing so is now essential. Methodological research on making inference about the biological process while accounting for the observation process has expanded dramatically, but solutions are often presented in field-specific terms, limiting our ability to identify commonalities between methods. We suggest a typology of observation processes that could improve translation between fields and aid methodological synthesis. We propose the LIES framework (defining observation processes in terms of issues of Latency, Identifiability, Effort and Scale) and illustrate its use with both simple examples and more complex case studies.

    Hyper-Spectral Image Analysis with Partially-Latent Regression and Spatial Markov Dependencies

    Hyper-spectral data can be analyzed to recover physical properties at large planetary scales. This involves solving inverse problems, which can be addressed with machine learning: once a relationship between physical parameters and spectra has been established in a data-driven fashion, the learned relationship can be used to estimate physical parameters for new hyper-spectral observations. Within this framework, we propose a spatially-constrained and partially-latent regression method which maps high-dimensional inputs (hyper-spectral images) onto low-dimensional responses (physical parameters such as the local chemical composition of the soil). The proposed regression model has two key features. First, it combines a Gaussian mixture of locally-linear mappings (GLLiM) with a partially-latent response model: the former makes high-dimensional regression tractable, while the latter makes it possible to deal with physical parameters that cannot be observed or, more generally, with data contaminated by experimental artifacts that cannot be explained by noise models. Second, spatial constraints are introduced through a Markov random field (MRF) prior which provides a spatial structure to the Gaussian-mixture hidden variables. Experiments conducted on a database of remotely sensed observations of Mars collected by the Mars Express orbiter demonstrate the effectiveness of the proposed model. Comment: 12 pages, 4 figures, 3 tables
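
    A minimal sketch of the locally-linear-mapping idea, omitting the partially-latent variables and the MRF prior and using synthetic data in place of real hyper-spectral observations: fit a Gaussian mixture on the joint (response, spectrum) space, then predict the low-dimensional response from a high-dimensional spectrum by mixing the per-component conditional Gaussians. Dimensions, function names, and the toy data are illustrative, not the paper's model.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

# Synthetic stand-in data: a 1-D "physical parameter" x mapped nonlinearly
# to a 20-D "spectrum" y (assumed dimensions, for illustration only).
rng = np.random.default_rng(0)
n, d_low, d_high = 2000, 1, 20
x = rng.uniform(-1, 1, size=(n, d_low))
W = rng.normal(size=(d_low, d_high))
y = np.tanh(x) @ W + 0.05 * rng.normal(size=(n, d_high))

# Fit a Gaussian mixture on the joint (x, y) space; each component induces a
# locally-linear relation between x and y.
gmm = GaussianMixture(n_components=5, covariance_type='full',
                      random_state=0).fit(np.hstack([x, y]))

def predict_x(y_new):
    """Posterior-mean estimate of x given a new spectrum y_new, obtained by
    mixing the per-component conditional Gaussians x | y, k."""
    means, covs, weights = gmm.means_, gmm.covariances_, gmm.weights_
    mu_x, mu_y = means[:, :d_low], means[:, d_low:]
    S_xy = covs[:, :d_low, d_low:]
    S_yy = covs[:, d_low:, d_low:]
    # responsibility of each component for y_new (marginalized over x)
    resp = np.array([weights[k] * multivariate_normal(mu_y[k], S_yy[k]).pdf(y_new)
                     for k in range(len(weights))])
    resp /= resp.sum()
    # per-component conditional means E[x | y_new, k]
    cond = np.array([mu_x[k] + S_xy[k] @ np.linalg.solve(S_yy[k], y_new - mu_y[k])
                     for k in range(len(weights))])
    return resp @ cond

print(predict_x(y[0]), x[0])   # the estimate should be close to the true parameter
```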