Hidden Markov Model Identifiability via Tensors
The prevalence of hidden Markov models (HMMs) in various applications of
statistical signal processing and communications is a testament to the power
and flexibility of the model. In this paper, we link the identifiability
problem with tensor decomposition, in particular, the Canonical Polyadic
decomposition. Using recent results in deriving uniqueness conditions for
tensor decomposition, we are able to provide a necessary and sufficient
condition for the identification of the parameters of discrete-time, finite-alphabet
HMMs. This result resolves a long-standing open problem regarding the
derivation of a necessary and sufficient condition for uniquely identifying an
HMM. We then extend recent preliminary work on the identification of
HMMs with multiple observers by deriving necessary and sufficient conditions
for identifiability in this setting.
Comment: Accepted to ISIT 2013. 5 pages, no figures.
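The tensor link can be illustrated numerically: the joint distribution of three consecutive observations is a third-order tensor whose Canonical Polyadic decomposition is indexed by the hidden state at time 2, so the CP rank equals the number of hidden states. A minimal sketch, where the parameters (pi, A, B) are hypothetical numbers and not taken from the paper:

```python
import numpy as np

# Hypothetical small HMM: 2 hidden states, 3 observation symbols.
K, M = 2, 3
pi = np.array([0.6, 0.4])                         # initial distribution
A = np.array([[0.7, 0.3], [0.2, 0.8]])            # A[h, h'] = P(h' | h)
B = np.array([[0.5, 0.3, 0.2], [0.1, 0.3, 0.6]])  # B[h, y] = P(y | h)

# Joint distribution of three consecutive observations, by brute force.
T = np.zeros((M, M, M))
for h1 in range(K):
    for h2 in range(K):
        for h3 in range(K):
            p_path = pi[h1] * A[h1, h2] * A[h2, h3]
            T += p_path * np.einsum('i,j,k->ijk', B[h1], B[h2], B[h3])

# CP form: condition on the hidden state at time 2.
p2 = pi @ A                             # P(h2)
back = (A * pi[:, None]) / p2[None, :]  # back[h1, h2] = P(h1 | h2)
F1 = back.T @ B                         # F1[h2, y1] = P(y1 | h2)
F2 = B                                  # F2[h2, y2] = P(y2 | h2)
F3 = A @ B                              # F3[h2, y3] = P(y3 | h2)

T_cp = np.einsum('h,hi,hj,hk->ijk', p2, F1, F2, F3)
assert np.allclose(T, T_cp)  # CP rank = number of hidden states
```

Uniqueness conditions for this CP decomposition then translate directly into identifiability conditions on (pi, A, B).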
Quantifying causal influences
Many methods for causal inference generate directed acyclic graphs (DAGs)
that formalize causal relations between variables. Given the joint
distribution on all these variables, the DAG contains all information about how
intervening on one variable changes the distribution of the other
variables. However, quantifying the causal influence of one variable on another
one remains a nontrivial question. Here we propose a set of natural, intuitive
postulates that a measure of causal strength should satisfy. We then introduce
a communication scenario, where edges in a DAG play the role of channels that
can be locally corrupted by interventions. Causal strength is then the relative
entropy distance between the old and the new distribution. Many other measures
of causal strength have been proposed, including average causal effect,
transfer entropy, directed information, and information flow. We explain how
they fail to satisfy the postulates even on simple DAGs. Finally,
we investigate the behavior of our measure on time-series, supporting our
claims with experiments on simulated data.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics, DOI: http://dx.doi.org/10.1214/13-AOS1145.
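The edge-corruption measure can be sketched for the simplest DAG, X -> Y: cutting the edge feeds Y an independent copy of X drawn from its marginal, and causal strength is the relative entropy between the original and corrupted joint distributions. All numbers below are hypothetical:

```python
import numpy as np

def kl(p, q):
    """Relative entropy D(p || q) over the same finite space."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Toy DAG X -> Y with binary variables (made-up probabilities).
px = np.array([0.5, 0.5])        # P(X)
py_x = np.array([[0.9, 0.1],     # P(Y | X=0)
                 [0.2, 0.8]])    # P(Y | X=1)

joint = px[:, None] * py_x       # P(X, Y)

# Corrupt the edge X -> Y: Y now receives an independent copy of X
# drawn from its marginal, so Y becomes independent of the true X.
py_corrupt = px @ py_x           # P'(Y) = sum_x P(x) P(y | x)
joint_corrupt = px[:, None] * py_corrupt[None, :]

strength = kl(joint.ravel(), joint_corrupt.ravel())
print(f"causal strength of X -> Y: {strength:.4f} nats")
```

On this two-node DAG the measure coincides with the mutual information I(X; Y); the postulates in the paper concern how such measures behave on richer graphs.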
Causal Discovery and Prediction: Methods and Algorithms
We are not only observers but also actors of reality. Our capability to
intervene and alter the course of some events in the space and time surrounding
us is an essential component of how we build our model of the world. In this
doctoral thesis we introduce a generic a priori assessment of each possible
intervention, in order to select only the most cost-effective interventions
and avoid unnecessary systematic experimentation on the real world. Based on
this a priori assessment, we propose an active learning algorithm that
identifies the causal relations in any given causal model using a least-cost
sequence of interventions. Our algorithm introduces several novel aspects. In
most scenarios, it is able to discard many causal model candidates using
relatively inexpensive interventions that test only one value of the
intervened variables. Moreover, the number of interventions performed by the
algorithm can be bounded by the number of causal model candidates. Hence,
fewer initial candidates (or, equivalently, more prior knowledge) lead to
fewer interventions for causal discovery.
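The candidate-elimination idea can be sketched on a toy problem: each candidate model predicts the outcome of an intervention, the experiment is performed, and inconsistent candidates are discarded, so the loop performs at most as many interventions as there are candidates. Everything here (candidate names, predicted rates, the simulated experiment) is a made-up illustration, not the thesis algorithm itself:

```python
import numpy as np

# Hypothetical candidate structures over two binary variables: each one
# predicts P(Y=1 | do(X=1)). The (hidden) true model is 'x_causes_y'.
candidates = {
    "x_causes_y":  {"y_given_do_x1": 0.8},
    "y_causes_x":  {"y_given_do_x1": 0.5},  # Y unaffected by do(X)
    "independent": {"y_given_do_x1": 0.5},
}

def intervene_do_x1(rng, n=2000):
    """Simulated real-world experiment under the true model X -> Y."""
    return rng.random(n) < 0.8  # Y ~ Bernoulli(0.8) after do(X=1)

rng = np.random.default_rng(1)
n_interventions = 0
surviving = dict(candidates)
while len(surviving) > 1 and n_interventions < len(candidates):
    y = intervene_do_x1(rng)
    n_interventions += 1
    rate = y.mean()
    # Discard candidates whose prediction is far from the observed rate.
    surviving = {name: c for name, c in surviving.items()
                 if abs(c["y_given_do_x1"] - rate) < 0.1}

print(list(surviving), n_interventions)
```

Note how a single cheap intervention, testing only one value of X, eliminates every candidate in which do(X) leaves Y unchanged; with more prior knowledge (fewer initial candidates) even fewer interventions are needed.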
Causality is intimately related to time, as causes appear to precede their
effects. Cyclic causal processes are a particularly interesting case of
causality in relation to time. In this doctoral thesis we introduce a formal
analysis of causal settings that are cyclic in time by defining a causal
analog of the purely observational Dynamic Bayesian Networks, and we provide
a sound and complete algorithm for the identification of causal effects in
the cyclic setting. We identify two types of hidden confounder variables in
this framework, which affect the identification procedures in substantially
different ways, a distinction with no analog in either Dynamic Bayesian
Networks or standard causal graphs.
Comment: PhD Thesis, 101 pages. arXiv admin note: text overlap with arXiv:1610.0555
L.I.E.S. of omission: complex observation processes in ecology
Advances in statistics mean that it is now possible to tackle increasingly sophisticated observation processes. The intricacies and ambitious scale of modern data collection techniques make this essential. Methodological research to make inference about the biological process while accounting for the observation process has expanded dramatically, but solutions are often presented in field-specific terms, limiting our ability to identify commonalities between methods. We suggest a typology of observation processes that could improve translation between fields and aid methodological synthesis. We propose the LIES framework (defining observation processes in terms of issues of Latency, Identifiability, Effort and Scale) and illustrate its use with both simple examples and more complex case studies.
Hyper-Spectral Image Analysis with Partially-Latent Regression and Spatial Markov Dependencies
Hyper-spectral data can be analyzed to recover physical properties at large
planetary scales. This involves resolving inverse problems which can be
addressed within machine learning, with the advantage that, once a relationship
between physical parameters and spectra has been established in a data-driven
fashion, the learned relationship can be used to estimate physical parameters
for new hyper-spectral observations. Within this framework, we propose a
spatially-constrained and partially-latent regression method which maps
high-dimensional inputs (hyper-spectral images) onto low-dimensional responses
(physical parameters such as the local chemical composition of the soil). The
proposed regression model comprises two key features. Firstly, it combines a
Gaussian mixture of locally-linear mappings (GLLiM) with a partially-latent
response model. While the former makes high-dimensional regression tractable,
the latter makes it possible to deal with physical parameters that cannot be observed or,
more generally, with data contaminated by experimental artifacts that cannot be
explained with noise models. Secondly, spatial constraints are introduced in
the model through a Markov random field (MRF) prior which provides a spatial
structure to the Gaussian-mixture hidden variables. Experiments conducted on a
database of remotely sensed observations of Mars collected by the Mars
Express orbiter demonstrate the effectiveness of the proposed model.
Comment: 12 pages, 4 figures, 3 tables.
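The inverse-regression idea behind GLLiM can be sketched by fitting a Gaussian mixture on the joint (parameter, spectrum) vector and reading off the per-component linear conditionals. This is a simplified illustration with synthetic data, not the authors' implementation (which additionally handles the latent response dimensions and the MRF spatial prior):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical toy data: a scalar physical parameter t generates a 5-D
# "spectrum" x through two different locally-linear regimes.
n, D = 1000, 5
t = rng.uniform(0.0, 2.0, size=(n, 1))
W1 = rng.normal(size=(1, D))
W2 = rng.normal(size=(1, D))
x = np.where(t < 1.0, t @ W1, (t - 1.0) @ W2 + W1)  # piecewise linear in t
x += 0.05 * rng.normal(size=x.shape)

# Fit a Gaussian mixture on the joint (t, x): each component captures one
# locally-linear regime, in the spirit of GLLiM's inverse-regression view.
K = 2
gmm = GaussianMixture(n_components=K, covariance_type="full",
                      random_state=0).fit(np.hstack([t, x]))

def predict_t(x_new):
    """E[t | x]: responsibility-weighted mixture of the per-component
    linear conditionals t | x implied by each joint Gaussian."""
    logw = np.empty(K)
    cond = np.empty(K)
    for k in range(K):
        mu, S = gmm.means_[k], gmm.covariances_[k]
        mu_t, mu_x = mu[0], mu[1:]
        S_tx, S_xx = S[0, 1:], S[1:, 1:]
        diff = x_new - mu_x
        sol = np.linalg.solve(S_xx, diff)
        _, logdet = np.linalg.slogdet(S_xx)
        logw[k] = (np.log(gmm.weights_[k])
                   - 0.5 * (D * np.log(2 * np.pi) + logdet + diff @ sol))
        cond[k] = mu_t + S_tx @ sol
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return float(w @ cond)
```

Each mixture component plays the role of one locally-linear map; the high-to-low-dimensional prediction stays tractable because only the small t-block of each conditional is ever inverted against the x-covariance.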