Expectation maximization and latent class models
Thesis (M.S.), University of Alaska Fairbanks, 2016.

Latent tree models are tree-structured graphical models in which some random variables are observable while others are latent. These models are used to model data in many areas, such as bioinformatics, phylogenetics, and computer vision, among others. This work provides background on latent tree models and algebraic geometry with the goal of estimating the volume of the latent tree model known as the 3-leaf model M₂ (where the root is a hidden variable with 2 states and is the parent of three observable variables with 2 states) in the probability simplex Δ₇, and of estimating the volume of the latent tree model known as the 3-leaf model M₃ (where the root is a hidden variable with 3 states and is the parent of two observable variables with 3 states and one observable variable with 2 states) in the probability simplex Δ₁₇. For the model M₃, we estimate that roughly 0.015% of distributions arise from stochastic parameters, roughly 64.742% arise from real parameters, and roughly 35.206% arise from complex parameters. We also discuss the algebraic boundary of these models and observe the behavior of the estimates of the Expectation Maximization (EM) algorithm, an iterative method typically used to find a maximum likelihood estimator.

Contents: Chapter 1: Introduction -- 1.1 Chapter Overview -- Chapter 2: Basic Concepts -- 2.1 A statistical model as a geometric object -- 2.2 Varieties -- 2.2.1 Zariski closure -- 2.2.2 Semialgebraic sets -- 2.3 Tensors -- 2.4 Latent tree models -- 2.4.1 Basic graph theory -- 2.4.2 Latent tree model -- Chapter 3: The 3-leaf model -- 3.0.3 Parameter identifiability -- 3.1 Volume of the model in the probability simplex -- 3.1.1 The volume of M₂ in Δ₇ -- 3.1.2 The volume of M₃ in Δ₁₇ -- Chapter 4: The algebraic boundary of M -- 4.1 Algebraic boundary -- 4.2 Boundary strata of M₂ -- 4.3 Boundary strata of M₃ -- Chapter 5: The EM algorithm -- 5.1 Maximum likelihood estimator -- 5.2 The EM algorithm -- 5.3 E-step -- 5.4 M-step -- 5.5 EM estimates -- Chapter 6: Conclusions -- Appendix -- References
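The EM iteration discussed in Chapter 5 can be illustrated with a minimal numpy sketch for the 3-leaf model M₂: a hidden binary root with three binary observable leaves. The function and variable names below (em_3leaf, pi, A, B, C) are illustrative assumptions, not code from the thesis:

```python
import numpy as np

def em_3leaf(counts, n_iter=200, seed=0):
    """EM for the 3-leaf model M2 (hidden binary root, three binary leaves).
    `counts` is a 2x2x2 array of observed frequencies of leaf triples (i, j, k).
    A minimal sketch; parameterization is the standard one for this model."""
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(2))          # pi[h]   = P(H = h), root distribution
    A = rng.dirichlet(np.ones(2), size=2)   # A[h, i] = P(leaf1 = i | H = h)
    B = rng.dirichlet(np.ones(2), size=2)   # B[h, j] = P(leaf2 = j | H = h)
    C = rng.dirichlet(np.ones(2), size=2)   # C[h, k] = P(leaf3 = k | H = h)
    for _ in range(n_iter):
        # E-step: joint P(h, i, j, k) = pi[h] A[h,i] B[h,j] C[h,k]
        joint = (pi[:, None, None, None] * A[:, :, None, None]
                 * B[:, None, :, None] * C[:, None, None, :])
        post = joint / joint.sum(axis=0, keepdims=True)  # P(h | i, j, k)
        w = post * counts[None]                          # expected complete-data counts
        # M-step: closed-form re-estimation from the expected counts
        pi = w.sum(axis=(1, 2, 3)); pi /= pi.sum()
        A = w.sum(axis=(2, 3)); A /= A.sum(axis=1, keepdims=True)
        B = w.sum(axis=(1, 3)); B /= B.sum(axis=1, keepdims=True)
        C = w.sum(axis=(1, 2)); C /= C.sum(axis=1, keepdims=True)
    return pi, A, B, C
```

Each M-step has a closed form because the complete-data likelihood factorizes along the tree; the iteration increases the observed-data likelihood monotonically but may converge to a local rather than global maximizer.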
Algorithms of causal inference for the analysis of effective connectivity among brain regions
In recent years, powerful general algorithms of causal inference have been developed. In particular, in the framework of Pearl's causality, the algorithms of inductive causation (IC and IC*) provide a procedure to determine which causal connections among nodes in a network can be inferred from empirical observations, even in the presence of latent variables, indicating the limits of what can be learned without active manipulation of the system. These algorithms can in principle become important complements to established techniques such as Granger causality and Dynamic Causal Modeling (DCM) for analyzing causal influences (effective connectivity) among brain regions. However, their application to dynamic processes has not yet been examined. Here we study how to apply these algorithms to time-varying signals such as electrophysiological or neuroimaging signals. We propose a new algorithm that combines the basic principles of the previous algorithms with Granger causality to obtain a representation of the causal relations suited to dynamic processes. Furthermore, we use graphical criteria to predict dynamic statistical dependencies between the signals from the causal structure. We show how some problems for causal inference from neural signals (e.g., measurement noise, hemodynamic responses, and time aggregation) can be understood in a general graphical approach. Focusing on the effect of spatial aggregation, we show that when causal inference is performed at a coarser scale than the one at which the neural sources interact, the results depend strongly on the degree of integration of the neural sources aggregated in the signals, and thus characterize the intra-areal properties more than the interactions among regions. We finally discuss how the explicit consideration of latent processes helps to understand Granger causality and DCM, as well as to distinguish functional from effective connectivity.
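Since the proposed algorithm uses Granger causality as a building block, a minimal sketch of the standard pairwise lag-regression statistic it starts from may be useful. This is not the combined algorithm proposed in the paper; the names lagged and granger_stat are illustrative, and the signals are assumed zero-mean (no intercept term):

```python
import numpy as np

def lagged(v, p):
    # design matrix with columns v[t-1], ..., v[t-p] for t = p, ..., len(v)-1
    return np.column_stack([v[p - k:len(v) - k] for k in range(1, p + 1)])

def granger_stat(x, y, p=2):
    """Granger causality from y to x at lag order p: compare the residual
    sum of squares of an AR(p) model for x with and without y's past."""
    target = x[p:]
    X_r = lagged(x, p)                    # restricted model: x's own past only
    X_f = np.hstack([X_r, lagged(y, p)])  # full model: add the past of y

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        return np.sum((target - X @ beta) ** 2)

    # log residual-variance ratio: values > 0 suggest y "Granger-causes" x
    return np.log(rss(X_r) / rss(X_f))
```

The paper's point is that such pairwise statistics must be interpreted through the causal graph, since latent processes, noise, and aggregation can all produce or mask dependencies of this kind.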
The Lazy Flipper: MAP Inference in Higher-Order Graphical Models by Depth-limited Exhaustive Search
This article presents a new search algorithm for the NP-hard problem of
optimizing functions of binary variables that decompose according to a
graphical model. It can be applied to models of any order and structure. The
main novelty is a technique to constrain the search space based on the topology
of the model. When pursued to the full search depth, the algorithm is
guaranteed to converge to a global optimum, passing through a series of
monotonically improving local optima that are guaranteed to be optimal within a
given and increasing Hamming distance. For a search depth of 1, it specializes
to Iterated Conditional Modes. Between these extremes, a useful tradeoff
between approximation quality and runtime is established. Experiments on models
derived from both illustrative and real problems show that approximations found
with limited search depth match or improve those obtained by state-of-the-art
methods based on message passing and linear programming.

Comment: C++ source code available from http://hci.iwr.uni-heidelberg.de/software.ph
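As a rough illustration of depth-limited flipping, the sketch below exhaustively tries all variable subsets up to a given size and accepts improving flips; the actual Lazy Flipper restricts candidate subsets to connected subgraphs of the model via a dedicated data structure, which this sketch omits, and it caches energies rather than recomputing them. At depth=1 the loop reduces to ICM-style coordinate descent. All names are illustrative:

```python
import numpy as np
from itertools import combinations

def energy(x, factors):
    """Energy of a binary labeling x under a decomposed objective.
    `factors` is a list of (variables, table) pairs, where `table` is a
    numpy array indexed by the states of `variables`."""
    return sum(table[tuple(x[v] for v in vs)] for vs, table in factors)

def flip_search(x, factors, depth=1):
    """Depth-limited exhaustive flipping (illustrative sketch): try every
    subset of up to `depth` variables, keep any flip that lowers the
    energy, and repeat until no such subset improves."""
    x = np.asarray(x).copy()
    best = energy(x, factors)
    improved = True
    while improved:
        improved = False
        for k in range(1, depth + 1):
            for subset in combinations(range(len(x)), k):
                for v in subset:
                    x[v] ^= 1                      # flip the subset
                e = energy(x, factors)
                if e < best - 1e-12:
                    best, improved = e, True       # accept improving flip
                else:
                    for v in subset:
                        x[v] ^= 1                  # undo the flip
    return x, best
```

On termination at depth d, the labeling is a local optimum within Hamming distance d, matching the guarantee stated in the abstract.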
Tropical Geometry of Statistical Models
This paper presents a unified mathematical framework for inference in
graphical models, building on the observation that graphical models are
algebraic varieties.
From this geometric viewpoint, observations generated from a model are
coordinates of a point in the variety, and the sum-product algorithm is an
efficient tool for evaluating specific coordinates. The question addressed here
is how the solutions to various inference problems depend on the model
parameters. The proposed answer is expressed in terms of tropical algebraic
geometry. A key role is played by the Newton polytope of a statistical model.
Our results are applied to the hidden Markov model and to the general Markov
model on a binary tree.

Comment: 14 pages, 3 figures. Major revision. Applications now in the companion paper, "Parametric Inference for Biological Sequence Analysis".
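The correspondence between evaluation and inference can be made concrete for the hidden Markov model: the forward recursion evaluates one coordinate of the model (the observation probability), and replacing its sums by maxima tropicalizes the same recursion into Viterbi MAP inference. The sketch below, with assumed parameter names pi, T, E, illustrates this point and is not code from the paper:

```python
import numpy as np

def hmm_eval(obs, pi, T, E, semiring="sum"):
    """Forward recursion for an HMM over the chosen semiring.
    pi: (K,) initial distribution; T: (K, K) transition matrix;
    E: (K, M) emission matrix; obs: sequence of symbol indices.
    semiring="sum" returns P(obs), one polynomial coordinate of the
    model variety; semiring="max" runs the tropicalized (max, *)
    recursion, i.e. the probability of the best hidden path (Viterbi)."""
    acc = np.max if semiring == "max" else np.sum
    alpha = pi * E[:, obs[0]]
    for o in obs[1:]:
        alpha = acc(alpha[:, None] * T, axis=0) * E[:, o]
    return acc(alpha)
```

(Tropical geometry is usually phrased in the (min, +) semiring over negative log-probabilities; the (max, ×) form over probabilities used here is equivalent.)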
Blending Learning and Inference in Structured Prediction
In this paper we derive an efficient algorithm to learn the parameters of
structured predictors in general graphical models. This algorithm blends the
learning and inference tasks, which results in a significant speedup over
traditional approaches, such as conditional random fields and structured
support vector machines. For this purpose we utilize the structures of the
predictors to describe a low dimensional structured prediction task which
encourages local consistencies within the different structures while learning
the parameters of the model. Convexity of the learning task provides the means
to enforce the consistencies between the different parts. The
inference-learning blending algorithm that we propose is guaranteed to converge
to the optimum of the low dimensional primal and dual programs. Unlike many
existing approaches, inference-learning blending allows us to efficiently learn
high-order graphical models over regions of any size and with very large
numbers of parameters. We demonstrate the effectiveness of our approach with
state-of-the-art results in stereo estimation, semantic segmentation, shape
reconstruction, and indoor scene understanding.
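As a loose illustration of interleaving inference with parameter updates (and emphatically not the paper's convex primal-dual blending), the following structured-perceptron-style loop re-runs a cheap inference step after every parameter update instead of solving inference to completion before each learning pass; all names are hypothetical:

```python
import numpy as np

def interleaved_training(examples, n_feats, epochs=5, lr=0.1, seed=0):
    """Toy interleaving of inference and learning.  Each example is
    (feats, y_true), where feats maps each candidate structured labeling
    y to its feature vector, so "inference" is an argmax over candidates
    that is refreshed with the current weights at every step."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=n_feats)
    for _ in range(epochs):
        for feats, y_true in examples:
            scores = {y: w @ f for y, f in feats.items()}
            y_hat = max(scores, key=scores.get)       # inference step
            if y_hat != y_true:                       # learning step
                w += lr * (feats[y_true] - feats[y_hat])
    return w
```

The paper's actual algorithm replaces both steps with updates on a single low dimensional convex program, which is what yields its convergence guarantee and speedup.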
Sparse Median Graphs Estimation in a High Dimensional Semiparametric Model
In this manuscript a unified framework for conducting inference on complex
aggregated data in high dimensional settings is proposed. The data are assumed
to be a collection of multiple non-Gaussian realizations with underlying
undirected graphical structures. Utilizing the concept of median graphs in
summarizing the commonality across these graphical structures, a novel
semiparametric approach to modeling such complex aggregated data is provided
along with robust estimation of the median graph, which is assumed to be
sparse. The estimator is proved to be consistent in graph recovery and an upper
bound on the rate of convergence is given. Experiments on both synthetic and
real datasets are conducted to illustrate the empirical usefulness of the
proposed models and methods.
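The median-graph idea can be sketched naively as an elementwise majority vote over estimated adjacency matrices; the paper's estimator is semiparametric, rank-based, and comes with recovery guarantees, which this toy version (with assumed names median_graph, adjs, tau) does not attempt:

```python
import numpy as np

def median_graph(adjs, tau=0.5):
    """Elementwise-median estimate of the structure shared by several
    graphs.  `adjs` is a list of symmetric 0/1 adjacency matrices, one
    per realization (each obtained by any sparse graph estimator); an
    edge enters the median graph when it appears in more than a
    fraction `tau` of them."""
    stack = np.stack(adjs)            # shape (K, d, d)
    freq = stack.mean(axis=0)         # empirical edge frequency across graphs
    med = (freq > tau).astype(int)    # majority vote = elementwise median at tau=0.5
    np.fill_diagonal(med, 0)
    return med
```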
Foundational principles for large scale inference: Illustrations through correlation mining
When can reliable inference be drawn in the "Big Data" context? This paper
presents a framework for answering this fundamental question in the context of
correlation mining, with implications for general large scale inference. In
large scale data applications like genomics, connectomics, and eco-informatics
the dataset is often variable-rich but sample-starved: a regime where the
number of acquired samples (statistical replicates) is far fewer than the
number of observed variables (genes, neurons, voxels, or chemical
constituents). Much recent work has focused on understanding the
computational complexity of proposed methods for "Big Data." Sample complexity,
however, has received relatively less attention, especially in the setting
where the sample size is fixed and the dimension grows without bound. To
address this gap, we develop a unified statistical framework that explicitly
quantifies the sample complexity of various inferential tasks. Sampling regimes
can be divided into several categories: 1) the classical asymptotic regime
where the variable dimension is fixed and the sample size goes to infinity; 2)
the mixed asymptotic regime where both variable dimension and sample size go to
infinity at comparable rates; 3) the purely high dimensional asymptotic regime
where the variable dimension goes to infinity and the sample size is fixed.
Each regime has its niche but only the latter regime applies to exa-scale data
dimension. We illustrate this high dimensional framework for the problem of
correlation mining, where it is the matrix of pairwise and partial correlations
among the variables that is of interest. We demonstrate various regimes of
correlation mining from the unifying perspective of high dimensional
learning rates and sample complexity for different structured covariance models
and different inference tasks.
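A minimal sketch of the correlation-mining task being analyzed: with n samples of p variables and p >> n, form the sample correlation matrix and report the pairs whose correlation exceeds a threshold rho. The paper's contribution is the theory for how such discoveries behave as p grows with n fixed; the code below, with assumed names, only sets up the task:

```python
import numpy as np

def correlation_screen(X, rho):
    """Screen for variable pairs with |sample correlation| > rho.
    X: (n, p) data matrix, typically sample-starved (p >> n).
    Returns the list of discovered index pairs; choosing rho so that
    these discoveries are reliable is exactly the sample-complexity
    question the paper addresses."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize columns
    n = X.shape[0]
    R = Z.T @ Z / (n - 1)                             # p x p sample correlation matrix
    iu = np.triu_indices_from(R, k=1)                 # upper triangle, no diagonal
    hits = np.abs(R[iu]) > rho
    return list(zip(iu[0][hits], iu[1][hits]))
```

In the purely high dimensional regime, even independent variables produce spuriously large sample correlations, so rho must grow with p at a rate the paper's framework quantifies.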
Collaborative Verification-Driven Engineering of Hybrid Systems
Hybrid systems with both discrete and continuous dynamics are an important
model for real-world cyber-physical systems. The key challenge is to ensure
their correct functioning w.r.t. safety requirements. Promising techniques to
ensure safety seem to be model-driven engineering to develop hybrid systems in
a well-defined and traceable manner, and formal verification to prove their
correctness. Their combination forms the vision of verification-driven
engineering. Often, hybrid systems are rather complex in that they require
expertise from many domains (e.g., robotics, control systems, computer science,
software engineering, and mechanical engineering). Moreover, despite the
remarkable progress in automating formal verification of hybrid systems, the
construction of proofs of complex systems often requires nontrivial human
guidance, since hybrid systems verification tools solve undecidable problems.
It is, thus, not uncommon for development and verification teams to consist of
many players with diverse expertise. This paper introduces a
verification-driven engineering toolset that extends our previous work on
hybrid and arithmetic verification with tools for (i) graphical (UML) and
textual modeling of hybrid systems, (ii) exchanging and comparing models and
proofs, and (iii) managing verification tasks. This toolset makes it easier to
tackle large-scale verification tasks.