Diffusion of Context and Credit Information in Markovian Models
This paper studies the ergodicity of transition probability matrices in Markovian models, such as hidden Markov models (HMMs), and how it makes it very difficult to learn to represent long-term context in sequential data. This phenomenon hurts the forward propagation of long-term context information, as well as the learning of a hidden state representation of long-term context, which depends on propagating credit information backwards in time. Using results from Markov chain theory, we show that this problem of diffusion of context and credit is reduced when the transition probabilities approach 0 or 1, i.e., when the transition probability matrices are sparse and the model is essentially deterministic. The results in this paper apply to learning approaches based on continuous optimization, such as gradient descent and the Baum-Welch algorithm.

Comment: See http://www.jair.org/ for any accompanying files
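The diffusion the abstract describes can be seen directly in products of stochastic matrices. Below is a minimal NumPy sketch (not from the paper; the matrices and step counts are illustrative) comparing how quickly two different initial states become indistinguishable under a "soft" ergodic transition matrix versus a near-deterministic one.

```python
# Minimal sketch of context diffusion: repeated multiplication by an ergodic
# transition matrix washes out the initial state, while a near-deterministic
# matrix (entries close to 0 or 1) preserves it much longer.
import numpy as np

def context_decay(A, steps):
    """Total-variation distance between the state distributions obtained
    from two different initial states after `steps` transitions."""
    p = np.array([1.0, 0.0, 0.0])   # start in state 0
    q = np.array([0.0, 1.0, 0.0])   # start in state 1
    for _ in range(steps):
        p = p @ A
        q = q @ A
    return 0.5 * np.abs(p - q).sum()

# Ergodic, "soft" transition probabilities: context diffuses quickly.
A_soft = np.array([[0.4, 0.3, 0.3],
                   [0.3, 0.4, 0.3],
                   [0.3, 0.3, 0.4]])

# Nearly deterministic transitions: context persists.
A_hard = np.array([[0.98, 0.01, 0.01],
                   [0.01, 0.98, 0.01],
                   [0.01, 0.01, 0.98]])

for t in (1, 10, 50):
    print(t, context_decay(A_soft, t), context_decay(A_hard, t))
# The soft matrix's distance collapses toward 0 within a few steps, while the
# near-deterministic one decays far more slowly; the same contraction governs
# how credit information propagates backwards during training.
```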
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, along with a host of more specialized professional network communities, have intensified interest in the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.

Comment: 96 pages, 14 figures, 333 references
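As a concrete anchor for the 1959 starting point mentioned above, here is a minimal sketch, assuming the models in question are the classical Erdős–Rényi/Gilbert G(n, p) random graphs; the sampler, the edge probability, and the graph size are illustrative choices, not taken from the survey.

```python
# Sketch of the G(n, p) random graph: each of the n(n-1)/2 possible edges is
# included independently with probability p.  The maximum-likelihood estimate
# of p is simply the observed edge density.
import numpy as np

def sample_gnp(n, p, rng):
    """Sample a symmetric adjacency matrix from G(n, p), no self-loops."""
    upper = rng.random((n, n)) < p          # i.i.d. Bernoulli(p) trials
    A = np.triu(upper, k=1)                 # keep the upper triangle only
    return (A | A.T).astype(int)            # symmetrize

def estimate_p(A):
    """MLE of the edge probability: observed edges / possible edges."""
    n = A.shape[0]
    return A.sum() / (n * (n - 1))

rng = np.random.default_rng(0)
A = sample_gnp(500, 0.05, rng)
print(estimate_p(A))   # close to 0.05 for a graph of this size
```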
Topological properties of hierarchical networks
Hierarchical networks are attracting renewed interest for modelling the
organization of a number of biological systems and for tackling the complexity
of statistical mechanical models beyond mean-field limitations. Here we
consider the Dyson hierarchical construction for ferromagnets, neural networks
and spin-glasses, recently analyzed from a statistical-mechanics perspective,
and we focus on the topological properties of the underlying structures. In
particular, we find that such structures are weighted graphs that exhibit a high degree of clustering and modularity, with a small spectral gap; the robustness
of such features with respect to link removal is also studied. These outcomes
are then discussed and shown to be fully consistent with the statistical-mechanics scenario. Lastly, we look at these weighted graphs as Markov chains and show that, in the limit of infinite size, the emergence of ergodicity breakdown for the stochastic process mirrors the emergence of metastable states in the corresponding statistical-mechanical analysis.
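A rough illustration of the kind of structure being analyzed: the sketch below builds a hierarchical weighted graph in the spirit of the Dyson construction (the exact weight decay, here 0.25 per level, is an assumption, not the authors' parametrization) and computes the spectral gap of the associated Markov chain, the quantity the abstract ties to ergodicity breakdown.

```python
# Sketch of a Dyson-like hierarchical weighted graph on 2**k nodes: the
# coupling between two nodes decays exponentially with the level of the
# smallest dyadic block containing both of them.
import numpy as np

def dyson_weights(k, decay=0.25):
    """Weighted adjacency with w(i, j) = decay**level(i, j), where level is
    the depth of the smallest block of size 2**level containing i and j."""
    n = 2 ** k
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            level = (i ^ j).bit_length()    # smallest common dyadic block
            W[i, j] = W[j, i] = decay ** level
    return W

def spectral_gap(W):
    """Gap 1 - lambda_2 of the random-walk transition matrix P = D^-1 W."""
    P = W / W.sum(axis=1, keepdims=True)
    eig = np.sort(np.linalg.eigvals(P).real)[::-1]
    return 1.0 - eig[1]

for k in (3, 5, 7):
    print(k, spectral_gap(dyson_weights(k)))
# The gap shrinks as the hierarchy deepens, consistent with the small spectral
# gap and the slow mixing / ergodicity breakdown discussed for the
# infinite-size limit.
```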
Predict or classify: The deceptive role of time-locking in brain signal classification
Several experimental studies claim to be able to predict the outcome of
simple decisions from brain signals measured before subjects are aware of their
decision. Often, these studies use multivariate pattern recognition methods
with the underlying assumption that the ability to classify the brain signal is equivalent to predicting the decision itself. Here we show instead that it is
possible to correctly classify a signal even if it does not contain any
predictive information about the decision. We first define a simple stochastic
model that mimics the random decision process between two equivalent
alternatives, and generate a large number of independent trials that contain no
choice-predictive information. The trials are first time-locked to the time
point of the final event and then classified using standard machine-learning
techniques. The resulting classification accuracy is above chance level long
before the time point of time-locking. We then analyze the same trials using
information theory. We demonstrate that the high classification accuracy is a
consequence of time-locking and that its time behavior is simply related to the
large relaxation time of the process. We conclude that when time-locking is a
crucial step in the analysis of neural activity patterns, both the emergence
and the timing of the classification accuracy are affected by structural
properties of the network that generates the signal.

Comment: 23 pages, 5 figures
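The mechanism can be reproduced with a toy simulation along the lines sketched in the abstract. The sketch below is an assumed stand-in (an unbiased random walk to a symmetric bound, a sign rule instead of a trained classifier, arbitrary threshold and lags), not the authors' model or analysis pipeline.

```python
# Sketch of how time-locking alone yields above-chance "classification":
# drift-free random walks are labelled by whichever boundary they happen to
# hit, trials are aligned to that final event, and the signal value at
# earlier time points is classified.
import numpy as np

rng = np.random.default_rng(1)

def simulate_trial(threshold=30.0, sigma=1.0, max_steps=20000):
    """Unbiased random walk until it crosses +/- threshold.
    Returns the trajectory and the label of the boundary that was hit."""
    x, traj = 0.0, [0.0]
    for _ in range(max_steps):
        x += sigma * rng.standard_normal()
        traj.append(x)
        if abs(x) >= threshold:
            break
    return np.array(traj), int(x > 0)

trials = [simulate_trial() for _ in range(500)]

# Time-lock: take the signal `lag` steps before the boundary crossing and
# classify it with a trivial sign rule (a stand-in for the ML classifier).
for lag in (0, 100, 400, 800):
    usable = [(traj, y) for traj, y in trials if len(traj) > lag]
    acc = np.mean([(traj[-1 - lag] > 0) == y for traj, y in usable])
    print(lag, round(acc, 3))
# Accuracy is near 1 at the event and remains above chance hundreds of steps
# earlier, although the walk carries no choice-predictive information at
# trial onset: the effect reflects the slow relaxation of the process, not
# genuine prediction.
```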
Deep Markov Random Field for Image Modeling
Markov Random Fields (MRFs), a formulation widely used in generative image
modeling, have long been plagued by a lack of expressive power. This issue arises primarily because conventional MRF formulations tend to use
simplistic factors to capture local patterns. In this paper, we move beyond
such limitations, and propose a novel MRF model that uses fully-connected
neurons to express the complex interactions among pixels. Through theoretical
analysis, we reveal an inherent connection between this model and recurrent
neural networks, and from this derive an approximate feed-forward network that
couples multiple RNNs along opposite directions. This formulation combines the
expressive power of deep neural networks and the cyclic dependency structure of
MRF in a unified model, bringing the modeling capability to a new level. The
feed-forward approximation also allows it to be efficiently learned from data.
Experimental results on a variety of low-level vision tasks show notable
improvement over the state of the art.

Comment: Accepted at ECCV 201
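To make the coupling idea concrete, here is a minimal NumPy sketch of running two RNNs along opposite directions over a one-dimensional signal and combining their hidden states; the dimensions, weights, and the reduction to one spatial dimension are illustrative assumptions, not the architecture proposed in the paper.

```python
# Sketch of approximating undirected (MRF-like) dependencies with RNNs run in
# opposite directions and then coupled, reduced to a single image row.
import numpy as np

rng = np.random.default_rng(0)

def rnn_pass(x, Wx, Wh, reverse=False):
    """Simple tanh RNN scan over the sequence axis of x (length, features)."""
    seq = x[::-1] if reverse else x
    h, hidden = np.zeros(Wh.shape[0]), []
    for xt in seq:
        h = np.tanh(Wx @ xt + Wh @ h)
        hidden.append(h)
    hidden = np.stack(hidden)
    return hidden[::-1] if reverse else hidden

length, d_in, d_hid = 32, 3, 8          # e.g. one image row of RGB pixels
x = rng.standard_normal((length, d_in))

Wx = rng.standard_normal((d_hid, d_in)) * 0.1
Wh = rng.standard_normal((d_hid, d_hid)) * 0.1
Wo = rng.standard_normal((d_in, 2 * d_hid)) * 0.1

h_fwd = rnn_pass(x, Wx, Wh, reverse=False)   # left-to-right context
h_bwd = rnn_pass(x, Wx, Wh, reverse=True)    # right-to-left context

# Couple the two directions: every position now sees context from both sides
# in a single feed-forward pass.
y = np.concatenate([h_fwd, h_bwd], axis=1) @ Wo.T
print(y.shape)                               # (32, 3)
```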