High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections
We consider the problem of inferring the interactions between a set of N
binary variables from the knowledge of their frequencies and pairwise
correlations. The inference framework is based on the Hopfield model, a special
case of the Ising model where the interaction matrix is defined through a set
of patterns in the variable space, and is of rank much smaller than N. We show
that Maximum Likelihood inference is deeply related to Principal Component
Analysis when the amplitude of the pattern components, xi, is negligible
compared to N^1/2. Using techniques from statistical mechanics, we calculate
the corrections to the patterns at first order in xi/N^1/2. We stress that
it is important to generalize the Hopfield model and include both attractive
and repulsive patterns, to correctly infer networks with sparse and strong
interactions. We present a simple geometrical criterion to decide how many
attractive and repulsive patterns should be considered as a function of the
sampling noise. We moreover discuss how many sampled configurations are
required for a good inference, as a function of the system size, N and of the
amplitude, xi. The inference approach is illustrated on synthetic and
biological data.
Comment: Physical Review E: Statistical, Nonlinear, and Soft Matter Physics (2011), to appear.
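The connection to PCA can be illustrated with a toy reconstruction (a schematic sketch, not the paper's corrected estimator; the data-generating process, noise level, and sample size below are arbitrary assumptions): the leading eigenvector of the sample correlation matrix recovers a single planted pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 20, 5000
xi = rng.choice([-1.0, 1.0], size=N)                 # planted pattern (assumption)

# Each sample aligns or anti-aligns with the pattern, then 35% of spins flip.
sign = rng.choice([-1.0, 1.0], size=(M, 1))
flip = np.where(rng.random((M, N)) < 0.35, -1.0, 1.0)
s = sign * flip * xi

C = np.cov(s, rowvar=False)                          # sample correlation matrix
w, v = np.linalg.eigh(C)                             # eigenvalues in ascending order

# Leading-order inference: the top eigenvector estimates the attractive pattern;
# eigenvalues below the noise bulk would signal repulsive patterns.
xi_hat = v[:, -1] * np.sqrt(N)
overlap = abs(xi_hat @ xi) / (np.linalg.norm(xi_hat) * np.linalg.norm(xi))
print(overlap)                                       # close to 1 at this noise level
```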
The Bethe approximation for solving the inverse Ising problem: a comparison with other inference methods
The inverse Ising problem consists in inferring the coupling constants of an
Ising model given the correlation matrix. The fastest methods for solving this
problem are based on mean-field approximations, but which one performs better
in the general case is still not completely clear. In the first part of this
work, I summarize the formulas for several mean-field approximations and I
derive new analytical expressions for the Bethe approximation, which allow one
to solve the inverse Ising problem without running the Susceptibility
Propagation algorithm (thus avoiding its lack of convergence). In the second part, I
compare the accuracy of different mean field approximations on several models
(diluted ferromagnets and spin glasses) defined on random graphs and regular
lattices, showing which one is in general more effective. A simple improvement
over these approximations is proposed. A fundamental limitation is also found
in using methods based on the TAP and Bethe approximations in the presence of
an external field.
Comment: v3: strongly revised version with new methods and results, 25 pages, 21 figures.
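For context, the simplest member of the family compared here, the naive mean-field inversion J = -(C^{-1}) off the diagonal with h_i = atanh(m_i) - sum_j J_ij m_j, fits in a few lines. The small model below is a hypothetical example with weak couplings; the Bethe formulas derived in the paper are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5
# Hypothetical weak couplings and fields for a small Ising model (illustration only).
J_true = 0.15 * (rng.random((N, N)) - 0.5)
J_true = (J_true + J_true.T) / 2
np.fill_diagonal(J_true, 0.0)
h_true = 0.1 * (rng.random(N) - 0.5)

# Exact moments by enumerating all 2^N spin configurations.
configs = np.array([[1 if (k >> i) & 1 else -1 for i in range(N)]
                    for k in range(2 ** N)], float)
E = -0.5 * np.einsum('ki,ij,kj->k', configs, J_true, configs) - configs @ h_true
p = np.exp(-(E - E.min())); p /= p.sum()
m = p @ configs                                         # magnetizations <s_i>
C = (configs * p[:, None]).T @ configs - np.outer(m, m) # connected correlations

# Naive mean-field inversion: J = -C^{-1} off the diagonal,
# h_i = atanh(m_i) - sum_j J_ij m_j.
J_nmf = -np.linalg.inv(C)
np.fill_diagonal(J_nmf, 0.0)
h_nmf = np.arctanh(m) - J_nmf @ m

err = np.max(np.abs(J_nmf - J_true))
print(err)          # small compared with the couplings themselves
```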
Intrinsic limitations of inverse inference in the pairwise Ising spin glass
We analyze the limits inherent to the inverse reconstruction of a pairwise
Ising spin glass based on susceptibility propagation. We establish the
conditions under which the susceptibility propagation algorithm is able to
reconstruct the characteristics of the network given first- and second-order
local observables, evaluate the errors induced by various types of noise in
the observed data, and discuss the scaling of the problem with the number of
degrees of freedom.
On the criticality of inferred models
Advanced inference techniques allow one to reconstruct the pattern of
interactions from high-dimensional data sets. We focus here on the statistical
properties of inferred models and argue that inference procedures are likely to
yield models which are close to a phase transition. On the one hand, we show
that the reparameterization-invariant metric on the space of probability
distributions of these models (the Fisher information) is directly related to
the model's susceptibility. As a result, distinguishable models tend to
accumulate close to critical points, where the susceptibility diverges in
infinite systems. On the other hand, this region is the one where the estimates
of the inferred parameters are most stable. To illustrate these points, we
discuss inference of interacting point processes with an application to
financial data, and show that sensible choices of the observation time-scale
naturally yield models which are close to criticality.
Comment: 6 pages, 2 figures; version to appear in JSTAT.
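The identity behind the first claim, that for p(s) ∝ exp(-βE(s)) the Fisher information with respect to β equals Var(E), can be checked on a toy model. Below, a Curie-Weiss (fully connected) Ising sketch, where exact sums over states collapse onto the total magnetization; the model choice and system size are illustrative assumptions.

```python
import numpy as np
from math import comb

# Curie-Weiss Ising model: the energy depends only on the total
# magnetization M, so exact sums over 2^N states reduce to N+1 terms.
N = 64
M = np.arange(-N, N + 1, 2)
mult = np.array([comb(N, (N + m) // 2) for m in M], dtype=float)  # degeneracies
E = -M.astype(float) ** 2 / (2 * N)          # Curie-Weiss energy with J = 1

def fisher_info(beta):
    # For p(s) ∝ exp(-beta * E(s)), the Fisher information w.r.t. beta is Var(E).
    logw = np.log(mult) - beta * E
    w = np.exp(logw - logw.max())
    w /= w.sum()
    mean = w @ E
    return w @ (E - mean) ** 2

betas = np.linspace(0.2, 2.0, 91)
b_star = betas[int(np.argmax([fisher_info(b) for b in betas]))]
print(b_star)   # maximum near the critical coupling beta_c = 1, up to finite-size shifts
```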
U.S. stock market interaction network as learned by the Boltzmann Machine
We study the historical dynamics of the joint equilibrium distribution of
stock returns in the U.S. stock market using a Boltzmann distribution model
parametrized by external fields and pairwise couplings. Within the Boltzmann
learning framework for statistical inference, we analyze the historical
behavior of the parameters inferred using exact and approximate learning
algorithms. Since the model and inference methods require binary variables, the
effect of mapping continuous returns to the discrete domain is studied. The
presented analysis shows that binarization preserves the market correlation
structure. Properties of the distributions of external fields and couplings, as
well as the industry-sector clustering structure, are studied for different
historical dates and moving-window sizes. We find that a heavy positive tail in
the distribution of couplings is responsible for the sparse market clustering
structure. We also show that discrepancies between the model parameters might
be used as a precursor of financial instabilities.
Comment: 15 pages, 17 figures, 1 table.
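Boltzmann learning itself is simple to sketch for a system small enough to enumerate exactly (the 4-variable target model below is a made-up stand-in, not market data): gradient ascent on the log-likelihood matches the model's moments to the data moments.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
# Hypothetical target fields and couplings standing in for "data" moments.
J_t = 0.3 * (rng.random((N, N)) - 0.5); J_t = (J_t + J_t.T) / 2
np.fill_diagonal(J_t, 0.0)
h_t = 0.2 * (rng.random(N) - 0.5)

# All 2^N configurations of +/-1 variables.
S = np.array([[1 if (k >> i) & 1 else -1 for i in range(N)]
              for k in range(2 ** N)], float)

def moments(h, J):
    E = -0.5 * np.einsum('ki,ij,kj->k', S, J, S) - S @ h
    p = np.exp(-(E - E.min())); p /= p.sum()
    return p @ S, (S * p[:, None]).T @ S          # <s_i>, <s_i s_j>

m_data, c_data = moments(h_t, J_t)

# Exact Boltzmann learning: ascend the log-likelihood until moments match.
h, J, lr = np.zeros(N), np.zeros((N, N)), 0.3
for _ in range(3000):
    m, c = moments(h, J)
    h += lr * (m_data - m)
    J += lr * (c_data - c)
    np.fill_diagonal(J, 0.0)

print(np.max(np.abs(J - J_t)))    # recovered couplings match the target closely
```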
Beyond inverse Ising model: structure of the analytical solution for a class of inverse problems
I consider the problem of deriving the couplings of a statistical model from
measured correlations, a task which generalizes the well-known inverse Ising
problem. After recalling that this problem can be mapped onto that of
expressing the entropy of a system as a function of its observables, I show the
conditions under which this can be done without resorting to iterative
algorithms. I find that inverse problems are local (the inverse Fisher
information is sparse) whenever the corresponding models have a factorized
form, and the entropy can be split into a sum of small cluster contributions. I
illustrate these ideas through two examples (the Ising model on a tree and the
one-dimensional periodic chain with arbitrary-order interactions) and support
the results with numerical simulations. The extension of these methods to more
general scenarios is finally discussed.
Comment: 15 pages, 6 figures.
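The locality statement can be checked numerically for the tree example. On an open Ising chain with zero fields, the bond variables s_i s_{i+1} are statistically independent, so the Fisher information over the couplings, and hence its inverse, is diagonal, i.e. maximally sparse. A sketch with hypothetical couplings; the paper's general factorization argument is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 8                                  # spins on an open chain (a tree)
J = 0.5 * (rng.random(N - 1) - 0.5)    # hypothetical nearest-neighbour couplings

S = np.array([[1 if (k >> i) & 1 else -1 for i in range(N)]
              for k in range(2 ** N)], float)
e = S[:, :-1] * S[:, 1:]               # sufficient statistics s_i s_{i+1}
E = -(e @ J)
p = np.exp(-(E - E.min())); p /= p.sum()

mean = p @ e
F = (e * p[:, None]).T @ e - np.outer(mean, mean)   # Fisher information matrix
F_inv = np.linalg.inv(F)

# On a tree with zero fields the inverse Fisher information is diagonal:
# each coupling can be inferred locally from its own bond statistic.
off = F_inv - np.diag(np.diag(F_inv))
print(np.max(np.abs(off)))             # numerically zero off the diagonal
```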
Adaptive cluster expansion for the inverse Ising problem: convergence, algorithm and tests
We present a procedure to solve the inverse Ising problem, that is, to find
the interactions between a set of binary variables from measurements of their
equilibrium correlations. The method consists in constructing and selecting
specific clusters of variables, based on their contributions to the
cross-entropy of the Ising model. Small contributions are discarded to avoid
overfitting and to make the computation tractable. The properties of the
cluster expansion and its performance on synthetic data are studied. To ease
implementation, we give the pseudo-code of the algorithm.
Comment: Paper submitted to the Journal of Statistical Physics.
Bethe-Peierls approximation and the inverse Ising model
We apply the Bethe-Peierls approximation to the inverse Ising problem and show
how the linear response relation leads to a simple method to reconstruct the
couplings and fields of the Ising model. This reconstruction is exact on tree
graphs, yet its computational expense is comparable to that of other mean-field
methods. We compare the performance of this method to the independent-pair,
naive mean-field, and Thouless-Anderson-Palmer approximations, the
Sessak-Monasson expansion, and susceptibility propagation on the Cayley tree,
the SK model, and random graphs with fixed connectivity. At low temperatures,
Bethe reconstruction outperforms all these methods, while at high temperatures
it is comparable to the best method available so far (Sessak-Monasson). The
relationship between Bethe reconstruction and other mean-field methods is
discussed.
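The independent-pair baseline used in this comparison has a closed form: treating each pair in isolation gives J_ij = (1/4) ln[P(+,+)P(-,-) / (P(+,-)P(-,+))]. A minimal sanity check on an isolated pair with zero fields:

```python
import numpy as np

def independent_pair_coupling(p_pp, p_pm, p_mp, p_mm):
    """Independent-pair estimate of J_ij from the joint pair marginal
    P(s_i, s_j): J = (1/4) ln [ P(+,+)P(-,-) / (P(+,-)P(-,+)) ]."""
    return 0.25 * np.log(p_pp * p_mm / (p_pm * p_mp))

# An isolated pair with coupling J and zero fields has Boltzmann marginals:
J = 0.7
Z = 2 * np.exp(J) + 2 * np.exp(-J)
p_pp = p_mm = np.exp(J) / Z
p_pm = p_mp = np.exp(-J) / Z
print(independent_pair_coupling(p_pp, p_pm, p_mp, p_mm))   # recovers J = 0.7
```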
Stimulus-dependent maximum entropy models of neural population codes
Neural populations encode information about their stimulus in a collective
fashion, by joint activity patterns of spiking and silence. A full account of
this mapping from stimulus to neural activity is given by the conditional
probability distribution over neural codewords given the sensory input. To be
able to infer a model for this distribution from large-scale neural recordings,
we introduce a stimulus-dependent maximum entropy (SDME) model---a minimal
extension of the canonical linear-nonlinear model of a single neuron to a
pairwise-coupled neural population. The model is able to capture the
single-cell response properties as well as the correlations in neural spiking
due to shared stimulus and due to effective neuron-to-neuron connections. Here
we show that in a population of 100 retinal ganglion cells in the salamander
retina responding to temporal white-noise stimuli, dependencies between cells
play an important encoding role. As a result, the SDME model gives a more
accurate account of single cell responses and in particular outperforms
uncoupled models in reproducing the distributions of codewords emitted in
response to a stimulus. We show how the SDME model, in conjunction with static
maximum entropy models of population vocabulary, can be used to estimate
information-theoretic quantities like surprise and information transmission in
a neural population.
Comment: 11 pages, 7 figures.
Pairwise maximum entropy models for studying large biological systems: when they can and when they can't work
One of the most critical problems we face in the study of biological systems
is building accurate statistical descriptions of them. This problem has been
particularly challenging because biological systems typically contain large
numbers of interacting elements, which precludes the use of standard brute
force approaches. Recently, though, several groups have reported that there may
be an alternate strategy. The reports show that reliable statistical models can
be built without knowledge of all the interactions in a system; instead,
pairwise interactions can suffice. These findings, however, are based on the
analysis of small subsystems. Here we ask whether the observations will
generalize to systems of realistic size, that is, whether pairwise models will
provide reliable descriptions of true biological systems. Our results show
that, in most cases, they will not. The reason is that there is a crossover in
the predictive power of pairwise models: If the size of the subsystem is below
the crossover point, then the results have no predictive power for large
systems. If the size is above the crossover point, the results do have
predictive power. This work thus provides a general framework for determining
the extent to which pairwise models can be used to predict the behavior of
whole biological systems. For the neural data analyzed so far, the size of most
systems studied is below the crossover point.