Inference via low-dimensional couplings
We investigate the low-dimensional structure of deterministic transformations
between random variables, i.e., transport maps between probability measures. In
the context of statistics and machine learning, these transformations can be
used to couple a tractable "reference" measure (e.g., a standard Gaussian) with
a target measure of interest. Direct simulation from the desired measure can
then be achieved by pushing forward reference samples through the map. Yet
characterizing such a map---e.g., representing and evaluating it---grows
challenging in high dimensions. The central contribution of this paper is to
establish a link between the Markov properties of the target measure and the
existence of low-dimensional couplings, induced by transport maps that are
sparse and/or decomposable. Our analysis not only facilitates the construction
of transformations in high-dimensional settings, but also suggests new
inference methodologies for continuous non-Gaussian graphical models. For
instance, in the context of nonlinear state-space models, we describe new
variational algorithms for filtering, smoothing, and sequential parameter
inference. These algorithms can be understood as the natural
generalization---to the non-Gaussian case---of the square-root
Rauch-Tung-Striebel Gaussian smoother.

Comment: 78 pages, 25 figures
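The core idea of sampling by pushing reference samples through a transport map can be sketched in a few lines. The example below is illustrative only (the target density and the lower-triangular Knothe-Rosenblatt map are our own toy choices, not the paper's construction): a standard Gaussian reference is mapped onto a 2-D "banana" target whose second coordinate depends on the first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D target: x1 ~ N(0, 1), x2 | x1 ~ N(x1**2, 0.5**2).
# A lower-triangular (Knothe-Rosenblatt) map pushing the standard
# Gaussian reference z = (z1, z2) forward onto this target:
def transport_map(z):
    x1 = z[:, 0]                    # first component: identity
    x2 = x1 ** 2 + 0.5 * z[:, 1]    # second component: depends on z1, z2
    return np.column_stack([x1, x2])

z = rng.standard_normal((100_000, 2))   # reference samples
x = transport_map(z)                    # pushforward = target samples

# Sanity check: E[x2] = E[z1^2] = 1 for this target.
print(round(x[:, 1].mean(), 2))
```

The triangular structure is what makes such maps cheap to evaluate and invert; the paper's contribution concerns when these maps can additionally be made sparse or decomposable in high dimensions.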
Beyond inverse Ising model: structure of the analytical solution for a class of inverse problems
I consider the problem of deriving couplings of a statistical model from
measured correlations, a task which generalizes the well-known inverse Ising
problem. After recalling that such a problem can be mapped onto that of
expressing the entropy of a system as a function of its corresponding
observables, I show the conditions under which this can be done without
resorting to iterative algorithms. I find that inverse problems are local (the
inverse Fisher information is sparse) whenever the corresponding models have a
factorized form, and the entropy can be split into a sum of small cluster
contributions. I illustrate these ideas through two examples (the Ising model
on a tree and the one-dimensional periodic chain with arbitrary order
interaction) and support the results with numerical simulations. The extension
of these methods to more general scenarios is finally discussed.

Comment: 15 pages, 6 figures
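The locality claim (sparse inverse Fisher information for factorized models) can be illustrated with a standard non-iterative relation, not the paper's exact construction: in the naive mean-field approximation, couplings are read off from the inverse of the connected correlation matrix, J_ij ≈ -(C^{-1})_ij for i ≠ j. For a Gaussian chain (where the relation is exact) the recovered couplings are strictly nearest-neighbour.

```python
import numpy as np

# Precision matrix of a Gaussian chain with nearest-neighbour couplings
# (toy example of a model with a factorized, tree-like structure).
n = 6
P = 2.0 * np.eye(n)
for i in range(n - 1):
    P[i, i + 1] = P[i + 1, i] = -0.5

C = np.linalg.inv(P)     # correlation matrix generated by the model
J = -np.linalg.inv(C)    # inverse problem: couplings from correlations

# The recovered couplings are local: J is (minus) the tridiagonal
# precision matrix, with all non-neighbour entries vanishing.
print(np.allclose(J + P, 0.0))
```

For binary (Ising) variables this inversion is only an approximation, but the sparsity pattern of the inverse correlations mirrors the locality discussed in the abstract.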
Selection of sequence motifs and generative Hopfield-Potts models for protein families
Statistical models for families of evolutionary related proteins have
recently gained interest: in particular pairwise Potts models, as those
inferred by the Direct-Coupling Analysis, have been able to extract information
about the three-dimensional structure of folded proteins, and about the effect
of amino-acid substitutions in proteins. These models are typically required
to reproduce the one- and two-point statistics of the amino-acid usage in a
protein family, {\em i.e.}~to capture the so-called residue conservation and
covariation statistics of proteins of common evolutionary origin. Pairwise
Potts models are the maximum-entropy models achieving this. While being
successful, these models depend on huge numbers of {\em ad hoc} introduced
parameters, which have to be estimated from a finite amount of data and whose
biophysical interpretation remains unclear. Here we propose an approach to
parameter reduction, which is based on selecting collective sequence motifs. It
naturally leads to the formulation of statistical sequence models in terms of
Hopfield-Potts models. These models can be accurately inferred using a mapping
to restricted Boltzmann machines and persistent contrastive divergence. We show
that, when applied to protein data, even 20-40 patterns are sufficient to
obtain statistically close-to-generative models. The Hopfield patterns form
interpretable sequence motifs and may be used to cluster amino-acid
sequences into functional sub-families. However, the distributed collective
nature of these motifs intrinsically limits the ability of Hopfield-Potts
models in predicting contact maps, showing the necessity of developing models
going beyond the Hopfield-Potts models discussed here.

Comment: 26 pages, 16 figures, to appear in PR
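To give a sense of the scale of the parameter reduction, a back-of-the-envelope count (the numbers q = 21 amino-acid states including the gap, alignment length L = 100, and p = 40 patterns are illustrative choices of ours, with p taken from the 20-40 range quoted above):

```python
# Pairwise Potts: one q x q coupling matrix per site pair, plus fields.
# Hopfield-Potts: p patterns of size L x q, plus the same fields.
q, L, p = 21, 100, 40

pairwise_potts = L * (L - 1) // 2 * q ** 2 + L * q   # couplings + fields
hopfield_potts = p * L * q + L * q                   # patterns + fields

print(pairwise_potts, hopfield_potts,
      round(pairwise_potts / hopfield_potts))   # rough compression factor
```

Even with these modest illustrative sizes, the pattern-based parameterization is more than an order of magnitude smaller, which is the source of the improved statistical robustness claimed in the abstract.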
Inverse Statistical Physics of Protein Sequences: A Key Issues Review
In the course of evolution, proteins undergo important changes in their amino
acid sequences, while their three-dimensional folded structure and their
biological function remain remarkably conserved. Thanks to modern sequencing
techniques, sequence data accumulate at an unprecedented pace. This provides large
sets of so-called homologous, i.e.~evolutionarily related protein sequences, to
which methods of inverse statistical physics can be applied. Using sequence
data as the basis for the inference of Boltzmann distributions from samples of
microscopic configurations or observables, it is possible to extract
information about evolutionary constraints and thus protein function and
structure. Here we give an overview of some biologically important questions,
and how statistical-mechanics inspired modeling approaches can help to answer
them. Finally, we discuss some open questions, which we expect to be addressed
over the next years.

Comment: 18 pages, 7 figures
From principal component to direct coupling analysis of coevolution in proteins: Low-eigenvalue modes are needed for structure prediction
Various approaches have explored the covariation of residues in
multiple-sequence alignments of homologous proteins to extract functional and
structural information. Among those are principal component analysis (PCA),
which identifies the most correlated groups of residues, and direct coupling
analysis (DCA), a global inference method based on the maximum entropy
principle, which aims at predicting residue-residue contacts. In this paper,
inspired by the statistical physics of disordered systems, we introduce the
Hopfield-Potts model to naturally interpolate between these two approaches. The
Hopfield-Potts model allows us to identify relevant 'patterns' of residues from
the knowledge of the eigenmodes and eigenvalues of the residue-residue
correlation matrix. We show how the computation of such statistical patterns
makes it possible to accurately predict residue-residue contacts with a much
smaller number of parameters than DCA. This dimensional reduction allows us to
avoid overfitting and to extract contact information from multiple-sequence
alignments of reduced size. In addition, we show that low-eigenvalue
correlation modes, discarded by PCA, are important to recover structural
information: the corresponding patterns are highly localized, that is, they are
concentrated in few sites, which we find to be in close contact in the
three-dimensional protein fold.

Comment: Supporting information can be downloaded from:
http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.100317
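The key diagnostic in the abstract, how localized an eigenmode of the residue-residue correlation matrix is, can be quantified with the inverse participation ratio, IPR = Σ_i v_i^4, which tends to 1 for a mode concentrated on a single site and to 1/n for a fully delocalized one. The sketch below uses synthetic data (not protein alignments) purely to show the computation:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for a residue-residue correlation matrix:
# 200 "sequences" over n = 50 "sites".
n = 50
A = rng.standard_normal((200, n))
C = np.corrcoef(A, rowvar=False)        # empirical n x n correlations

eigvals, eigvecs = np.linalg.eigh(C)    # eigenvalues in ascending order
ipr = (eigvecs ** 4).sum(axis=0)        # localization of each eigenmode

# Low-eigenvalue modes sit at the start of the spectrum returned by
# eigh; a large IPR flags a pattern concentrated on few sites, the
# kind of mode the paper finds to carry contact information.
print(ipr.shape == (n,), bool(np.all(ipr >= 1.0 / n - 1e-12)))
```

In the paper's setting, it is precisely these high-IPR, low-eigenvalue patterns, discarded by a plain PCA truncation, that align with residue pairs in contact.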
Dynamical preparation of EPR entanglement in two-well Bose-Einstein condensates
We propose to generate Einstein-Podolsky-Rosen (EPR) entanglement between
groups of atoms in a two-well Bose-Einstein condensate using a dynamical
process similar to that employed in quantum optics. The local nonlinear S-wave
scattering interaction has the effect of creating a spin squeezing at each
well, while the tunneling, analogous to a beam splitter in optics, introduces
an interference between these fields that results in an inter-well
entanglement. We consider two internal modes at each well, so that the
entanglement can be detected by measuring a reduction in the variances of the
sums of local Schwinger spin observables. As is typical of continuous variable
(CV) entanglement, the entanglement is predicted to increase with atom number,
and becomes sufficiently strong at higher numbers of atoms that the EPR paradox
and steering non-locality can be realized. The entanglement is predicted using
an analytical approach and, for larger atom numbers, stochastic simulations
based on the truncated Wigner function. We find generally that strong tunnelling is
favourable, and that entanglement persists and is even enhanced in the presence
of realistic nonlinear losses.

Comment: 15 pages, 19 figures
Statistical Physics and Representations in Real and Artificial Neural Networks
This document presents the material of two lectures on statistical physics
and neural representations, delivered by one of us (R.M.) at the Fundamental
Problems in Statistical Physics XIV summer school in July 2017. In a first
part, we consider the neural representations of space (maps) in the
hippocampus. We introduce an extension of the Hopfield model, able to store
multiple spatial maps as continuous, finite-dimensional attractors. The phase
diagram and dynamical properties of the model are analyzed. We then show how
spatial representations can be dynamically decoded using an effective Ising
model capturing the correlation structure in the neural data, and compare
applications to data obtained from hippocampal multi-electrode recordings and
by (sub)sampling our attractor model. In a second part, we focus on the problem
of learning data representations in machine learning, in particular with
artificial neural networks. We start by introducing data representations
through some illustrations. We then analyze two important algorithms, Principal
Component Analysis and Restricted Boltzmann Machines, with tools from
statistical physics.
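The Hopfield model that the lectures extend to spatial maps can be demonstrated in its classic form in a few lines. This is a minimal sketch of the standard associative memory (Hebb rule, synchronous updates), not the lectures' multi-map continuous-attractor extension:

```python
import numpy as np

rng = np.random.default_rng(3)

# Store p random binary patterns over n spins via the Hebb rule.
n, p = 100, 3
patterns = rng.choice([-1, 1], size=(p, n))
W = (patterns.T @ patterns) / n
np.fill_diagonal(W, 0.0)                       # no self-coupling

# Corrupt the first pattern on 10 sites and let the network relax.
state = patterns[0].copy()
flip = rng.choice(n, size=10, replace=False)
state[flip] *= -1

for _ in range(5):                             # synchronous sign updates
    state = np.where(W @ state >= 0, 1, -1)

overlap = (state == patterns[0]).mean()        # fraction of recovered spins
print(round(overlap, 2))
```

At this low load (p/n = 0.03, well below the classic 0.138 capacity) the corrupted cue flows back to the stored pattern, the attractor behaviour that the lectures generalize to continuous spatial maps.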
On the criticality of inferred models
Advanced inference techniques allow one to reconstruct the pattern of
interaction from high dimensional data sets. We focus here on the statistical
properties of inferred models and argue that inference procedures are likely to
yield models which are close to a phase transition. On the one hand, we show that
the reparameterization-invariant metric on the space of probability
distributions of these models (the Fisher information) is directly related
the model's susceptibility. As a result, distinguishable models tend to
accumulate close to critical points, where the susceptibility diverges in
infinite systems. On the other hand, this region is the one where the estimate of
inferred parameters is most stable. In order to illustrate these points, we
discuss inference of interacting point processes with application to financial
data and show that sensible choices of observation time-scales naturally yield
models which are close to criticality.

Comment: 6 pages, 2 figures, version to appear in JSTA
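The link between the Fisher information and the susceptibility can be checked on the simplest exponential-family example. This toy model (independent ±1 spins in a field h) is our own illustration, not the paper's point-process application: for p(s|h) ∝ exp(h Σ_i s_i), the Fisher information equals the variance of the sufficient statistic, i.e. the susceptibility χ = n(1 − tanh²h).

```python
import numpy as np

rng = np.random.default_rng(4)

n, h = 1000, 0.3
p_up = 1.0 / (1.0 + np.exp(-2.0 * h))    # P(s_i = +1) for a single spin

# Sample the model and estimate the variance of M = sum_i s_i.
samples = rng.choice([1, -1], size=(20_000, n), p=[p_up, 1 - p_up])
M = samples.sum(axis=1)                  # sufficient statistic
chi_emp = M.var()                        # empirical Fisher information
chi_theory = n * (1.0 - np.tanh(h) ** 2)

# The two agree: large susceptibility (near criticality) means large
# Fisher information, i.e. many distinguishable models packed nearby.
print(abs(chi_emp / chi_theory - 1.0) < 0.05)
```

Near a critical point χ diverges with system size, which is exactly why, in the paper's argument, distinguishable inferred models accumulate there.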