We consider the problem of inferring the interactions between a set of N binary variables from the knowledge of their frequencies and pairwise correlations. The inference framework is based on the Hopfield model, a special case of the Ising model where the interaction matrix is defined through a set of patterns in the variable space, and is of rank much smaller than N. We show that Maximum Likelihood inference is deeply related to Principal Component Analysis when the amplitude of the pattern components, xi, is negligible compared to N^(1/2). Using techniques from statistical mechanics, we calculate the corrections to the patterns to first order in xi/N^(1/2). We stress that it is important to generalize the Hopfield model and include both attractive and repulsive patterns in order to correctly infer networks with sparse and strong interactions. We present a simple geometrical criterion to decide how many attractive and repulsive patterns should be considered as a function of the sampling noise. We moreover discuss how many sampled configurations are required for a good inference, as a function of the system size, N, and of the amplitude, xi. The inference approach is illustrated on synthetic and biological data.

Comment: Physical Review E: Statistical, Nonlinear, and Soft Matter Physics (2011), to appear
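The connection to Principal Component Analysis mentioned in the abstract can be illustrated with a minimal sketch. It covers only the generic PCA step: estimating the frequencies and the Pearson correlation matrix from sampled binary configurations, then diagonalizing that matrix. It is not the authors' inference procedure. The function name `pca_candidate_directions`, its parameters, and the choice to read the largest-eigenvalue directions as candidates for attractive patterns and the smallest-eigenvalue directions as candidates for repulsive patterns are assumptions made for illustration. The rescaling of eigenvectors into patterns, the first-order corrections, and the geometrical selection criterion described in the paper are deliberately left out.

```python
import numpy as np


def pca_candidate_directions(samples, n_attractive=1, n_repulsive=1):
    """PCA step on binary data (illustrative sketch, not the paper's full method).

    samples: array of shape (B, N) holding B sampled configurations of
             N binary (0/1) variables.
    Returns the frequencies, the Pearson correlation matrix, its eigenvalues
    in ascending order, and the eigenvectors with the largest and smallest
    eigenvalues.
    """
    samples = np.asarray(samples, dtype=float)
    n_samples = samples.shape[0]

    # Empirical frequency of each variable.
    p = samples.mean(axis=0)

    # Standard deviation of a 0/1 variable with frequency p.
    std = np.sqrt(p * (1.0 - p))
    if np.any(std == 0.0):
        raise ValueError("constant variables must be removed before PCA")

    # Pearson correlation matrix; its diagonal equals 1 by construction.
    z = (samples - p) / std
    corr = z.T @ z / n_samples

    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(corr)

    # Assumption: largest eigenvalues -> candidate attractive directions,
    # smallest eigenvalues -> candidate repulsive directions.
    top = eigvecs[:, ::-1][:, :n_attractive]
    bottom = eigvecs[:, :n_repulsive]
    return p, corr, eigvals, top, bottom
```

A call such as `pca_candidate_directions(X, n_attractive=2, n_repulsive=1)` on a `(B, N)` array `X` of zeros and ones returns matrices of shape `(N, 2)` and `(N, 1)`. How many directions to keep on each side is exactly what the paper's geometrical criterion addresses, so the two counts here are free parameters of the sketch.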

Topics:
Condensed Matter - Statistical Mechanics, Quantitative Biology - Quantitative Methods, Statistics - Machine Learning

Year: 2011

DOI identifier: 10.1103/PhysRevE.83.051123

OAI identifier:
oai:arXiv.org:1104.3665

Provided by:
arXiv.org e-Print Archive
