Information based clustering
In an age of increasingly large data sets, investigators in many different
disciplines have turned to clustering as a tool for data analysis and
exploration. Existing clustering methods, however, typically depend on several
nontrivial assumptions about the structure of data. Here we reformulate the
clustering problem from an information-theoretic perspective that avoids many
of these assumptions. In particular, our formulation obviates the need for
defining a cluster "prototype", does not require an a priori similarity metric,
is invariant to changes in the representation of the data, and naturally
captures non-linear relations. We apply this approach to different domains and
find that it consistently produces clusters that are more coherent than those
extracted by existing algorithms. Finally, our approach provides a way of
clustering based on collective notions of similarity rather than the
traditional pairwise measures.
Comment: To appear in Proceedings of the National Academy of Sciences USA, 11 pages, 9 figures
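The trade-off described above can be made concrete with a small numerical sketch. Assuming hard cluster assignments and a uniform distribution over items (simplifications; the paper's formulation uses soft assignments), the quantity being traded off is the average within-cluster similarity against the information I(C;i) that the cluster label carries about item identity. The similarity matrix, temperature, and labels below are illustrative toy choices, not the paper's optimization procedure.

```python
import numpy as np

def info_clustering_objective(S, labels, T=1.0):
    """Evaluate <s> - T * I(C; i) for hard assignments (a simplification of the
    soft-assignment trade-off described in the abstract).

    S      : (N, N) pairwise similarity matrix (any scale; no metric assumed)
    labels : length-N integer cluster assignment
    T      : trade-off parameter between cluster coherence and compression
    """
    N = len(labels)
    clusters = np.unique(labels)

    # Average pairwise similarity inside each cluster, weighted by cluster size.
    avg_sim = 0.0
    for c in clusters:
        idx = np.flatnonzero(labels == c)
        avg_sim += (len(idx) / N) * S[np.ix_(idx, idx)].mean()

    # With a uniform P(i) and hard assignments, I(C; i) reduces to H(C).
    p = np.array([(labels == c).mean() for c in clusters])
    info = -(p * np.log2(p)).sum()

    return avg_sim - T * info

# Toy usage: two well-separated blobs, similarity = negative squared distance.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, (20, 2)), rng.normal(5, 1, (20, 2))])
S = -((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
good, rand_labels = np.repeat([0, 1], 20), rng.integers(0, 2, 40)
print(info_clustering_objective(S, good), info_clustering_objective(S, rand_labels))
```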
On the criticality of inferred models
Advanced inference techniques allow one to reconstruct the pattern of interactions from high-dimensional data sets. We focus here on the statistical
properties of inferred models and argue that inference procedures are likely to
yield models which are close to a phase transition. On the one hand, we show that the reparameterization-invariant metric on the space of probability distributions of these models (the Fisher information) is directly related to the model's susceptibility. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. On the other hand, this region is also where the estimates of the inferred parameters are most stable. To illustrate these points, we
discuss inference of interacting point processes with application to financial
data and show that sensible choices of observation time-scales naturally yield
models which are close to criticality.
Comment: 6 pages, 2 figures, version to appear in JSTAT
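A minimal sketch of the link stated above, under the illustrative assumption of a pairwise Ising model small enough to sum over all states exactly: because the model is an exponential family, its Fisher information metric equals the covariance matrix of the sufficient statistics, which is precisely the susceptibility matrix. The system size and coupling statistics below are arbitrary choices for the demonstration.

```python
import itertools
import numpy as np

def fisher_information_ising(h, J):
    """Exact Fisher information of p(s) ∝ exp(h·s + s·J·s/2) over s in {-1,+1}^n.

    For an exponential family the Fisher metric equals Cov[phi(s)], where phi
    collects the sufficient statistics (the spins s_i and the products s_i s_j),
    i.e. exactly the susceptibility matrix of the model.
    """
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))   # all 2^n configs
    log_w = states @ h + 0.5 * np.einsum('ki,ij,kj->k', states, J, states)
    p = np.exp(log_w - log_w.max())
    p /= p.sum()

    # Sufficient statistics: all s_i, followed by all s_i s_j with i < j.
    pairs = [states[:, i] * states[:, j] for i in range(n) for j in range(i + 1, n)]
    phi = np.column_stack([states] + pairs)

    mean = p @ phi
    return (phi * p[:, None]).T @ phi - np.outer(mean, mean)        # Cov[phi]

# Illustrative small spin glass (n = 5 so the exact enumeration stays cheap).
rng = np.random.default_rng(1)
n = 5
J = rng.normal(0, 1 / np.sqrt(n), (n, n)); J = (J + J.T) / 2; np.fill_diagonal(J, 0)
F = fisher_information_ising(rng.normal(0, 0.3, n), J)
print("largest eigenvalue of the Fisher metric:", np.linalg.eigvalsh(F).max())
```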
Information flow and optimization in transcriptional control
In the simplest view of transcriptional regulation, the expression of a gene
is turned on or off by changes in the concentration of a transcription factor
(TF). We use recent data on noise levels in gene expression to show that it
should be possible to transmit much more than just one regulatory bit.
Realizing this optimal information capacity would require that the dynamic
range of TF concentrations used by the cell, the input/output relation of the
regulatory module, and the noise levels of binding and transcription satisfy
certain matching relations. This parameter-free prediction is in good agreement
with recent experiments on the Bicoid/Hunchback system in the early Drosophila
embryo, and this system achieves ~90% of its theoretical maximum information
transmission.
Comment: 5 pages, 4 figures
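As a sketch of the kind of capacity calculation referred to above, the snippet below runs the standard Blahut-Arimoto algorithm on a discretized regulatory channel. The Hill-type input/output relation, the noise model, and all parameter values are illustrative assumptions rather than the measured Bicoid/Hunchback quantities used in the paper; the point is simply that with plausible noise levels the capacity exceeds one regulatory bit.

```python
import numpy as np

def blahut_arimoto(P_g_given_c, tol=1e-9, max_iter=5000):
    """Capacity (in bits) of a discrete memoryless channel P(g | c)."""
    n_in = P_g_given_c.shape[0]
    p = np.full(n_in, 1.0 / n_in)                  # distribution over TF concentrations
    for _ in range(max_iter):
        r = p @ P_g_given_c                        # marginal over expression levels
        with np.errstate(divide='ignore', invalid='ignore'):
            log_ratio = np.where(P_g_given_c > 0, np.log(P_g_given_c / r), 0.0)
        D = (P_g_given_c * log_ratio).sum(axis=1)  # per-input divergence (nats)
        p_new = p * np.exp(D)
        p_new /= p_new.sum()
        if np.abs(p_new - p).max() < tol:
            p = p_new
            break
        p = p_new
    return (p @ D) / np.log(2)

# Illustrative channel: Hill-type mean response, concentration-dependent Gaussian noise.
c = np.linspace(0.01, 2.0, 200)                    # TF concentration (arbitrary units)
g = np.linspace(0.0, 1.2, 200)                     # expression level
mean_g = c**2 / (c**2 + 0.5**2)                    # Hill coefficient 2, K = 0.5 (assumed)
sigma_g = 0.02 + 0.05 * np.sqrt(mean_g)            # assumed noise model
P = np.exp(-0.5 * ((g[None, :] - mean_g[:, None]) / sigma_g[:, None]) ** 2)
P /= P.sum(axis=1, keepdims=True)

print(f"capacity ≈ {blahut_arimoto(P):.2f} bits")  # more than one bit for these assumptions
```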
Transformation of stimulus correlations by the retina
Redundancies and correlations in the responses of sensory neurons seem to
waste neural resources but can carry cues about structured stimuli and may help
the brain to correct for response errors. To assess how the retina negotiates
this tradeoff, we measured simultaneous responses from populations of ganglion
cells presented with natural and artificial stimuli that varied greatly in
correlation structure. We found that pairwise correlations in the retinal
output remained similar across stimuli with widely different spatio-temporal
correlations including white noise and natural movies. Meanwhile, purely
spatial correlations tended to increase correlations in the retinal response.
Responding to more correlated stimuli, ganglion cells had faster temporal
kernels and tended to have stronger surrounds. These properties of individual
cells, along with gain changes that opposed changes in effective contrast at
the ganglion cell input, largely explained the similarity of pairwise
correlations across stimuli where receptive field measurements were possible.
Comment: author list corrected in metadata
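A minimal sketch of how pairwise correlations of the kind compared above can be computed from simultaneously recorded spike trains, assuming binned spike counts and Pearson correlation coefficients; the bin size and the toy spike trains are illustrative choices, not the paper's analysis pipeline.

```python
import numpy as np

def pairwise_correlations(spike_times, t_max, bin_size=0.02):
    """Pearson correlations between binned spike counts of all cell pairs.

    spike_times : list of 1-D arrays, spike times (s) of each ganglion cell
    t_max       : duration of the recording (s)
    bin_size    : counting window (s); 20 ms is an illustrative choice
    """
    bins = np.arange(0.0, t_max + bin_size, bin_size)
    counts = np.array([np.histogram(st, bins)[0] for st in spike_times])
    corr = np.corrcoef(counts)                     # (n_cells, n_cells) matrix
    iu = np.triu_indices(len(spike_times), k=1)
    return corr[iu]                                # one value per cell pair

# Toy usage: three "cells"; the first two share jittered common events.
rng = np.random.default_rng(2)
shared = rng.uniform(0, 100, 300)
cells = [np.concatenate([shared + rng.normal(0, 0.005, 300), rng.uniform(0, 100, 200)]),
         np.concatenate([shared + rng.normal(0, 0.005, 300), rng.uniform(0, 100, 200)]),
         rng.uniform(0, 100, 500)]
print(pairwise_correlations(cells, t_max=100.0))   # pair (0,1) correlated, others near zero
```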
The Effect of Nonstationarity on Models Inferred from Neural Data
Neurons subject to a common non-stationary input may exhibit correlated firing. Correlations in the statistics of neural spike trains can also arise from interactions between neurons. Here we show that these two
situations can be distinguished, with machine learning techniques, provided the
data are rich enough. In order to do this, we study the problem of inferring a
kinetic Ising model, stationary or nonstationary, from the available data. We
apply the inference procedure to two data sets: one from salamander retinal
ganglion cells and the other from a realistic computational cortical network
model. We show that many aspects of the concerted activity of the salamander
retinal neurons can be traced simply to the external input. A model of
non-interacting neurons subject to a non-stationary external field outperforms
a model with stationary input and couplings between neurons, even after accounting for the difference in the number of model parameters. When couplings are added
to the non-stationary model, for the retinal data, little is gained: the
inferred couplings are generally not significant. Likewise, the distribution of
the sizes of sets of neurons that spike simultaneously and the frequency of
spike patterns as a function of their rank (Zipf plots) are well explained by an
independent-neuron model with time-dependent external input, and adding
connections to such a model does not offer significant improvement. For the
cortical model data, robust couplings, well correlated with the real
connections, can be inferred using the non-stationary model. Adding connections
to this model slightly improves the agreement with the data for the probability
of synchronous spikes but hardly affects the Zipf plot.
Comment: version in press in J Stat Mech
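A minimal sketch of the inference step described above, assuming a kinetic (Glauber) Ising model with parallel updates: the log-likelihood gradient with respect to a time-dependent field h_i(t) and couplings J_ij has a simple closed form, so maximum likelihood over repeated trials can be done by plain gradient ascent. The toy data, learning rate, and trial counts are illustrative assumptions; setting couplings=False gives the independent-neuron, nonstationary-field model discussed in the abstract.

```python
import numpy as np

def fit_kinetic_ising(S, n_steps=2000, lr=0.1, couplings=True):
    """Maximum-likelihood kinetic Ising fit with a time-dependent external field.

    S : (R, T, N) array of +/-1 spins: R repeated trials, T time steps, N neurons.
    Transition model: P(s_i(t+1) = +/-1) ∝ exp(+/- H_i(t)),
    with H_i(t) = h_i(t) + sum_j J_ij s_j(t).
    """
    R, T, N = S.shape
    h = np.zeros((T - 1, N))                        # time-dependent field
    J = np.zeros((N, N))
    s_prev, s_next = S[:, :-1, :], S[:, 1:, :]

    for _ in range(n_steps):                        # plain gradient ascent (illustrative lr)
        H = h[None] + (s_prev @ J.T if couplings else 0.0)
        resid = s_next - np.tanh(H)                 # d(logL)/dH for each trial and time
        h += lr * resid.mean(axis=0)
        if couplings:
            J += lr * np.einsum('rti,rtj->ij', resid, s_prev) / (R * (T - 1))
    return h, J

# Toy check: independent neurons driven by a common sinusoidal field (no true couplings).
rng = np.random.default_rng(3)
R, T, N = 200, 100, 10
h_true = 0.8 * np.sin(np.linspace(0, 4 * np.pi, T - 1))[:, None] * np.ones(N)
S = np.empty((R, T, N)); S[:, 0] = rng.choice([-1.0, 1.0], (R, N))
for t in range(T - 1):
    p_up = 1.0 / (1.0 + np.exp(-2.0 * h_true[t]))
    S[:, t + 1] = np.where(rng.random((R, N)) < p_up, 1.0, -1.0)

h_fit, J_fit = fit_kinetic_ising(S)
print("field recovery (corr):", np.corrcoef(h_fit.ravel(), h_true.ravel())[0, 1])
print("largest inferred |J| :", np.abs(J_fit).max())   # should stay small here
```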
Intrinsic limitations of inverse inference in the pairwise Ising spin glass
We analyze the limits inherent to the inverse reconstruction of a pairwise
Ising spin glass based on susceptibility propagation. We establish the
conditions under which the susceptibility propagation algorithm is able to
reconstruct the characteristics of the network given first- and second-order
local observables, evaluate possible errors due to various types of noise in the originally observed data, and discuss the scaling of the problem with the number of degrees of freedom.
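Susceptibility propagation itself is a message-passing scheme and is not reproduced here; as a minimal sketch of the same inverse problem (reconstructing couplings from first- and second-order local observables), the snippet below uses the simpler naive mean-field inversion, J_ij ≈ -(C^{-1})_ij for i ≠ j, checked against the exact statistics of a small spin glass. The coupling strength and system size are illustrative assumptions.

```python
import itertools
import numpy as np

def nmf_inverse_ising(m, C):
    """Naive mean-field reconstruction of pairwise couplings.

    m : (N,) magnetizations <s_i>                             (first-order observables)
    C : (N, N) connected correlations <s_i s_j> - <s_i><s_j>  (second-order observables)
    Returns J_ij ≈ -(C^{-1})_ij for i != j; a simpler stand-in for susceptibility
    propagation operating on the same inputs.
    """
    J = -np.linalg.inv(C)
    np.fill_diagonal(J, 0.0)
    return J

# Toy check: exact observables of a small spin glass by exhaustive enumeration.
rng = np.random.default_rng(4)
N = 8
J_true = rng.normal(0, 0.3 / np.sqrt(N), (N, N))
J_true = (J_true + J_true.T) / 2
np.fill_diagonal(J_true, 0.0)

states = np.array(list(itertools.product([-1, 1], repeat=N)))
log_w = 0.5 * np.einsum('ki,ij,kj->k', states, J_true, states)
p = np.exp(log_w - log_w.max()); p /= p.sum()
m = p @ states
C = (states * p[:, None]).T @ states - np.outer(m, m)

J_rec = nmf_inverse_ising(m, C)
iu = np.triu_indices(N, 1)
print("true vs reconstructed coupling correlation:",
      np.corrcoef(J_true[iu], J_rec[iu])[0, 1])
```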