Replica theory for learning curves for Gaussian processes on random graphs
Statistical physics approaches can be used to derive accurate predictions for
the performance of inference methods learning from potentially noisy data, as
quantified by the learning curve defined as the average error versus number of
training examples. We analyse a challenging problem in the area of
non-parametric inference where an effectively infinite number of parameters has
to be learned, specifically Gaussian process regression. When the inputs are
vertices on a random graph and the outputs noisy function values, we show that
replica techniques can be used to obtain exact performance predictions in the
limit of large graphs. The covariance of the Gaussian process prior is defined
by a random walk kernel, the discrete analogue of squared exponential kernels
on continuous spaces. Conventionally this kernel is normalised only globally,
so that the prior variance can differ between vertices; as a more principled
alternative we consider local normalisation, where the prior variance is
uniform.
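To make the contrast between the two normalisation schemes concrete, here is a minimal sketch in Python. The base-matrix form ((1 - 1/a) I + (1/a) D^{-1/2} A D^{-1/2})^p and the hyperparameter names a and p follow the standard definition of the random walk kernel and are assumptions about the paper's exact conventions.

```python
import numpy as np

def random_walk_kernel(A, a=2.0, p=10):
    """Random walk kernel on a graph with adjacency matrix A (no isolated
    vertices assumed): K = ((1 - 1/a) I + (1/a) D^{-1/2} A D^{-1/2})^p.
    Taking a >= 2 keeps the base matrix positive semi-definite."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    base = (1.0 - 1.0 / a) * np.eye(len(A)) + (1.0 / a) * (D_inv_sqrt @ A @ D_inv_sqrt)
    return np.linalg.matrix_power(base, p)

def normalise(K, mode="global"):
    """Global normalisation rescales K by a single constant, so the prior
    variances K_ii can still differ between vertices; local normalisation
    forces unit prior variance at every vertex."""
    if mode == "global":
        return K / np.mean(np.diag(K))
    s = np.sqrt(np.diag(K))
    return K / np.outer(s, s)
```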
Olfactory learning alters navigation strategies and behavioral variability in C. elegans
Animals adaptively adjust their behavioral response to sensory input depending on past experience. This flexible brain computation is crucial for
survival and is of great interest in neuroscience. The nematode C. elegans
modulates its navigation behavior depending on the association of the odor butanone
with food (appetitive training) or starvation (aversive training), and will
then climb up the butanone gradient or ignore it, respectively. However, the
exact change in navigation strategy in response to learning is still unknown.
Here we study learned odor navigation in worms by combining precise experimental measurements with a novel descriptive model of navigation. Our model consists of two known navigation strategies in worms: biased random walk and weathervaning. We infer weights on these strategies by applying the model to worm navigation trajectories and the exact odor concentration each worm experiences.
Compared to naive worms, appetitive-trained worms up-regulate the biased random walk strategy, and aversive-trained worms down-regulate the weathervaning strategy. The statistical model accurately predicts the past training condition from navigation data, outperforming the classical chemotaxis metric. We find that behavioral variability is altered by learning: worms are less variable after training than naive ones. The model further predicts the learning-dependent response and
variability under optogenetic perturbation of the olfactory neuron
AWC. Lastly, we investigate neural circuits downstream from
AWC that are differentially recruited for learned odor-guided
navigation. Together, we provide a new paradigm to quantify flexible navigation
algorithms and pinpoint the underlying neural substrates.
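As a purely illustrative sketch of the two strategies the model combines, not the paper's inference procedure, the toy simulation below mixes a biased random walk (sharp reorientations suppressed when the sensed concentration is rising) with weathervaning (gradual steering toward the gradient). The weights w_brw and w_wv, the concentration field, and all rate constants are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def conc(x, y):
    # Hypothetical butanone concentration field: Gaussian odor source at origin.
    return np.exp(-(x**2 + y**2) / 50.0)

def simulate(w_brw=1.0, w_wv=1.0, steps=5000, dt=0.1):
    x, y = 10.0, 0.0
    heading = rng.uniform(-np.pi, np.pi)
    c_prev = conc(x, y)
    traj = []
    for _ in range(steps):
        c = conc(x, y)
        dcdt = (c - c_prev) / dt
        c_prev = c
        # Biased random walk: baseline reorientation rate, suppressed in
        # proportion to w_brw when the concentration is rising (dC/dt > 0).
        p_turn = dt * np.clip(0.5 - w_brw * 5.0 * dcdt, 0.0, 1.0)
        if rng.random() < p_turn:
            heading = rng.uniform(-np.pi, np.pi)  # pirouette: random new heading
        else:
            # Weathervaning: curve gradually toward the local gradient direction.
            gx = (conc(x + 0.1, y) - c) / 0.1
            gy = (conc(x, y + 0.1) - c) / 0.1
            err = np.arctan2(gy, gx) - heading
            err = np.arctan2(np.sin(err), np.cos(err))  # wrap to (-pi, pi]
            heading += w_wv * err * dt + 0.05 * rng.normal()
        x += np.cos(heading) * dt
        y += np.sin(heading) * dt
        traj.append((x, y))
    return np.array(traj)
```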
Universal Graph Random Features
We propose a novel random walk-based algorithm for unbiased estimation of
arbitrary functions of a weighted adjacency matrix, coined universal graph
random features (u-GRFs). This includes many of the most popular examples of
kernels defined on the nodes of a graph. Our algorithm enjoys subquadratic time
complexity with respect to the number of nodes, overcoming the notoriously
prohibitive cubic scaling of exact graph kernel evaluation. It can also be
trivially distributed across machines, permitting learning on much larger
networks. At the heart of the algorithm is a modulation function which
upweights or downweights the contribution from different random walks depending
on their lengths. We show that by parameterising it with a neural network we
can obtain u-GRFs that give higher-quality kernel estimates or perform
efficient, scalable kernel learning. We provide robust theoretical analysis and
support our findings with experiments including pointwise estimation of fixed
graph kernels, solving non-homogeneous graph ordinary differential equations,
node clustering and kernel regression on triangular meshes.
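The core estimator can be sketched as follows: terminating random walks, importance-corrected for the uniform neighbour choice and the geometric halting probability, give an unbiased Monte Carlo estimate of a power series sum_k f(k) W^k of the weighted adjacency matrix, with the modulation function f weighting each walk by its length. This is a simplified single-row version under assumed conventions, not the paper's full u-GRF construction; parameterising f with a small neural network would follow the same pattern.

```python
import numpy as np

def grf_row_estimate(W, i, f, n_walks=500, p_halt=0.1, seed=0):
    """Monte Carlo estimate of row i of sum_k f(k) W^k using terminating
    random walks. Each walk halts with probability p_halt per step; the
    running weight corrects for the uniform neighbour choice and for
    survival, making each walk's contribution unbiased."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    est = np.zeros(n)
    for _ in range(n_walks):
        v, weight, length = i, 1.0, 0
        est[v] += f(length) * weight  # k = 0 term
        while rng.random() > p_halt:
            nbrs = np.flatnonzero(W[v])
            if len(nbrs) == 0:
                break
            u = rng.choice(nbrs)
            weight *= W[v, u] * len(nbrs) / (1.0 - p_halt)
            v, length = u, length + 1
            est[v] += f(length) * weight
    return est / n_walks

# Example: estimate one row of the diffusion kernel exp(W) via f(k) = 1/k!.
# from math import factorial
# row = grf_row_estimate(W, 0, lambda k: 1.0 / factorial(k))
```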
Bayesian Conditional Cointegration
Cointegration is an important topic for time series and describes a relationship between two series in which a linear combination of them is stationary.
Classically, the test for cointegration is based on a two-stage process in which the linear relation between the series is first estimated by Ordinary Least Squares; subsequently, a unit-root test is performed on the residuals. A
well-known deficiency of this classical approach is that it can lead to
erroneous conclusions about the presence of cointegration. As an alternative,
we present a framework for estimating whether cointegration exists using
Bayesian inference which is empirically superior to the classical approach.
Finally, we apply our technique to model segmented cointegration in which
cointegration may exist only for a limited time. In contrast to previous approaches, our model makes no restriction on the number of possible cointegration segments.
Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012).
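For context, here is a minimal sketch of the classical two-stage (Engle-Granger-style) procedure that the Bayesian framework is compared against, using statsmodels; the paper's own Bayesian model is not reproduced here.

```python
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

def classical_cointegration_test(y, x, alpha=0.05):
    """Two-stage test: (1) estimate the linear relation y ~ x by Ordinary
    Least Squares; (2) run an augmented Dickey-Fuller unit-root test on the
    residuals. Rejecting the unit root suggests the residuals are stationary,
    i.e. the series cointegrate. Caveat (one source of the erroneous
    conclusions noted above): standard ADF critical values are not strictly
    valid when applied to estimated residuals."""
    ols = sm.OLS(y, sm.add_constant(x)).fit()
    adf_stat, pvalue, *_ = adfuller(ols.resid)
    return {"beta": ols.params, "adf_pvalue": pvalue,
            "cointegrated": pvalue < alpha}
```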
FrogWild! -- Fast PageRank Approximations on Graph Engines
We propose FrogWild, a novel algorithm for fast approximation of high-PageRank vertices, geared towards reducing the network costs of running traditional PageRank algorithms. Our algorithm can be seen as a quantized version of power
iteration that performs multiple parallel random walks over a directed graph.
One important innovation is that we introduce a modification to the GraphLab
framework that only partially synchronizes mirror vertices. This partial
synchronization vastly reduces the network traffic generated by traditional
PageRank algorithms, thus greatly reducing the per-iteration cost of PageRank.
On the other hand, this partial synchronization also creates dependencies
between the random walks used to estimate PageRank. Our main theoretical
innovation is the analysis of the correlations introduced by this partial
synchronization process and a bound establishing that our approximation is
close to the true PageRank vector.
We implement our algorithm in GraphLab and compare it against the default
PageRank implementation. We show that our algorithm is very fast, performing each iteration in less than one second on the Twitter graph, and can be up to 7x faster than the standard GraphLab PageRank implementation.
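The Monte Carlo core of the approach, many parallel teleporting random walks whose empirical distribution concentrates on high-PageRank vertices, can be sketched on a single machine as below; the partial synchronisation of GraphLab mirror vertices that defines FrogWild is a distributed-systems mechanism this toy version does not model.

```python
import numpy as np

def approx_pagerank(adj_list, n_walkers=100_000, n_steps=10, d=0.85, seed=0):
    """Approximate PageRank by running many independent random walks.
    Each walker teleports to a uniform random vertex with probability 1 - d
    (or when stuck at a dangling vertex), else follows a uniformly chosen
    out-edge. The empirical distribution of final positions approximates
    the PageRank vector, placing the most mass on the top vertices."""
    rng = np.random.default_rng(seed)
    n = len(adj_list)
    pos = rng.integers(0, n, size=n_walkers)
    for _ in range(n_steps):
        for w in range(n_walkers):
            out = adj_list[pos[w]]
            if not out or rng.random() > d:
                pos[w] = rng.integers(0, n)   # teleport / dangling vertex
            else:
                pos[w] = out[rng.integers(0, len(out))]
    return np.bincount(pos, minlength=n) / n_walkers
```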
High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion
We consider the problem of high-dimensional Gaussian graphical model
selection. We identify a set of graphs for which an efficient estimation algorithm exists, based on thresholding of empirical conditional covariances. Under a set of transparent conditions, we establish
structural consistency (or sparsistency) for the proposed algorithm when the number of samples $n = \Omega(J_{\min}^{-2} \log p)$, where $p$ is the number of variables and $J_{\min}$ is the minimum (absolute) edge potential of the graphical model. The sufficient conditions for sparsistency are based on the notion of
walk-summability of the model and the presence of sparse local vertex
separators in the underlying graph. We also derive novel non-asymptotic
necessary conditions on the number of samples required for sparsistency.
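A simplified, brute-force sketch of the thresholding estimator is shown below; the separator-size bound eta and threshold xi are hypothetical names, and the actual algorithm restricts candidate conditioning sets to sparse local vertex separators rather than searching all small subsets as this version does.

```python
import numpy as np
from itertools import combinations

def conditional_cov(S, i, j, cond):
    """Empirical conditional covariance Cov(X_i, X_j | X_cond) computed from
    the sample covariance matrix S via a Schur complement."""
    if not cond:
        return S[i, j]
    c = list(cond)
    return S[i, j] - S[i, c] @ np.linalg.solve(S[np.ix_(c, c)], S[c, j])

def estimate_edges(X, eta=2, xi=0.1):
    """Declare an edge (i, j) when the minimum absolute empirical conditional
    covariance, over all conditioning sets of size at most eta, exceeds the
    threshold xi. X is an n x p data matrix."""
    _, p = X.shape
    S = np.cov(X, rowvar=False)
    edges = set()
    for i, j in combinations(range(p), 2):
        others = [k for k in range(p) if k not in (i, j)]
        min_cc = min(abs(conditional_cov(S, i, j, c))
                     for r in range(eta + 1)
                     for c in combinations(others, r))
        if min_cc > xi:
            edges.add((i, j))
    return edges
```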