163 research outputs found
Rethinking the Inception Architecture for Computer Vision
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014, very deep convolutional networks have become mainstream, yielding substantial gains on various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we explore ways to scale up networks that aim to utilize the added computation as efficiently as possible through suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set and demonstrate substantial gains over the state of the art: 21.2% top-1 and 5.6% top-5 error for single-frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and fewer than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error.
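The factorization idea mentioned in the abstract can be made concrete with a back-of-the-envelope count: replacing a single 5x5 convolution with a stack of two 3x3 convolutions preserves the 5x5 receptive field while reducing per-position multiply-adds. A minimal sketch (the helper name and the specific 5x5-to-two-3x3 example are illustrative):

```python
def conv_cost(kernel_sizes):
    """Multiply-adds per output position (per in/out channel pair)
    for a stack of square convolution kernels."""
    return sum(k * k for k in kernel_sizes)

cost_5x5 = conv_cost([5])          # one 5x5 kernel: 25 multiply-adds
cost_two_3x3 = conv_cost([3, 3])   # two stacked 3x3 kernels: 18 multiply-adds
savings = 1 - cost_two_3x3 / cost_5x5   # 28% fewer multiply-adds, same receptive field
```

The same arithmetic motivates further factorizations, e.g. splitting an n x n kernel into 1 x n followed by n x 1.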
Searching for simplicity: Approaches to the analysis of neurons and behavior
What fascinates us about animal behavior is its richness and complexity, but
understanding behavior and its neural basis requires a simpler description.
Traditionally, simplification has been imposed by training animals to engage in
a limited set of behaviors, by hand scoring behaviors into discrete classes, or
by limiting the sensory experience of the organism. An alternative is to ask
whether we can search through the dynamics of natural behaviors to find
explicit evidence that these behaviors are simpler than they might have been.
We review two mathematical approaches to simplification, dimensionality
reduction and the maximum entropy method, and we draw on examples from
different levels of biological organization, from the crawling behavior of C.
elegans to the control of smooth pursuit eye movements in primates, and from
the coding of natural scenes by networks of neurons in the retina to the rules
of English spelling. In each case, we argue that the explicit search for
simplicity uncovers new and unexpected features of the biological system, and
that the evidence for simplification gives us a language with which to phrase
new questions for the next generation of experiments. The fact that similar
mathematical structures succeed in taming the complexity of very different
biological systems hints that there is something more general to be discovered.
A Tutorial on the Proper Orthogonal Decomposition
This tutorial introduces the Proper Orthogonal Decomposition (POD) to engineering students and researchers interested in its use in fluid dynamics and aerodynamics. The objectives are firstly to give an intuitive feel for the method and secondly to provide example MATLAB codes of common POD algorithms. The discussion is limited to the finite-dimensional case and only requires knowledge of basic statistics and matrix algebra. The POD is first introduced with a two-dimensional example in order to illustrate the different projections that take place in the decomposition. The n-dimensional case is then developed using experimental data obtained in a turbulent separation-bubble flow and numerical results from simulations of a cylinder wake flow.
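A minimal sketch of the finite-dimensional POD described above, using NumPy rather than the MATLAB codes the tutorial provides; the snapshot matrix here is synthetic random data, and all variable names are illustrative:

```python
import numpy as np

# Minimal POD sketch: snapshots stored as columns of X (n points, m snapshots).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))         # synthetic snapshot matrix (illustrative)
X = X - X.mean(axis=1, keepdims=True)     # subtract the mean field

# POD modes are the left singular vectors of the snapshot matrix;
# modal energies are the squared singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
modes = U                                  # orthonormal spatial POD modes
energy = s**2 / np.sum(s**2)              # fraction of variance captured per mode

# Modal coefficients: project each snapshot onto the modes.
a = modes.T @ X
X_reconstructed = modes @ a                # exact when all modes are retained
```

Truncating to the first r columns of `modes` gives the rank-r approximation that is optimal in the least-squares sense.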
Stimulus-dependent maximum entropy models of neural population codes
Neural populations encode information about their stimulus in a collective
fashion, by joint activity patterns of spiking and silence. A full account of
this mapping from stimulus to neural activity is given by the conditional
probability distribution over neural codewords given the sensory input. To be
able to infer a model for this distribution from large-scale neural recordings,
we introduce a stimulus-dependent maximum entropy (SDME) model---a minimal
extension of the canonical linear-nonlinear model of a single neuron, to a
pairwise-coupled neural population. The model is able to capture the
single-cell response properties as well as the correlations in neural spiking
due to shared stimulus and due to effective neuron-to-neuron connections. Here
we show that in a population of 100 retinal ganglion cells in the salamander
retina responding to temporal white-noise stimuli, dependencies between cells
play an important encoding role. As a result, the SDME model gives a more
accurate account of single cell responses and in particular outperforms
uncoupled models in reproducing the distributions of codewords emitted in
response to a stimulus. We show how the SDME model, in conjunction with static
maximum entropy models of population vocabulary, can be used to estimate
information-theoretic quantities like surprise and information transmission in
a neural population.
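The SDME idea above can be sketched as a pairwise model whose fields depend on the stimulus through linear filters, P(s | x) ∝ exp(Σ_i (k_i · x) s_i + Σ_{i<j} J_ij s_i s_j). A toy illustration (not the authors' code; filters and couplings are arbitrary, and the population is small enough to enumerate all codewords exactly):

```python
import itertools
import numpy as np

N, D = 4, 6                                    # neurons, stimulus dimensions
rng = np.random.default_rng(3)
K = rng.normal(0, 0.5, size=(N, D))            # illustrative linear filters k_i
J = np.triu(rng.normal(0, 0.2, size=(N, N)), k=1)  # pairwise couplings, i < j

def p_codewords(x):
    """Exact conditional distribution over all 2^N codewords given stimulus x."""
    h = K @ x                                  # stimulus-dependent fields
    S = np.array(list(itertools.product([-1, 1], repeat=N)))
    E = S @ h + np.einsum('ki,ij,kj->k', S, J, S)
    P = np.exp(E)
    return S, P / P.sum()

S, P = p_codewords(rng.standard_normal(D))     # codewords and their probabilities
```

Setting all J_ij to zero recovers an uncoupled (independent-neuron) model, which is the comparison the abstract describes.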
Pairwise maximum entropy models for studying large biological systems: when they can and when they can't work
One of the most critical problems we face in the study of biological systems
is building accurate statistical descriptions of them. This problem has been
particularly challenging because biological systems typically contain large
numbers of interacting elements, which precludes the use of standard brute
force approaches. Recently, though, several groups have reported that there may
be an alternate strategy. The reports show that reliable statistical models can
be built without knowledge of all the interactions in a system; instead,
pairwise interactions can suffice. These findings, however, are based on the
analysis of small subsystems. Here we ask whether the observations will
generalize to systems of realistic size, that is, whether pairwise models will
provide reliable descriptions of true biological systems. Our results show
that, in most cases, they will not. The reason is that there is a crossover in
the predictive power of pairwise models: If the size of the subsystem is below
the crossover point, then the results have no predictive power for large
systems. If the size is above the crossover point, the results do have
predictive power. This work thus provides a general framework for determining
the extent to which pairwise models can be used to predict the behavior of
whole biological systems. Applied to neural data, the size of most systems
studied so far is below the crossover point.
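The pairwise models discussed above have the Ising form P(σ) ∝ exp(Σ_i h_i σ_i + Σ_{i<j} J_ij σ_i σ_j). A minimal sketch for a toy population small enough to enumerate every pattern exactly (fields and couplings are arbitrary illustrative values, not fit to data):

```python
import itertools
import numpy as np

# Pairwise maximum entropy (Ising-like) model over binary patterns s in {-1,+1}^N.
N = 5
rng = np.random.default_rng(1)
h = rng.normal(0, 0.5, size=N)           # illustrative fields
J = np.triu(rng.normal(0, 0.3, size=(N, N)), k=1)  # couplings, i < j only

# Enumerate all 2^N patterns and normalize exactly.
patterns = np.array(list(itertools.product([-1, 1], repeat=N)))
energies = patterns @ h + np.einsum('ki,ij,kj->k', patterns, J, patterns)
P = np.exp(energies)
P /= P.sum()

mean_si = P @ patterns                   # model's predicted first moments <s_i>
```

For realistic system sizes the 2^N sum is intractable, which is exactly why the scaling question the abstract raises matters.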
Beyond inverse Ising model: structure of the analytical solution for a class of inverse problems
I consider the problem of deriving couplings of a statistical model from
measured correlations, a task which generalizes the well-known inverse Ising
problem. After recalling that this problem can be mapped onto that of
expressing the entropy of a system as a function of its corresponding
observables, I show the conditions under which this can be done without
resorting to iterative algorithms. I find that inverse problems are local (the
inverse Fisher information is sparse) whenever the corresponding models have a
factorized form, and the entropy can be split in a sum of small cluster
contributions. I illustrate these ideas through two examples (the Ising model
on a tree and the one-dimensional periodic chain with arbitrary order
interaction) and support the results with numerical simulations. The extension
of these methods to more general scenarios is finally discussed.
Effect of coupling asymmetry on mean-field solutions of direct and inverse Sherrington-Kirkpatrick model
We study how the degree of symmetry in the couplings influences the
performance of three mean field methods used for solving the direct and inverse
problems for generalized Sherrington-Kirkpatrick models. In this context, the
direct problem is predicting the potentially time-varying magnetizations. The
three theories include the first and second order Plefka expansions, referred
to as naive mean field (nMF) and TAP, respectively, and a mean field theory
which is exact for fully asymmetric couplings. We call the last of these simply
MF theory. We show that for the direct problem, nMF performs worse than the
other two approximations, TAP outperforms MF when the coupling matrix is nearly
symmetric, while MF works better when it is strongly asymmetric. For the
inverse problem, MF performs better than both TAP and nMF, although an ad hoc
adjustment of TAP can make it comparable to MF. At high temperatures, the
performances of TAP and MF approach each other.
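In the equilibrium (symmetric) setting, the naive mean field inversion referred to above has a closed form: off-diagonal couplings are read off from the inverse of the connected correlation matrix, J^nMF_ij = -(C^-1)_ij for i != j. A minimal sketch on synthetic spin data (purely illustrative; in a real application the samples would be recorded configurations of the system being fit):

```python
import numpy as np

# Naive mean field (nMF) inverse Ising sketch:
#   C_ij = <s_i s_j> - <s_i><s_j>,   J^nMF_ij = -(C^{-1})_ij  for i != j.
rng = np.random.default_rng(2)
N, T = 8, 20000
# Independent random spins just to get a well-conditioned C; the recovered
# couplings should then be near zero, consistent with no true interactions.
s = np.where(rng.random((T, N)) < 0.5, -1, 1).astype(float)
C = np.cov(s, rowvar=False)               # connected correlation matrix

J_nmf = -np.linalg.inv(C)
np.fill_diagonal(J_nmf, 0.0)              # only off-diagonal entries are couplings
```

The TAP inversion adds an Onsager correction term on top of this estimate; for asymmetric couplings the appropriate MF equations differ, which is the regime the abstract compares.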