Hamilton-Jacobi Theory and Information Geometry
Recently, a method to dynamically define a divergence function for a given statistical manifold by means of the Hamilton-Jacobi theory associated with a suitable Lagrangian function has been proposed. Here we review this construction and lay the basis for an inverse problem, where we assume the divergence function to be known and look for a Lagrangian function for which it is a complete solution of the associated Hamilton-Jacobi theory. To apply these ideas to quantum systems, we have to replace probability distributions with probability amplitudes.
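As a concrete classical instance of the divergence-from-action idea (a textbook Hamilton-Jacobi fact, not the paper's general construction): for the free Lagrangian L = ½‖q̇‖², the Hamilton principal function solves the Hamilton-Jacobi equation and, evaluated at unit time, yields the squared Euclidean distance, which is the canonical divergence of flat geometry.

```latex
% Free-particle Hamilton-Jacobi equation and its principal function;
% at t = 1 the action reproduces the squared-distance divergence.
\[
  \frac{\partial S}{\partial t} + \tfrac{1}{2}\,\lVert \nabla_q S \rVert^{2} = 0,
  \qquad
  S(q, q_0; t) = \frac{\lVert q - q_0 \rVert^{2}}{2t},
  \qquad
  D(q, q_0) := S(q, q_0; 1) = \tfrac{1}{2}\lVert q - q_0 \rVert^{2}.
\]
```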
A Formalization of The Natural Gradient Method for General Similarity Measures
In optimization, the natural gradient method is well-known for likelihood
maximization. The method uses the Kullback-Leibler divergence, corresponding
infinitesimally to the Fisher-Rao metric, which is pulled back to the parameter
space of a family of probability distributions. This way, gradients with
respect to the parameters respect the Fisher-Rao geometry of the space of
distributions, which might differ vastly from the standard Euclidean geometry
of the parameter space, often leading to faster convergence. However, when
minimizing an arbitrary similarity measure between distributions, it is
generally unclear which metric to use. We provide a general framework that,
given a similarity measure, derives a metric for the natural gradient. We then
discuss connections between the natural gradient method and multiple other
optimization techniques in the literature. Finally, we provide computations of
the formal natural gradient to show overlap with well-known cases and to
compute natural gradients in novel frameworks.
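To make the update concrete, here is a minimal sketch (not code from the paper) of a natural-gradient step for a univariate Gaussian model, whose Fisher matrix diag(1/σ², 2/σ²) in (μ, σ) coordinates is a standard closed form; the function names and learning rate are illustrative.

```python
import numpy as np

def nll_grad(mu, sigma, x):
    """Gradient of the average negative log-likelihood of N(mu, sigma^2)."""
    d_mu = -(x - mu).mean() / sigma**2
    d_sigma = (1.0 / sigma) - ((x - mu) ** 2).mean() / sigma**3
    return np.array([d_mu, d_sigma])

def natural_gradient_step(mu, sigma, x, lr=0.1):
    g = nll_grad(mu, sigma, x)
    fisher = np.diag([1.0 / sigma**2, 2.0 / sigma**2])  # Fisher-Rao metric in (mu, sigma)
    nat_g = np.linalg.solve(fisher, g)  # F^{-1} grad: steepest descent in the pulled-back geometry
    return np.array([mu, sigma]) - lr * nat_g

rng = np.random.default_rng(0)
x = rng.normal(3.0, 2.0, size=1000)
theta = np.array([0.0, 1.0])
for _ in range(100):
    theta = natural_gradient_step(theta[0], theta[1], x)
print(theta)  # approaches (3, 2)
```

Because the inverse Fisher matrix rescales the σ direction by σ²/2, steps automatically adapt to the curvature of the statistical model, which is the source of the faster convergence mentioned above.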
Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence
Incremental learning (IL) has received a lot of attention recently, however,
the literature lacks a precise problem definition, proper evaluation settings,
and metrics tailored specifically for the IL problem. One of the main
objectives of this work is to fill these gaps so as to provide a common ground
for better understanding of IL. The main challenge for an IL algorithm is to
update the classifier whilst preserving existing knowledge. We observe that, in addition to forgetting, a known issue when preserving knowledge, IL also suffers from a problem we call intransigence: the inability of a model to update its knowledge. We introduce two metrics to quantify forgetting and
intransigence that allow us to understand, analyse, and gain better insights
into the behaviour of IL algorithms. We present RWalk, a generalization of
EWC++ (our efficient version of EWC [Kirkpatrick2016EWC]) and Path Integral
[Zenke2017Continual] with a theoretically grounded KL-divergence based
perspective. We provide a thorough analysis of various IL algorithms on MNIST
and CIFAR-100 datasets. In these experiments, RWalk obtains superior results in
terms of accuracy and also provides a better trade-off between forgetting and intransigence.
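A minimal sketch of the two metrics, following their verbal definitions above (the array names acc and ref are hypothetical): forgetting is the drop from the best past accuracy on an old task, and intransigence is the gap to a reference model trained jointly on all tasks seen so far.

```python
import numpy as np

# acc[k, j]: test accuracy on task j after training through task k (k >= j).
# ref[k]: accuracy of a reference model trained jointly on tasks 0..k.

def forgetting(acc, k):
    """Average forgetting after task k (k >= 1): drop from best past accuracy."""
    drops = [acc[:k, j].max() - acc[k, j] for j in range(k)]
    return float(np.mean(drops))

def intransigence(acc, ref, k):
    """Gap between the joint reference model and the IL model on new task k."""
    return float(ref[k] - acc[k, k])
```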
The Bregman chord divergence
Distances are fundamental primitives whose choice significantly impacts the performance of algorithms in machine learning and signal processing. However, selecting the most appropriate distance for a given task is an endeavor in its own right. Instead of testing the entries of an ever-expanding dictionary of ad hoc distances one by one, one would rather consider parametric classes of distances that are exhaustively characterized by axioms derived from first principles. Bregman divergences are such a class. However, fine-tuning a Bregman divergence is delicate, since it requires smoothly adjusting a functional generator. In this work, we propose an extension of Bregman divergences called the Bregman chord divergences. This new class of distances does not require gradient calculations, uses two scalar parameters that can be easily tailored in applications, and asymptotically generalizes Bregman divergences.
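To illustrate the gradient-free construction, here is a hedged sketch; the interpolation convention and parameter ranges are my own assumptions, not necessarily the paper's exact definition. The tangent line of the generator F at q, which requires ∇F, is replaced by a chord of F through two interpolated points; for convex F the extended chord lies below F outside the interpolation interval, so the gap at p is nonnegative.

```python
import numpy as np

def bregman_chord(F, p, q, alpha, beta):
    """Chord-based divergence sketch: 0 < alpha < beta <= 1, F convex."""
    x = lambda g: (1 - g) * p + g * q               # x(0) = p, x(1) = q
    slope = (F(x(beta)) - F(x(alpha))) / (beta - alpha)
    chord_at_p = F(x(alpha)) + (0.0 - alpha) * slope  # chord extended to g = 0
    return F(p) - chord_at_p                          # gap between F(p) and the chord

F = lambda v: float(np.dot(v, v))                     # generator F(v) = ||v||^2
p, q = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(bregman_chord(F, p, q, 0.999, 0.9999))          # ~2.0, i.e. ||p - q||^2
```

As both scalar parameters tend to 1, the chord tends to the tangent at q, and the ordinary Bregman divergence F(p) − F(q) − ⟨∇F(q), p − q⟩ is recovered, matching the asymptotic generalization claimed above.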
Learning Adaptive Regularization for Image Labeling Using Geometric Assignment
We study the inverse problem of model parameter learning for pixelwise image
labeling, using the linear assignment flow and training data with ground truth.
This is accomplished by a Riemannian gradient flow on the manifold of
parameters that determine the regularization properties of the assignment flow.
Using the symplectic partitioned Runge-Kutta method for numerical integration, it is shown that deriving the sensitivity conditions of the parameter learning problem and discretizing it are operations that commute. A convenient property of our approach
is that learning is based on exact inference. Carefully designed experiments
demonstrate the performance of our approach, the expressiveness of the
mathematical model as well as its limitations, from the viewpoint of
statistical learning and optimal control.
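The paper's parameter manifold and flow are specific to the assignment-flow setting; as a generic illustration only, the following sketch performs a Riemannian gradient step on the probability simplex under the Fisher-Rao geometry, using the standard multiplicative (softmax) retraction. All names are illustrative, and the update stands in for, rather than reproduces, the paper's scheme.

```python
import numpy as np

def riemannian_step(p, euclidean_grad, lr=0.5):
    """One Fisher-Rao gradient-descent step that stays on the simplex."""
    q = p * np.exp(-lr * euclidean_grad)  # multiplicative update along e-geodesics
    return q / q.sum()                    # renormalize to a probability vector

p = np.full(4, 0.25)
grad = np.array([1.0, 0.0, 0.0, 0.0])     # stand-in for dE/dp from a training loss
for _ in range(20):
    p = riemannian_step(p, grad)
print(p)                                  # mass moves away from coordinate 0
```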
Warped Riemannian metrics for location-scale models
The present paper shows that warped Riemannian metrics, a class of Riemannian
metrics which play a prominent role in Riemannian geometry, are also of
fundamental importance in information geometry. Precisely, the paper features a
new theorem, which states that the Rao-Fisher information metric of any
location-scale model, defined on a Riemannian manifold, is a warped Riemannian
metric, whenever this model is invariant under the action of some Lie group.
This theorem is a valuable tool in finding the expression of the Rao-Fisher
information metric of location-scale models defined on high-dimensional
Riemannian manifolds. Indeed, a warped Riemannian metric is fully determined by
only two functions of a single variable, irrespective of the dimension of the
underlying Riemannian manifold. Starting from this theorem, several original
contributions are made. The expression of the Rao-Fisher information metric of
the Riemannian Gaussian model is provided, for the first time in the
literature. A generalised definition of the Mahalanobis distance is introduced,
which is applicable to any location-scale model defined on a Riemannian
manifold. The solution of the geodesic equation is obtained, for any Rao-Fisher
information metric defined in terms of warped Riemannian metrics. Finally,
using a mixture of analytical and numerical computations, it is shown that the parameter space of the von Mises-Fisher model of directional data, when equipped with its Rao-Fisher information metric, becomes a Hadamard manifold, a simply connected, complete Riemannian manifold of negative sectional curvature, for the dimensions considered. Hopefully, in upcoming work, this will be proved in any dimension.
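As a concrete illustration of the two-function economy (a standard fact about the univariate Gaussian location-scale model, not a result specific to the paper): the Rao-Fisher metric of N(μ, σ²) becomes a warped metric after a change of the scale variable, fully determined by a single warping function β.

```latex
% Rao-Fisher metric of N(mu, sigma^2) rewritten as a warped metric
% dr^2 + beta(r)^2 dmu^2 via the substitution r = sqrt(2) log(sigma).
\[
  ds^{2} = \frac{d\mu^{2}}{\sigma^{2}} + \frac{2\,d\sigma^{2}}{\sigma^{2}}
  \;\xrightarrow{\;r=\sqrt{2}\log\sigma\;}\;
  ds^{2} = dr^{2} + e^{-\sqrt{2}\,r}\,d\mu^{2},
  \qquad \beta(r) = e^{-r/\sqrt{2}}.
\]
```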
Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets
This paper targets the problem of image set-based face verification and
identification. Unlike the traditional single-medium (an image or video) setting, we encounter a set of heterogeneous contents containing orderless images and videos. The importance of each image is usually considered either equal or based on its independent quality assessment. How to model the relationship of
orderless images within a set remains a challenge. We address this problem by
formulating it as a Markov Decision Process (MDP) in the latent space.
Specifically, we first present a dependency-aware attention control (DAC)
network, which resorts to actor-critic reinforcement learning for sequential
attention decision of each image embedding to fully exploit the rich
correlation cues among the unordered images. Moreover, we introduce its
sample-efficient variant with off-policy experience replay to speed up the
learning process. The pose-guided representation scheme can further boost the
performance at the extremes of the pose variation.
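Setting the reinforcement-learning machinery aside, the aggregation that the attention controls can be sketched as weighted pooling of per-image embeddings; in the following toy sketch a plain scoring function stands in for the learned actor-critic policy, so this is a simplification rather than the paper's DAC network.

```python
import numpy as np

def aggregate(embeddings, score_fn):
    """embeddings: (n_images, d). Returns one d-dim set representation."""
    scores = np.array([score_fn(e) for e in embeddings])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax attention over the set
    return weights @ embeddings           # attention-weighted pooling

feats = np.random.default_rng(0).normal(size=(5, 128))
rep = aggregate(feats, score_fn=lambda e: float(np.linalg.norm(e)))
print(rep.shape)  # (128,)
```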
State-Space Analysis of Time-Varying Higher-Order Spike Correlation for Multiple Neural Spike Train Data
Precise spike coordination between the spiking activities of multiple neurons is suggested as an indication of coordinated network activity in active cell assemblies. Spike correlation analysis aims to identify such cooperative network activity by detecting excess spike synchrony in simultaneously recorded multiple neural spike sequences. Cooperative activity is expected to organize dynamically during behavior and cognition; therefore, currently available analysis techniques must be extended to enable the estimation of multiple time-varying spike interactions between neurons simultaneously. In particular, new methods must take advantage of the simultaneous observations of multiple neurons by addressing their higher-order dependencies, which cannot be revealed by pairwise analyses alone. In this paper, we develop a method for estimating time-varying spike interactions by means of a state-space analysis. Discretized parallel spike sequences are modeled as multivariate binary processes using a log-linear model that provides a well-defined measure of higher-order spike correlation in an information geometry framework. We construct a recursive Bayesian filter/smoother for the extraction of spike interaction parameters. This method can simultaneously estimate the dynamic pairwise spike interactions of multiple single neurons, thereby extending the Ising/spin-glass model analysis of multiple neural spike train data to a nonstationary analysis. Furthermore, the method can estimate dynamic higher-order spike interactions. To validate the inclusion of the higher-order terms in the model, we construct an approximation method to assess the goodness-of-fit to spike data. In addition, we formulate a test method for the presence of higher-order spike correlation even in nonstationary spike data, e.g., data from awake behaving animals. The utility of the proposed methods is tested using simulated spike data with known underlying correlation dynamics. Finally, we apply the methods to neural spike data simultaneously recorded from the motor cortex of an awake monkey and demonstrate that the higher-order spike correlation organizes dynamically in relation to a behavioral demand.
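The log-linear model referred to above can be written down generically (my notation; the paper's state-space machinery for tracking time-varying parameters is not reproduced here): for N binary spike variables, interaction parameters are indexed by subsets of neurons, and a nonzero parameter on a subset of size three or more is precisely a higher-order spike correlation.

```python
import itertools
import numpy as np

# log p(x) = sum_S theta_S * prod_{i in S} x_i - psi(theta), x in {0,1}^N,
# where psi is the log-partition function that normalizes the distribution.

def log_linear_pmf(theta, n):
    """theta: dict mapping tuples of neuron indices to interaction strengths."""
    states = np.array(list(itertools.product([0, 1], repeat=n)))
    logits = np.array([
        sum(th for S, th in theta.items() if all(x[i] for i in S))
        for x in states
    ])
    psi = np.log(np.exp(logits).sum())    # log-partition function
    return states, np.exp(logits - psi)

theta = {(0,): -1.0, (1,): -1.0, (2,): -1.0,
         (0, 1): 0.5, (0, 1, 2): 1.2}     # the triplet term is a higher-order interaction
states, p = log_linear_pmf(theta, 3)
print(p.sum())                            # 1.0
```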
Hyperbolic planforms in relation to visual edges and textures perception
We propose to use bifurcation theory and pattern formation as theoretical
probes for various hypotheses about the neural organization of the brain. This
allows us to make predictions about the kinds of patterns that should be
observed in the activity of real brains through, e.g., optical imaging, and
opens the door to the design of experiments to test these hypotheses. We study
the specific problem of the perception of visual edges and textures, and suggest that
these features may be represented at the population level in the visual cortex
as a specific second-order tensor, the structure tensor, perhaps within a
hypercolumn. We then extend the classical ring model to this case and show that
its natural framework is the non-Euclidean hyperbolic geometry. This brings in
the beautiful structure of its group of isometries and certain of its subgroups
which have a direct interpretation in terms of the organization of the neural
populations that are assumed to encode the structure tensor. By studying the
bifurcations of the solutions of the structure tensor equations, the analog of
the classical Wilson and Cowan equations, under the assumption of invariance
with respect to the action of these subgroups, we predict the appearance of
characteristic patterns. These patterns can be described by what we call
hyperbolic or H-planforms that are reminiscent of Euclidean planar waves and of
the planforms that were used in [1, 2] to account for some visual
hallucinations. If these patterns could be observed through brain imaging
techniques they would reveal the built-in or acquired invariance of the neural
organization to the action of the corresponding subgroups.
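For reference, the structure tensor mentioned above is the classical image-processing object, a smoothed outer product of image gradients; the sketch below computes it per pixel (the standard definition, not the paper's neural-population encoding of it). The resulting field of symmetric positive semi-definite 2x2 matrices is the space whose hyperbolic geometry the analysis exploits.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_tensor(image, sigma=2.0):
    """Per-pixel 2x2 structure tensor T = G_sigma * (grad I grad I^T)."""
    Iy, Ix = np.gradient(image.astype(float))   # image gradients along rows/cols
    Txx = gaussian_filter(Ix * Ix, sigma)
    Txy = gaussian_filter(Ix * Iy, sigma)
    Tyy = gaussian_filter(Iy * Iy, sigma)
    return Txx, Txy, Tyy                         # entries of the tensor field
```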