Replica method for eigenvalues of real Wishart product matrices
We show how the replica method can be used to compute the asymptotic
eigenvalue spectrum of a real Wishart product matrix. For unstructured factors,
this provides a compact, elementary derivation of a polynomial condition on the
Stieltjes transform first proved by Müller [IEEE Trans. Inf. Theory 48,
2086-2091 (2002)]. We then show how this computation can be extended to
ensembles where the factors have correlated rows. Finally, we derive polynomial
conditions on the average values of the minimum and maximum eigenvalues, which
match the results obtained by Akemann, Ipsen, and Kieburg [Phys. Rev. E 88,
052118 (2013)] for the complex Wishart product ensemble.
Comment: 35 pages, 4 figures
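A minimal numerical sketch of the object studied here (our illustration, not code from the paper): sample a real Wishart product matrix, compute its eigenvalues, and evaluate the empirical Stieltjes transform. The matrix size n, the number of factors L, and the choice of square factors are all assumed values.

```python
# Sketch: empirical spectrum of a real Wishart product matrix.
import numpy as np

rng = np.random.default_rng(0)

n, L = 500, 2          # matrix size and number of factors (assumed values)
# Product of L independent Gaussian factors; square n x n factors for simplicity.
X = np.eye(n)
for _ in range(L):
    X = X @ (rng.standard_normal((n, n)) / np.sqrt(n))

W = X @ X.T            # real Wishart product matrix
eigs = np.linalg.eigvalsh(W)

# Empirical Stieltjes transform g(z) = (1/n) tr (W - z I)^{-1},
# evaluated just off the real axis; a polynomial condition on g(z)
# can be checked against such samples.
z = 1.0 + 1e-3j
g = np.mean(1.0 / (eigs - z))
print(g)
```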
Asymptotics of representation learning in finite Bayesian neural networks
Recent works have suggested that finite Bayesian neural networks may
sometimes outperform their infinite cousins because finite networks can
flexibly adapt their internal representations. However, our theoretical
understanding of how the learned hidden layer representations of finite
networks differ from the fixed representations of infinite networks remains
incomplete. Perturbative finite-width corrections to the network prior and
posterior have been studied, but the asymptotics of learned features have not
been fully characterized. Here, we argue that the leading finite-width
corrections to the average feature kernels for any Bayesian network with linear
readout and Gaussian likelihood have a largely universal form. We illustrate
this explicitly for three tractable network architectures: deep linear
fully-connected and convolutional networks, and networks with a single
nonlinear hidden layer. Our results begin to elucidate how task-relevant
learning signals shape the hidden layer representations of wide Bayesian neural
networks.
Comment: 13+28 pages, 4 figures; v3: extensive revision with improved
exposition and new section on CNNs, accepted to NeurIPS 2021; v4: minor
updates to supplementary material
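As a toy illustration of the width dependence discussed above (our sketch, not the paper's code), the following estimates the average feature kernel of a one-hidden-layer network at several widths; the tanh nonlinearity, dimensions, and number of weight draws are all assumptions.

```python
# Sketch: width dependence of the average feature kernel of a
# one-hidden-layer network with random Gaussian weights.
import numpy as np

rng = np.random.default_rng(1)

d, p = 10, 20                      # input dimension, number of inputs (assumed)
X = rng.standard_normal((p, d))

def feature_kernel(width):
    """Kernel K = phi(XW) phi(XW)^T / width for one draw of the weights."""
    W = rng.standard_normal((d, width)) / np.sqrt(d)
    phi = np.tanh(X @ W)           # nonlinearity is an arbitrary choice here
    return phi @ phi.T / width

for width in (50, 500, 5000):
    # Average over draws to approximate the mean kernel at this width.
    K = np.mean([feature_kernel(width) for _ in range(100)], axis=0)
    print(width, K[0, 0])
```

At infinite width the average kernel approaches a fixed limit, so the drift of the printed values with width is a crude view of the finite-width corrections the paper characterizes.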
Long Sequence Hopfield Memory
Sequence memory is an essential attribute of natural and artificial
intelligence that enables agents to encode, store, and retrieve complex
sequences of stimuli and actions. Computational models of sequence memory have
been proposed where recurrent Hopfield-like neural networks are trained with
temporally asymmetric Hebbian rules. However, these networks suffer from
limited sequence capacity (maximal length of the stored sequence) due to
interference between the memories. Inspired by recent work on Dense Associative
Memories, we expand the sequence capacity of these models by introducing a
nonlinear interaction term, enhancing separation between the patterns. We
derive novel scaling laws for sequence capacity with respect to network size
that significantly improve on those of models based on traditional Hopfield
networks, and we verify these theoretical results with numerical simulations.
Moreover, we introduce a generalized pseudoinverse rule
to recall sequences of highly correlated patterns. Finally, we extend this
model to store sequences with variable timing between state transitions and
describe a biologically plausible implementation, with connections to motor
neuroscience.
Comment: NeurIPS 2023 Camera-Ready, 41 pages
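A hedged sketch of the general mechanism (not the paper's exact model): a temporally asymmetric rule in which the overlap with pattern mu drives the network toward pattern mu+1, with overlaps passed through a polynomial separation function in the style of Dense Associative Memories. The network size, sequence length, and cubic nonlinearity below are assumed values.

```python
# Sketch: sequence recall with a temporally asymmetric rule and a
# polynomial separation function between stored patterns.
import numpy as np

rng = np.random.default_rng(2)

N, P, power = 200, 30, 3                   # neurons, sequence length, degree (assumed)
xi = rng.choice([-1.0, 1.0], size=(P, N))  # random binary patterns forming the sequence

def step(x):
    """Next state is driven by pattern mu+1, weighted by a nonlinear
    function of the overlap between the current state and pattern mu."""
    overlaps = xi[:-1] @ x / N             # overlaps with patterns 1..P-1
    field = xi[1:].T @ (overlaps ** power) # odd power preserves the sign
    return np.sign(field)

x = xi[0].copy()                           # initialize at the first pattern
for t in range(P - 1):
    x = step(x)
    print(t + 1, float(xi[t + 1] @ x / N)) # overlap with the expected next pattern
```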
Learning curves for deep structured Gaussian feature models
In recent years, significant attention in deep learning theory has been
devoted to analyzing the generalization performance of models with multiple
layers of Gaussian random features. However, few works have considered the
effect of feature anisotropy; most assume that features are generated using
independent and identically distributed Gaussian weights. Here, we derive
learning curves for models with many layers of structured Gaussian features. We
show that allowing correlations between the rows of the first layer of features
can aid generalization, while structure in later layers is generally
detrimental. Our results shed light on how weight structure affects
generalization in a simple class of solvable models.
Comment: 28 pages, 3 figures
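An illustrative numerical sketch under stated assumptions (not the paper's setup): an empirical learning curve for ridge regression on deep Gaussian random features, where "structure" is modeled as correlated rows in the first feature layer via a toy covariance; all dimensions, the linear teacher, and the covariance form are our choices.

```python
# Sketch: ridge regression learning curve with structured first-layer features.
import numpy as np

rng = np.random.default_rng(3)

d, widths, lam = 50, [100, 100], 1e-3   # input dim, layer widths, ridge (assumed)

def draw_weights(corr):
    """Two layers of Gaussian features; corr > 0 correlates rows of layer 1."""
    C = (1 - corr) * np.eye(d) + corr * np.ones((d, d)) / d  # toy row covariance
    Ws = [rng.multivariate_normal(np.zeros(d), C, size=widths[0]) / np.sqrt(d)]
    n_in = widths[0]
    for n_out in widths[1:]:
        Ws.append(rng.standard_normal((n_out, n_in)) / np.sqrt(n_in))
        n_in = n_out
    return Ws

def features(X, Ws):
    H = X
    for W in Ws:
        H = H @ W.T
    return H

w_star = rng.standard_normal(d) / np.sqrt(d)    # linear teacher (assumed target)
for corr in (0.0, 0.5):
    for p in (20, 80, 320):                     # training set sizes
        Xtr, Xte = rng.standard_normal((p, d)), rng.standard_normal((1000, d))
        ytr, yte = Xtr @ w_star, Xte @ w_star
        Ws = draw_weights(corr)                 # same weights for train and test
        Ftr, Fte = features(Xtr, Ws), features(Xte, Ws)
        a = np.linalg.solve(Ftr.T @ Ftr + lam * np.eye(Ftr.shape[1]), Ftr.T @ ytr)
        print(corr, p, float(np.mean((Fte @ a - yte) ** 2)))
```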
Tripod gait example data.
MATLAB version 7.3 .mat file, created using MATLAB 9.4
Dataset of responses of freely-walking flies to translating random dot visual stimuli.
MATLAB version 7.3 .mat file, created using MATLAB 9.4
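Both dataset files are MATLAB v7.3 .mat files, which are HDF5 containers, so they can be inspected without MATLAB, for example with h5py in Python. The file name below is a placeholder and the variable names inside the files are not documented here.

```python
# Sketch: inspect a MATLAB v7.3 .mat file (HDF5 format) without MATLAB.
import h5py

with h5py.File("tripod_gait_example.mat", "r") as f:  # hypothetical file name
    f.visit(print)                  # list the variables stored in the file
    # data = f["some_variable"][:]  # read one variable as a NumPy array
```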