Atomic Resolution Imaging of Currents in Nanoscopic Quantum Networks via Scanning Tunneling Microscopy
We propose a new method for atomic-scale imaging of spatial current patterns
in nanoscopic quantum networks by using scanning tunneling microscopy (STM). By
measuring the current flowing from the STM tip into one of the leads attached
to the network as a function of tip position, one obtains an atomically
resolved spatial image of "current riverbeds" whose spatial structure reflects
the coherent flow of electrons out of equilibrium. We show that this method can
be successfully applied in a variety of network topologies, and is robust against
dephasing effects.
Comment: 5 pages
Current Eigenmodes and Dephasing in Nanoscopic Quantum Networks
Using the non-equilibrium Keldysh Green's function formalism, we show that
the non-equilibrium charge transport in nanoscopic quantum networks takes place
via current eigenmodes that possess characteristic spatial patterns. We
identify the microscopic relation between the current patterns and the
network's electronic structure and topology and demonstrate that these patterns
can be selected via gating or constrictions, providing new venues for
manipulating charge transport at the nanoscale. Finally, decreasing the
dephasing time leads to a smooth evolution of the current patterns from those
of a ballistic quantum network to those of a classical resistor network.
Comment: 6 pages, 4 figures
Flatter, faster: scaling momentum for optimal speedup of SGD
Commonly used optimization algorithms often show a trade-off between good
generalization and fast training times. For instance, stochastic gradient
descent (SGD) tends to have good generalization; however, adaptive gradient
methods have superior training times. Momentum can help accelerate training
with SGD, but so far there has been no principled way to select the momentum
hyperparameter. Here we study training dynamics arising from the interplay
between SGD with label noise and momentum in the training of overparametrized
neural networks. We find that scaling the momentum hyperparameter $1-\beta$
with the learning rate to the power of $2/3$ maximally accelerates training,
without sacrificing generalization. To analytically derive this result we
develop an architecture-independent framework, where the main assumption is the
existence of a degenerate manifold of global minimizers, as is natural in
overparametrized models. Training dynamics display the emergence of two
characteristic timescales that are well-separated for generic values of the
hyperparameters. The maximum acceleration of training is reached when these two
timescales meet, which in turn determines the scaling limit we propose. We
confirm our scaling rule for synthetic regression problems (matrix sensing and
teacher-student paradigm) and classification for realistic datasets (ResNet-18
on CIFAR10, 6-layer MLP on FashionMNIST), suggesting the robustness of our
scaling rule to variations in architectures and datasets.
Comment: v2: expanded introduction section, corrected minor typos. v1: 12+13
pages, 3 figures
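The scaling rule described in this abstract can be illustrated with a minimal sketch, assuming a standard heavy-ball update. The function names, the prefactor `c`, and the toy quadratic loss are hypothetical and are not taken from the paper; the exponent is passed in explicitly, and the value 2/3 in the example call is an assumption of this sketch.

```python
def scaled_momentum(lr, exponent, c=1.0):
    """Return beta such that (1 - beta) = c * lr**exponent, with beta kept in [0, 1]."""
    one_minus_beta = min(1.0, c * lr ** exponent)
    return 1.0 - one_minus_beta


def sgd_momentum_step(w, v, grad, lr, beta):
    """One heavy-ball step: accumulate the gradient in v, then move w along -v."""
    v = beta * v + grad
    w = w - lr * v
    return w, v


def run_quadratic(lr, exponent, steps=5000, w0=1.0):
    """Minimize the toy loss f(w) = 0.5 * w**2 (gradient = w) with scaled momentum."""
    beta = scaled_momentum(lr, exponent)
    w, v = w0, 0.0
    for _ in range(steps):
        w, v = sgd_momentum_step(w, v, grad=w, lr=lr, beta=beta)
    return w, beta


# Example call with an illustrative learning rate and exponent (both assumptions).
final_w, beta = run_quadratic(lr=1e-3, exponent=2.0 / 3.0)
```

Tying `beta` to `lr` this way leaves a single hyperparameter to tune; the deterministic toy problem omits the label noise that the abstract's analysis relies on.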
Jo GULDI-David ARMITAGE, Tarih Manifestosu (The History Manifesto), trans. Serpil Çağlayan, Türkiye İş Bankası Yayınları, İstanbul 2016. VII+182 pages
[Abstract Not Available]
Trainability, Expressivity and Interpretability in Gated Neural ODEs
Understanding how the dynamics in biological and artificial neural networks
implement the computations required for a task is a salient open question in
machine learning and neuroscience. In particular, computations requiring
complex memory storage and retrieval pose a significant challenge for these
networks to implement or learn. Recently, a family of models described by
neural ordinary differential equations (nODEs) has emerged as powerful
dynamical neural network models capable of capturing complex dynamics. Here, we
extend nODEs by endowing them with adaptive timescales using gating
interactions. We refer to these as gated neural ODEs (gnODEs). Using a task
that requires memory of continuous quantities, we demonstrate the inductive
bias of the gnODEs to learn (approximate) continuous attractors. We further
show how reduced-dimensional gnODEs retain their modeling power while greatly
improving interpretability, even allowing explicit visualization of the
structure of learned attractors. We introduce a novel measure of expressivity
which probes the capacity of a neural network to generate complex trajectories.
Using this measure, we explore how the phase-space dimension of the nODEs and
the complexity of the function modeling the flow field contribute to
expressivity. We see that a more complex function for modeling the flow field
allows a lower-dimensional nODE to capture a given target dynamics. Finally, we
demonstrate the benefit of gating in nODEs on several real-world tasks.
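As a rough illustration of what "adaptive timescales using gating interactions" could look like, the sketch below integrates a small gated ODE with forward Euler. The specific functional form (a sigmoid gate multiplying a leak-plus-tanh drive), the weight names `W_f` and `W_g`, and the step size are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gated_rhs(h, W_f, W_g):
    """Right-hand side dh/dt = gate(h) * (-h + tanh(W_f h)).

    The gate lies in (0, 1) and rescales each unit's rate of change,
    so units with a small gate value evolve slowly.
    """
    gate = sigmoid(W_g @ h)
    return gate * (-h + np.tanh(W_f @ h))


def integrate(h0, W_f, W_g, dt=0.05, steps=200):
    """Forward-Euler integration; returns an array of shape (steps + 1, n)."""
    h = np.array(h0, dtype=float)
    trajectory = [h.copy()]
    for _ in range(steps):
        h = h + dt * gated_rhs(h, W_f, W_g)
        trajectory.append(h.copy())
    return np.stack(trajectory)


# Example with random weights (hypothetical, untrained parameters).
rng = np.random.default_rng(0)
n = 3
W_f = rng.normal(size=(n, n))
W_g = rng.normal(size=(n, n))
traj = integrate(rng.uniform(-1.0, 1.0, size=n), W_f, W_g)
```

With this form, each Euler step is a convex combination of the current state and a tanh output, so a state that starts inside [-1, 1] stays there.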
Using large language models to study human memory for meaningful narratives
One of the most impressive achievements of the AI revolution is the
development of large language models that can generate meaningful text and
respond to instructions in plain English with no additional training necessary.
Here we show that language models can be used as a scientific instrument for
studying human memory for meaningful material. We developed a pipeline for
designing large scale memory experiments and analyzing the obtained results. We
performed online memory experiments with a large number of participants and
collected recognition and recall data for narratives of different lengths. We
found that both recall and recognition performance scale linearly with
narrative length. Furthermore, in order to investigate the role of narrative
comprehension in memory, we repeated these experiments using scrambled versions
of the presented stories. We found that even though recall performance declined
significantly, recognition remained largely unaffected. Interestingly, recalls
in this condition seem to follow the original narrative order rather than the
scrambled presentation, pointing to a contextual reconstruction of the story in
memory.
Comment: v2: 43 pages, with added discussion and a new appendix.
Spectral Transitions and Universal Steady-States in Random Kraus Maps and Circuits
The study of dissipation and decoherence in generic open quantum systems
recently led to the investigation of spectral and steady-state properties of
random Lindbladian dynamics. A natural question is then how realistic and
universal those properties are. Here, we address these issues by considering a
different description of dissipative quantum systems, namely the discrete-time
Kraus map representation of completely positive quantum dynamics. Through
random matrix theory (RMT) techniques and numerical exact diagonalization, we
study random Kraus maps, allowing for a varying dissipation strength, and their
local circuit counterpart. We find the spectrum of the random Kraus map to be
either an annulus or a disk inside the unit circle in the complex plane, with a
transition between the two cases taking place at a critical value of
dissipation strength. The eigenvalue distribution and the spectral transition
are well described by a simplified RMT model that we can solve exactly in the
thermodynamic limit, by means of non-Hermitian RMT and quaternionic free
probability. The steady-state, on the contrary, is not affected by the spectral
transition. It has, however, a perturbative crossover regime at small
dissipation, inside which the steady-state is characterized by uncorrelated
eigenvalues. At large dissipation (or for any dissipation for a large-enough
system) the steady-state is well described by a random Wishart matrix. The
steady-state properties thus coincide with those already observed for random
Lindbladian dynamics, indicating their universality. Quite remarkably, the
statistical properties of the local Kraus circuit are qualitatively the same as
those of the nonlocal Kraus map, indicating that the latter, which is more
tractable, already captures the realistic and universal physical properties of
generic open quantum systems.
Comment: 14 pages, 8 figures
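To make the Kraus-map picture concrete, the following sketch builds a generic random quantum channel and diagonalizes its superoperator numerically. It is not the ensemble studied in the paper (there is no tunable dissipation strength here); the construction from a random isometry and all function names are assumptions chosen for illustration.

```python
import numpy as np


def random_kraus_operators(d, r, rng):
    """Return r Kraus operators of size d x d with sum_k K_k^dagger K_k = I.

    A complex Gaussian (r*d) x d matrix is orthonormalized by a reduced QR
    decomposition; its d x d row blocks then form a valid set of Kraus operators.
    """
    g = rng.normal(size=(r * d, d)) + 1j * rng.normal(size=(r * d, d))
    v, _ = np.linalg.qr(g)  # v has orthonormal columns: v^dagger v = I
    return [v[k * d:(k + 1) * d, :] for k in range(r)]


def superoperator(kraus):
    """Matrix of rho -> sum_k K_k rho K_k^dagger acting on row-major vec(rho)."""
    return sum(np.kron(k, k.conj()) for k in kraus)


def spectrum(kraus):
    """Eigenvalues of the channel; all lie in the closed unit disk."""
    return np.linalg.eigvals(superoperator(kraus))


rng = np.random.default_rng(0)
kraus = random_kraus_operators(d=4, r=3, rng=rng)
eigenvalues = spectrum(kraus)
```

Plotting `eigenvalues` in the complex plane for larger `d` gives a picture of the bulk spectrum, with one eigenvalue pinned at 1 that corresponds to the steady state.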