
    Atomic Resolution Imaging of Currents in Nanoscopic Quantum Networks via Scanning Tunneling Microscopy

    We propose a new method for atomic-scale imaging of spatial current patterns in nanoscopic quantum networks using scanning tunneling microscopy (STM). By measuring the current flowing from the STM tip into one of the leads attached to the network as a function of tip position, one obtains an atomically resolved spatial image of "current riverbeds" whose spatial structure reflects the coherent flow of electrons out of equilibrium. We show that this method can be successfully applied to a variety of network topologies and is robust against dephasing effects.
    Comment: 5 pages
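
    Since the proposed measurement is Landauer-like, the toy sketch below (our illustration, not the paper's formalism) conveys the idea: for each tip position r on a small tight-binding network, couple a tip and a drain lead in the wide-band limit and take the tip-to-lead transmission as a proxy for the measured current. All sizes and couplings are assumed toy values.

        import numpy as np

        N, t, E = 20, 1.0, 0.3            # sites, hopping, injection energy (toy values)
        gamma_lead, gamma_tip = 0.5, 0.1  # wide-band couplings to drain lead and tip

        # Tight-binding chain as the simplest "network"; the drain lead sits at site 0.
        H = np.zeros((N, N))
        for i in range(N - 1):
            H[i, i + 1] = H[i + 1, i] = -t

        current_map = np.zeros(N)
        for r in range(N):                # scan the tip over all sites
            Sigma = np.zeros((N, N), dtype=complex)
            Sigma[0, 0] = -0.5j * gamma_lead     # drain self-energy
            Sigma[r, r] += -0.5j * gamma_tip     # tip self-energy
            G = np.linalg.inv(E * np.eye(N) - H - Sigma)  # retarded Green's function
            # Landauer-type tip-to-lead transmission ~ measured current
            current_map[r] = gamma_tip * gamma_lead * abs(G[0, r]) ** 2

        print(np.round(current_map, 4))   # spatial current pattern vs tip position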

    Flatter, faster: scaling momentum for optimal speedup of SGD

    Commonly used optimization algorithms often show a trade-off between good generalization and fast training times. For instance, stochastic gradient descent (SGD) tends to have good generalization, whereas adaptive gradient methods have superior training times. Momentum can help accelerate training with SGD, but so far there has been no principled way to select the momentum hyperparameter. Here we study the training dynamics arising from the interplay between SGD with label noise and momentum in the training of overparametrized neural networks. We find that scaling the momentum hyperparameter $1-\beta$ with the learning rate to the power of $2/3$ maximally accelerates training without sacrificing generalization. To derive this result analytically, we develop an architecture-independent framework, where the main assumption is the existence of a degenerate manifold of global minimizers, as is natural in overparametrized models. Training dynamics display the emergence of two characteristic timescales that are well separated for generic values of the hyperparameters. The maximum acceleration of training is reached when these two timescales meet, which in turn determines the scaling limit we propose. We confirm our scaling rule on synthetic regression problems (matrix sensing and the teacher-student paradigm) and on classification with realistic datasets (ResNet-18 on CIFAR10, a 6-layer MLP on FashionMNIST), suggesting that the rule is robust to variations in architecture and dataset.
    Comment: v2: expanded introduction section, corrected minor typos. v1: 12+13 pages, 3 figures
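
    The scaling rule is concrete enough to state in code. A minimal sketch, assuming PyTorch and an arbitrary prefactor c (the abstract gives only the 2/3 power law, not the constant):

        import torch

        def momentum_from_lr(lr, c=1.0):
            # Scaling rule from the abstract: 1 - beta ~ lr**(2/3).
            # The prefactor c is a free constant here, not taken from the paper.
            return max(0.0, 1.0 - c * lr ** (2.0 / 3.0))

        model = torch.nn.Linear(10, 1)   # stand-in model
        lr = 0.05
        beta = momentum_from_lr(lr)      # lr = 0.05 -> beta ≈ 0.864
        optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=beta)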

    Trainability, Expressivity and Interpretability in Gated Neural ODEs

    Understanding how the dynamics in biological and artificial neural networks implement the computations required for a task is a salient open question in machine learning and neuroscience. In particular, computations requiring complex memory storage and retrieval pose a significant challenge for these networks to implement or learn. Recently, a family of models described by neural ordinary differential equations (nODEs) has emerged as powerful dynamical neural network models capable of capturing complex dynamics. Here, we extend nODEs by endowing them with adaptive timescales using gating interactions. We refer to these as gated neural ODEs (gnODEs). Using a task that requires memory of continuous quantities, we demonstrate the inductive bias of gnODEs to learn (approximate) continuous attractors. We further show how reduced-dimensional gnODEs retain their modeling power while greatly improving interpretability, even allowing explicit visualization of the structure of learned attractors. We introduce a novel measure of expressivity that probes the capacity of a neural network to generate complex trajectories. Using this measure, we explore how the phase-space dimension of the nODEs and the complexity of the function modeling the flow field contribute to expressivity. We find that a more complex flow-field function allows a lower-dimensional nODE to capture a given target dynamics. Finally, we demonstrate the benefit of gating in nODEs on several real-world tasks.
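
    One natural reading of "adaptive timescales using gating interactions" is a learned sigmoid gate multiplying the flow field, so that the gate sets a state-dependent timescale. A minimal sketch along those lines (the paper's exact parameterization may differ):

        import torch
        import torch.nn as nn

        class GatedODEFunc(nn.Module):
            """Sketch of a gated neural ODE vector field:
            dh/dt = g(h) * (-h + f(h)), where the sigmoid gate g
            adapts the effective timescale of each state variable."""
            def __init__(self, dim, hidden=64):
                super().__init__()
                self.f = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                       nn.Linear(hidden, dim))
                self.g = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

            def forward(self, t, h):
                return self.g(h) * (-h + self.f(h))

        # Forward-Euler rollout; a real experiment would use an ODE solver.
        func, h, dt = GatedODEFunc(dim=8), torch.randn(1, 8), 0.01
        for step in range(100):
            h = h + dt * func(step * dt, h)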

    Using large language models to study human memory for meaningful narratives

    One of the most impressive achievements of the AI revolution is the development of large language models that can generate meaningful text and respond to instructions in plain English with no additional training necessary. Here we show that language models can be used as a scientific instrument for studying human memory for meaningful material. We developed a pipeline for designing large-scale memory experiments and analyzing the obtained results. We performed online memory experiments with a large number of participants and collected recognition and recall data for narratives of different lengths. We found that both recall and recognition performance scale linearly with narrative length. Furthermore, to investigate the role of narrative comprehension in memory, we repeated these experiments using scrambled versions of the presented stories. We found that even though recall performance declined significantly, recognition remained largely unaffected. Interestingly, recalls in this condition seem to follow the original narrative order rather than the scrambled presentation order, pointing to a contextual reconstruction of the story in memory.
    Comment: v2: 43 pages, with added discussion and a new appendix
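
    The order effect in the last sentence is easy to quantify: correlate the order in which clauses are recalled with (a) their position in the original story and (b) their position in the scrambled presentation. A sketch with made-up toy data (the paper's actual pipeline and data are not reproduced here):

        from scipy.stats import spearmanr

        # Clauses are labeled by their position in the ORIGINAL story.
        presented = [3, 7, 1, 5, 0, 6, 2, 4]    # scrambled presentation order
        recalled  = [0, 1, 3, 2, 5, 4, 7, 6]    # clause labels in recall order

        orig_pos = recalled                                # label == original position
        pres_pos = [presented.index(c) for c in recalled]  # rank at presentation

        rho_story, _ = spearmanr(range(len(recalled)), orig_pos)
        rho_pres, _ = spearmanr(range(len(recalled)), pres_pos)
        print(f"recall vs story order: {rho_story:.2f}; vs presented order: {rho_pres:.2f}")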

    Spectral Transitions and Universal Steady-States in Random Kraus Maps and Circuits

    The study of dissipation and decoherence in generic open quantum systems recently led to the investigation of spectral and steady-state properties of random Lindbladian dynamics. A natural question is then how realistic and universal those properties are. Here, we address these issues by considering a different description of dissipative quantum systems, namely the discrete-time Kraus map representation of completely positive quantum dynamics. Through random matrix theory (RMT) techniques and numerical exact diagonalization, we study random Kraus maps, allowing for a varying dissipation strength, and their local circuit counterpart. We find the spectrum of the random Kraus map to be either an annulus or a disk inside the unit circle in the complex plane, with a transition between the two cases taking place at a critical value of the dissipation strength. The eigenvalue distribution and the spectral transition are well described by a simplified RMT model that we solve exactly in the thermodynamic limit by means of non-Hermitian RMT and quaternionic free probability. The steady state, by contrast, is not affected by the spectral transition. It does, however, exhibit a perturbative crossover regime at small dissipation, inside which the steady state is characterized by uncorrelated eigenvalues. At large dissipation (or, for a large enough system, at any dissipation) the steady state is well described by a random Wishart matrix. The steady-state properties thus coincide with those already observed for random Lindbladian dynamics, indicating their universality. Quite remarkably, the statistical properties of the local Kraus circuit are qualitatively the same as those of the nonlocal Kraus map, indicating that the latter, which is more tractable, already captures the realistic and universal physical properties of generic open quantum systems.
    Comment: 14 pages, 8 figures
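
    A random Kraus map of this kind is straightforward to sample and diagonalize numerically. A minimal sketch (toy sizes; the paper's dissipation-strength parameterization is not reproduced, but fewer Kraus operators roughly means weaker dissipation):

        import numpy as np

        rng = np.random.default_rng(0)
        N, r = 40, 4    # Hilbert-space dimension, number of Kraus operators

        # Sample Ginibre matrices and normalize so sum_i K_i^dag K_i = I (CPTP).
        G = rng.normal(size=(r, N, N)) + 1j * rng.normal(size=(r, N, N))
        S = sum(g.conj().T @ g for g in G)
        w, V = np.linalg.eigh(S)
        K = np.array([g @ (V @ np.diag(w ** -0.5) @ V.conj().T) for g in G])

        # Superoperator of rho -> sum_i K_i rho K_i^dag on row-stacked vec(rho)
        Phi = sum(np.kron(k, k.conj()) for k in K)
        ev = np.linalg.eigvals(Phi)
        bulk = np.abs(ev)[np.abs(ev) < 1 - 1e-9]  # drop the trivial eigenvalue 1
        print("bulk spectrum radii: inner", bulk.min(), "outer", bulk.max())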