1,897 research outputs found

Learning The Sequential Temporal Information with Recurrent Neural Networks

Author: Murugan Pushparaja
Publication date: 08/07/2018

Recurrent networks are among the most powerful and promising artificial neural network algorithms for processing sequential data such as natural language, sound, and time series. Unlike a traditional feed-forward network, a recurrent network has an inherent feedback loop that allows it to store temporal context information and pass the state along the entire sequence of events. This helps to achieve state-of-the-art performance in many important tasks such as language modeling, stock market prediction, image captioning, speech recognition, machine translation, and object tracking. However, training a fully connected RNN and managing the gradient flow are complicated processes. Many studies have been carried out to address this limitation. This article intends to provide brief details about recurrent neurons, their variants, and tips and tricks for training fully recurrent neural networks. This review work was carried out as part of our IPO studio software module 'Multiple Object Tracking'.
Comment: 17 pages

arXiv.org e-Print Archive

Memory and attention in deep learning

Author: Le Hung
Publication date: 03/07/2021

Intelligence necessitates memory. Without memory, humans fail to perform various nontrivial tasks such as reading novels, playing games or solving maths. As the ultimate goal of machine learning is to derive intelligent systems that learn and act automatically just like humans, memory construction for machines is inevitable. Artificial neural networks model neurons and synapses in the brain by interconnecting computational units via weights, which makes them a typical class of machine learning algorithms that resembles memory structure. Their descendants with more complicated modeling techniques (a.k.a. deep learning) have been successfully applied to many practical problems and have demonstrated the importance of memory in the learning process of machinery systems. Recent progress on modeling memory in deep learning has revolved around external memory constructions, which are highly inspired by computational Turing models and biological neuronal systems. Attention mechanisms are derived to support acquisition and retention operations on the external memory. Despite the lack of theoretical foundations, these approaches have shown promise in helping machinery systems reach a higher level of intelligence. The aim of this thesis is to advance the understanding of memory and attention in deep learning. Its contributions include: (i) presenting a collection of taxonomies for memory, (ii) constructing new memory-augmented neural networks (MANNs) that support multiple control and memory units, (iii) introducing variability via memory in sequential generative models, (iv) searching for optimal writing operations to maximise the memorisation capacity in slot-based memory networks, and (v) simulating the Universal Turing Machine via Neural Stored-program Memory, a new kind of external memory for neural networks.
Comment: PhD Thesis

arXiv.org e-Print Archive

Convolutional Bipartite Attractor Networks

Author: Iuzzolino Michael
Mozer Michael C.
Singer Yoram
Publication date: 26/09/2019

In human perception and cognition, a fundamental operation that brains perform is interpretation: constructing coherent neural states from noisy, incomplete, and intrinsically ambiguous evidence. The problem of interpretation is well matched to an early and often overlooked architecture, the attractor network, a recurrent neural net that performs constraint satisfaction, imputation of missing features, and cleanup of noisy data via energy minimization dynamics. We revisit attractor nets in light of modern deep learning methods and propose a convolutional bipartite architecture with a novel training loss, activation function, and connectivity constraints. We tackle larger problems than have been previously explored with attractor nets and demonstrate their potential for image completion and super-resolution. We argue that this architecture is better motivated than ever-deeper feedforward models and is a viable alternative to more costly sampling-based generative methods on a range of supervised and unsupervised tasks.

arXiv.org e-Print Archive

An Approximate Backpropagation Learning Rule for Memristor Based Neural Networks Using Synaptic Plasticity

Author: Dunin-Barkowski W. L.
Karandashev I. M.
Matveyev Yu. A.
Negrov D. V.
Shakirov V. V.
Zenkevich A. V.
Publication date: 27/07/2016

We describe an approximation to the backpropagation algorithm for training deep neural networks, which is designed to work with synapses implemented with memristors. The key idea is to represent the values of both the input signal and the backpropagated delta value with a series of pulses that trigger multiple positive or negative updates of the synaptic weight, and to use the min operation instead of the product of the two signals. In computational simulations, we show that the proposed approximation to backpropagation converges well and may be suitable for memristor implementations of multilayer neural networks.
Comment: 21 pages, 6 figures, 1 table, title changed, manuscript thoroughly rewritten

arXiv.org e-Print Archive

Deep Learning in Neural Networks: An Overview

Author: Schmidhuber Juergen
Publication venue: 'Elsevier BV'
Publication date: 08/10/2014

In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Comment: 88 pages, 888 references

arXiv.org e-Print Archive

Speech and neural network dynamics

Author: Renals Stephen John
Publication venue: The University of Edinburgh
Publication date: 01/01/1990

Machine learning & artificial intelligence in the quantum domain

Author: Briegel Hans J.
Dunjko Vedran
Publication date: 08/09/2017

Quantum information technologies and intelligent learning systems are both emergent technologies that will likely have a transforming impact on our society. The respective underlying fields of research, quantum information (QI) versus machine learning (ML) and artificial intelligence (AI), have their own specific challenges, which have hitherto been investigated largely independently. However, in a growing body of recent work, researchers have been probing the question of to what extent these fields can learn and benefit from each other. QML explores the interaction between quantum computing and ML, investigating how results and techniques from one field can be used to solve the problems of the other. Recently, we have witnessed breakthroughs in both directions of influence. For instance, quantum computing is finding a vital application in providing speed-ups in ML, critical in our "big data" world. Conversely, ML already permeates cutting-edge technologies, and may become instrumental in advanced quantum technologies. Aside from quantum speed-up in data analysis, or classical ML optimization used in quantum experiments, quantum enhancements have also been demonstrated for interactive learning, highlighting the potential of quantum-enhanced learning agents. Finally, works exploring the use of AI for the very design of quantum experiments, and for performing parts of genuine research autonomously, have reported their first successes. Beyond the topics of mutual enhancement, researchers have also broached the fundamental issue of quantum generalizations of ML/AI concepts. This deals with questions of the very meaning of learning and intelligence in a world that is described by quantum mechanics. In this review, we describe the main ideas, recent developments, and progress in a broad spectrum of research investigating machine learning and artificial intelligence in the quantum domain.
Comment: Review paper. 106 pages. 16 figures

arXiv.org e-Print Archive

Analog Photonics Computing for Information Processing, Inference and Optimisation

Author: Berloff Natalia G.
Stroev Nikita
Publication date: 05/06/2023

This review presents an overview of the current state-of-the-art in photonics computing, which leverages photons, photons coupled with matter, and optics-related technologies for effective and efficient computational purposes. It covers the history and development of photonics computing and modern analogue computing platforms and architectures, focusing on optimization tasks and neural network implementations. The authors examine special-purpose optimizers, mathematical descriptions of photonics optimizers, and their various interconnections. Disparate applications are discussed, including direct encoding, logistics, finance, phase retrieval, machine learning, neural networks, probabilistic graphical models, and image processing, among many others. The main directions of technological advancement and associated challenges in photonics computing are explored, along with an assessment of its efficiency. Finally, the paper discusses prospects and the field of optical quantum computing, providing insights into the potential applications of this technology.
Comment: Invited submission by Journal of Advanced Quantum Technologies; accepted version 5/06/202

arXiv.org e-Print Archive

The Performance of Associative Memory Models with Biologically Inspired Connectivity

Author: Chen W.
Publication date: 01/01/2009

This thesis is concerned with one important question in artificial neural networks: how the biologically inspired connectivity of a network affects its associative memory performance. In recent years, research on the mammalian cerebral cortex, which has the main responsibility for the associative memory function in the brain, has suggested that the connectivity of this cortical network is far from the full connectivity commonly assumed in traditional associative memory models. It is found to be a sparse network with interesting connectivity characteristics such as “small world network” characteristics, represented by short Mean Path Length, high Clustering Coefficient, and high Global and Local Efficiency. Most of the networks in this thesis are therefore sparsely connected. There is, however, no conclusive evidence of how these different connectivity characteristics affect the associative memory performance of a network. This thesis addresses this question using networks with different types of connectivity, which are inspired by biological evidence. The findings of this programme are unexpected and important. Results show that the performance of a non-spiking associative memory model can be predicted by its linear correlation with the Clustering Coefficient of the network, regardless of the detailed connectivity patterns. This is particularly important because the Clustering Coefficient is a static measure of one aspect of connectivity, whilst the associative memory performance reflects the result of a complex dynamic process. On the other hand, this research reveals that improvements in the performance of a network do not necessarily rely directly on an increase in the network’s wiring cost. Therefore it is possible to construct networks with high associative memory performance but relatively low wiring cost.
In particular, Gaussian distributed connectivity in a network is found to achieve the best performance with the lowest wiring cost among all examined connectivity models. Our results from this programme also suggest that a modular network with an appropriate configuration of Gaussian distributed connectivity, both internal to each module and across modules, can perform nearly as well as the Gaussian distributed non-modular network. Finally, a comparison between non-spiking and spiking associative memory models suggests that, in terms of associative memory performance, the implication of connectivity seems to transcend the details of the actual neural models, that is, whether they are spiking or non-spiking neurons.

Distributed learning in sensor networks

Author: Gao Huaien
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 26/01/2009

Digitale Hochschulschriften der LMU