Reaction–diffusion chemistry implementation of associative memory neural network

Author: Adamatzky A.
Belousov B.P.
Conrad M.
Gentili P.L.
Haykin S.
James Stovold
Rambidi N.
Simon O’Keefe
Stepney S.
Zhabotinsky A.M.
Publication venue: 'Informa UK Limited'
Publication date: 07/03/2016

Unconventional computing paradigms are typically very difficult to program. By implementing efficient parallel control architectures such as artificial neural networks, we show that it is possible to program unconventional paradigms with relative ease. The work presented implements correlation matrix memories (a form of artificial neural network based on associative memory) in reaction–diffusion chemistry, and shows that such implementations can be trained and behave in a similar way to conventional implementations.
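
As a point of reference for the conventional side of that comparison, a binary correlation matrix memory can be sketched in a few lines. This is a minimal software illustration, not the reaction–diffusion implementation described in the abstract; the function names `train_cmm` and `recall_cmm`, the example patterns, and the Willshaw-style threshold are assumptions made for the sketch.

```python
def train_cmm(pairs, n_in, n_out):
    """Build a binary correlation matrix memory by OR-ing outer products."""
    # weights[i][j] becomes 1 once output bit i and input bit j were active together
    weights = [[0] * n_in for _ in range(n_out)]
    for x, y in pairs:
        for i in range(n_out):
            for j in range(n_in):
                if x[j] and y[i]:
                    weights[i][j] = 1
    return weights


def recall_cmm(weights, x):
    """Recall an output pattern: sum the matching weights, then threshold."""
    # Assumed Willshaw-style threshold: the number of active input bits
    threshold = sum(x)
    sums = [sum(w * xj for w, xj in zip(row, x)) for row in weights]
    return [1 if s >= threshold else 0 for s in sums]


# Hypothetical input/output pattern pairs used only for illustration
pairs = [
    ([1, 0, 1, 0], [1, 0, 0]),
    ([0, 1, 0, 1], [0, 0, 1]),
]
memory = train_cmm(pairs, n_in=4, n_out=3)
recalled = [recall_cmm(memory, x) for x, _ in pairs]
```

Training is a single pass with no iterative weight updates, which is one reason this kind of memory is a plausible candidate for unconventional substrates.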

Crossref

Cronfa at Swansea University

Lancaster E-Prints

White Rose Research Online

DAMNED: A Distributed and Multithreaded Neural Event-Driven simulation framework

Author: Mouraud Anthony
Paugam-Moisy Hélène
Puzenat Didier
Publication date: 01/01/2006

In Spiking Neural Networks (SNNs), spike emissions are sparsely and irregularly distributed both in time and in the network architecture. Since a common feature of SNNs is a low average activity, efficient implementations of SNNs are usually based on Event-Driven Simulation (EDS). On the other hand, simulations of large-scale neural networks can take advantage of distributing the neurons over a set of processors (either a workstation cluster or a parallel computer). This article presents DAMNED, a large-scale SNN simulation framework able to combine the benefits of EDS and parallel computing. Two levels of parallelism are combined: distributed mapping of the neural topology, at the network level, and local multithreaded allocation of resources for simultaneous processing of events, at the neuron level. Based on the causality of events, a distributed solution is proposed for solving the complex problem of scheduling without a synchronization barrier. Comment: 6 pages
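
The event-driven idea itself can be illustrated with a small single-process sketch: neuron state is touched only when a spike arrives, and a priority queue keeps events in causal order. This is not the DAMNED framework or its distributed scheduler; the `simulate` function, the leaky integrate-and-fire update, and all parameter values are assumptions chosen for illustration.

```python
import heapq
import math


def simulate(synapses, initial_events, threshold=1.0, tau=10.0, delay=1.0, t_max=100.0):
    """Event-driven simulation: a neuron is updated only when a spike reaches it."""
    potential = {}    # membrane potential per neuron
    last_update = {}  # time of each neuron's most recent update
    queue = list(initial_events)  # entries are (arrival_time, target, weight)
    heapq.heapify(queue)
    spikes = []
    while queue:
        t, target, weight = heapq.heappop(queue)  # earliest pending event first
        if t > t_max:
            break
        # Apply the leak lazily for the interval since the last event
        elapsed = t - last_update.get(target, t)
        v = potential.get(target, 0.0) * math.exp(-elapsed / tau) + weight
        last_update[target] = t
        if v >= threshold:
            spikes.append((t, target))
            v = 0.0  # reset after firing
            for post, w in synapses.get(target, []):
                heapq.heappush(queue, (t + delay, post, w))
        potential[target] = v
    return spikes


# Hypothetical three-neuron chain: a -> b -> c
synapses = {"a": [("b", 1.2)], "b": [("c", 0.4)]}
spikes = simulate(synapses, [(0.0, "a", 1.5)])
```

No work is done for silent neurons or for time steps without spikes, which is where the benefit over clock-driven simulation comes from when average activity is low.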

arXiv.org e-Print Archive

CiteSeerX

HAL Descartes

Hal-Diderot

Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL

Author: Azghadi Mostafa Rahimi
Lammie Corey
Xiang Wei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

Recent technological advances have increased the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) are burgeoning. While GPU-accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation are limited. FPGAs are considered a suitable candidate for performance-critical, low-power systems, e.g. Internet of Things (IoT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environment, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further improved by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated using both GPUs and FPGAs. Our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art performance, while offering a >16-fold improvement in power consumption, compared to conventional GPU-accelerated networks. Both our FPGA-accelerated deterministic and stochastic BNNs reduce inference times on MNIST and CIFAR-10 by >9.89x and >9.91x, respectively. Comment: 4 pages, 3 figures, 1 table
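
The abstract does not spell out the binarization rules, so the following is only a sketch of one commonly used formulation: deterministic binarization takes the sign of each weight, while stochastic binarization sets a weight to +1 with a probability given by a hard sigmoid. The function names and the choice of hard sigmoid are assumptions, not the paper's OpenCL kernels.

```python
import random


def hard_sigmoid(w):
    """Clip (w + 1) / 2 to [0, 1]; used here as the probability of choosing +1."""
    return min(1.0, max(0.0, (w + 1.0) / 2.0))


def binarize_deterministic(weights):
    """Deterministic binarization: +1 for non-negative weights, -1 otherwise."""
    return [1 if w >= 0 else -1 for w in weights]


def binarize_stochastic(weights, rng):
    """Stochastic binarization: +1 with probability hard_sigmoid(w), else -1."""
    return [1 if rng.random() < hard_sigmoid(w) else -1 for w in weights]


# Hypothetical real-valued weights used only for illustration
weights = [0.3, -0.7, 0.0, -0.1]
deterministic = binarize_deterministic(weights)
stochastic = binarize_stochastic(weights, random.Random(0))
```

Either way each weight needs a single bit of storage, which is the source of the resource and power savings the abstract refers to.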

arXiv.org e-Print Archive

ResearchOnline@JCU

Crossref

ResearchOnline at James Cook University

Memristor Neural Network Design

Author: Chi Yu
Huang Anping
Li Runmiao
Zhang Xinjiang
Publication venue: 'IntechOpen'
Publication date: 20/12/2017

Neural networks, a powerful class of learning models, have achieved remarkable results. However, current implementations of neural networks on Von Neumann computing systems suffer from memory wall and communication bottleneck problems, owing to the scaling down of Complementary Metal Oxide Semiconductor (CMOS) technology and the communication gap. The memristor, a two-terminal nanoscale solid-state nonvolatile resistive switching device, can provide energy-efficient neuromorphic computing through its synaptic behavior. A crossbar architecture can be used to perform neural computations because of its high density and parallel computation. Thus, neural networks based on memristor crossbars will perform better in real-world applications. In this chapter, the design of different neural network architectures based on memristors is introduced, including spiking neural networks, multilayer neural networks, convolutional neural networks, and recurrent neural networks. A brief introduction, the architecture, the computing circuits, and the training algorithm of each kind of neural network are presented through examples. The potential applications and prospects of memristor-based neural network systems are discussed.
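
The reason a crossbar suits neural computation can be shown with an idealized model: if each cross-point stores a conductance and input voltages are applied to the rows, Ohm's law and Kirchhoff's current law make each column current a weighted sum, so the array performs a matrix-vector product in one step. The sketch below assumes ideal devices (no wire resistance, sneak paths, or nonlinearity); the function name and the values are hypothetical.

```python
def crossbar_currents(conductances, voltages):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G[i][j]."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Hypothetical 2x2 array: rows are inputs, columns are outputs (arbitrary units)
conductances = [
    [1.0, 2.0],
    [3.0, 4.0],
]
voltages = [1.0, 0.5]
currents = crossbar_currents(conductances, voltages)
```

In this idealization the conductances play the role of synaptic weights and the column currents are the pre-activation sums of a layer.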

IntechOpen

Crossref

Single stream parallelization of generalized LSTM-like RNNs on a GPU

Author: Hwang Kyuyeon
Sung Wonyong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/03/2015

Recurrent neural networks (RNNs) have shown outstanding performance on processing sequence data. However, they suffer from long training times, which demand parallel implementations of the training procedure. Parallelization of the training algorithms for RNNs is very challenging because internal recurrent paths form dependencies between different time frames. In this paper, we first propose a generalized graph-based RNN structure that covers the most popular long short-term memory (LSTM) network. Then, we present a parallelization approach that automatically explores the parallelism of arbitrary RNNs by analyzing the graph structure. The experimental results show that the proposed approach achieves a large speed-up even with a single training stream, and further accelerates training when combined with multiple parallel training streams. Comment: Accepted by the 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 201

arXiv.org e-Print Archive

Crossref

Improving the Expressiveness of Deep Learning Frameworks with Recursion

Author: Chun Byung-Gon
Jeong Eunji
Jeong Joo Seong
Kim Soojeong
Yu Gyeong-In
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 04/09/2018

Recursive neural networks have been widely used by researchers to handle applications with recursively or hierarchically structured data. However, embedded control flow deep learning frameworks such as TensorFlow, Theano, Caffe2, and MXNet fail to efficiently represent and execute such neural networks, due to a lack of support for recursion. In this paper, we add recursion to the programming model of existing frameworks by complementing their design with recursive execution of dataflow graphs as well as additional APIs for recursive definitions. Unlike iterative implementations, which can only understand the topological index of each node in recursive data structures, our recursive implementation is able to exploit the recursive relationships between nodes for efficient execution based on parallel computation. We present an implementation on TensorFlow and evaluation results with various recursive neural network models, showing that our recursive implementation not only conveys the recursive nature of recursive neural networks better than other implementations, but also uses given resources more effectively to reduce training and inference time. Comment: Appeared in EuroSys 2018. 13 pages, 11 figures
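
To make the contrast with iterative implementations concrete, here is a plain-Python sketch of a recursive evaluation over a binary tree. It does not use the paper's TensorFlow extension or any framework API; the `evaluate` function, the scalar node values, and the fixed weights are assumptions for illustration only.

```python
import math


def evaluate(node, w_left=0.5, w_right=0.5):
    """Recursively compute a node's value from the values of its children."""
    if isinstance(node, tuple):
        left, right = node
        # The two recursive calls share no state, so a runtime that sees the
        # recursion could evaluate the subtrees in parallel.
        return math.tanh(
            w_left * evaluate(left, w_left, w_right)
            + w_right * evaluate(right, w_left, w_right)
        )
    return float(node)  # leaf: use the stored value directly


# Hypothetical tree: an inner node with leaves 1.0 and 2.0, combined with leaf 3.0
tree = ((1.0, 2.0), 3.0)
root_value = evaluate(tree)
```

The recursive form states the parent-child dependency directly, whereas an iterative version would have to flatten the tree into an ordered node list first.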

arXiv.org e-Print Archive

Crossref

Parallel and pseudorandom discrete event system specification vs. networks of spiking neurons: Formalization and preliminary implementation results

Author: Dao van Toan
Grammont Franck
Hill David R.C.
Lerasle Matthieu
Muzy Alexandre
Publication venue: HAL CCSD
Publication date: 18/07/2016

The usual Parallel Discrete Event System Specification (P-DEVS) allows systems to be specified from modeling to simulation. However, the framework does not incorporate parallel and stochastic simulations. This work intends to extend P-DEVS to parallel simulations and pseudorandom number generators in the context of a spiking neural network. The discrete event specification presented here makes the parallel computation of events, as well as their routing, explicit and centralized, making further implementations easier. The expected result is a well-defined mathematical and computational framework for dealing with networks of spiking neurons.

HAL-UNICE

HAL Clermont Université

Photonic reservoir computing: a new approach to optical information processing

Author: Bienstman Peter
Dambre Joni
Fiers Martin
Schrauwen Benjamin
Vandoorne Kristof
Verstraeten David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Despite ever-increasing computational power, recognition and classification problems remain challenging to solve. Recently, advances have been made through the introduction of the new concept of reservoir computing. This is a methodology coming from the field of machine learning and neural networks, and it has been successfully used in several pattern classification problems, such as speech and image recognition. The implementations have so far been in software, limiting their speed and power efficiency. Photonics could be an excellent platform for a hardware implementation of this concept because of its inherent parallelism and unique nonlinear behaviour. We propose using a network of coupled Semiconductor Optical Amplifiers (SOAs) and show in simulation that it could be used as a reservoir by comparing it on a benchmark speech recognition task to conventional software implementations. In spite of several differences, the photonic reservoirs perform as well as or better than conventional implementations. Moreover, a photonic implementation offers the promise of massively parallel information processing with low power and high speed. We will also address the role that phase plays in the reservoir performance.

Crossref

Ghent University Academic Bibliography