19,036 research outputs found

MorphIC: A 65-nm 738k-Synapse/mm$^2$ Quad-Core Binary-Weight Digital Neuromorphic Processor with Stochastic Spike-Driven Online Learning

Author: Bol David
Frenkel Charlotte
Legat Jean-Didier
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

Recent trends in the field of neural network accelerators investigate weight quantization as a means to increase the resource- and power-efficiency of hardware devices. As full on-chip weight storage is necessary to avoid the high energy cost of off-chip memory accesses, memory reduction requirements for weight storage pushed toward the use of binary weights, which were demonstrated to have a limited accuracy reduction on many applications when quantization-aware training techniques are used. In parallel, spiking neural network (SNN) architectures are explored to further reduce power when processing sparse event-based data streams, while on-chip spike-based online learning appears as a key feature for applications constrained in power and resources during the training phase. However, designing power- and area-efficient spiking neural networks still requires the development of specific techniques in order to leverage on-chip online learning on binary weights without compromising the synapse density. In this work, we demonstrate MorphIC, a quad-core binary-weight digital neuromorphic processor embedding a stochastic version of the spike-driven synaptic plasticity (S-SDSP) learning rule and a hierarchical routing fabric for large-scale chip interconnection. The MorphIC SNN processor embeds a total of 2k leaky integrate-and-fire (LIF) neurons and more than two million plastic synapses for an active silicon area of 2.86mm$^2$ in 65nm CMOS, achieving a high density of 738k synapses/mm$^2$. MorphIC demonstrates an order-of-magnitude improvement in the area-accuracy tradeoff on the MNIST classification task compared to previously-proposed SNNs, while having no penalty in the energy-accuracy tradeoff. Comment: This document is the paper as accepted for publication in the IEEE Transactions on Biomedical Circuits and Systems journal (2019), the fully-edited paper is available at https://ieeexplore.ieee.org/document/876400
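
As a rough illustration of the leaky integrate-and-fire (LIF) neuron model named in this abstract, the following Python sketch simulates one discrete-time neuron with binary synaptic weights. It is not the MorphIC design: the function name `simulate_lif`, the 0/1 weight encoding, the threshold, and the linear leak are all assumptions chosen for illustration, and the S-SDSP learning rule is not modeled.

```python
def simulate_lif(spike_trains, weights, threshold=4, leak=1):
    """Discrete-time leaky integrate-and-fire neuron (illustrative only).

    spike_trains: one list of 0/1 input spikes per time step.
    weights: 0/1 synaptic weights, one per input (assumed encoding).
    Returns the time steps at which the neuron fired.
    """
    potential = 0
    fired_at = []
    for t, spikes in enumerate(spike_trains):
        # Integrate: every input spike adds its binary weight.
        potential += sum(w for w, s in zip(weights, spikes) if s)
        if potential >= threshold:
            fired_at.append(t)
            potential = 0  # reset after emitting a spike
        else:
            # Leak: decay linearly toward the resting value 0.
            potential = max(0, potential - leak)
    return fired_at


inputs = [[1, 1, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1], [1, 1, 1]]
print(simulate_lif(inputs, weights=[1, 1, 0]))  # [4]
print(simulate_lif(inputs, weights=[1, 1, 1]))  # [1, 4]
```

Enabling the third synapse makes the neuron reach the threshold earlier, which is the only effect a binary weight can have in this simplified model.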

arXiv.org e-Print Archive

DIAL UCLouvain

GPU-Accelerated Algorithms for Compressed Signals Recovery with Application to Astronomical Imagery Deblurring

Author: Fiandrotti Attilio
Fosson Sophie M.
Magli Enrico
Ravazzi Chiara
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2017

Compressive sensing promises to enable bandwidth-efficient on-board compression of astronomical data by shifting the encoding complexity from the source to the receiver. The signal is recovered off-line, exploiting the parallel computation capabilities of GPUs to speed up the reconstruction process. However, inherent GPU hardware constraints limit the size of the recoverable signal and the speedup practically achievable. In this work, we design parallel algorithms that exploit the properties of circulant matrices for efficient GPU-accelerated sparse signal recovery. Our approach reduces the memory requirements, allowing us to recover very large signals with limited memory. In addition, it achieves a tenfold signal recovery speedup thanks to ad-hoc parallelization of matrix-vector multiplications and matrix inversions. Finally, we practically demonstrate our algorithms in a typical application of circulant matrices: deblurring a sparse astronomical image in the compressed domain.
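
The abstract does not say which circulant-matrix properties the authors rely on. One standard property that lowers memory use is that a circulant matrix is fully determined by its first column, so a matrix-vector product never needs the full n-by-n matrix. The Python sketch below shows this on the CPU only; the function name `circulant_matvec` is hypothetical and this is not the authors' GPU algorithm.

```python
def circulant_matvec(first_column, x):
    """Compute y = C x for the circulant matrix C with the given first column.

    Entry C[i][j] equals first_column[(i - j) % n], so only n values are
    stored instead of n * n.
    """
    n = len(first_column)
    if len(x) != n:
        raise ValueError("dimension mismatch")
    return [
        sum(first_column[(i - j) % n] * x[j] for j in range(n))
        for i in range(n)
    ]


c = [1, 2, 3]  # represents C = [[1, 3, 2], [2, 1, 3], [3, 2, 1]]
print(circulant_matvec(c, [1, 0, 0]))  # [1, 2, 3], the first column of C
print(circulant_matvec(c, [1, 1, 1]))  # [6, 6, 6], every row sums to 6
```

This direct form costs O(n^2) operations. Because circulant matrices are diagonalized by the discrete Fourier transform, FFT-based products are a common faster alternative; whether the paper uses them is not stated here.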

arXiv.org e-Print Archive

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Institutional Research Information System University of Turin

PORTO Publications Open Repository TOrino

A multi-scale, multi-wavelength source extraction method: getsources

Author: A. Men’shchikov
Alves
André
Arzoumanian
Bertin
Bontemps
di Francesco
Elmegreen
F. Motte
Falgarone
Gong
Goodman
Griffin
Hennemann
Johnstone
Kainulainen
Kennicutt
Könyves
Lagache
Larson
M. Hennemann
Maury
Men’shchikov
Miville-Deschênes
Moffat
Molinari
Motte
Myers
N. Schneider
P. Didelon
Ph. André
Pilbratt
Poglitsch
Rosolowsky
Roy
Schneider
Stutzki
Williams
Publication venue: 'EDP Sciences'
Publication date: 19/04/2012

We present a multi-scale, multi-wavelength source extraction algorithm called getsources. Although it has been designed primarily for use in the far-infrared surveys of Galactic star-forming regions with Herschel, the method can be applied to many other astronomical images. Instead of the traditional approach of extracting sources in the observed images, the new method analyzes fine spatial decompositions of original images across a wide range of scales and across all wavebands. It cleans those single-scale images of noise and background, and constructs wavelength-independent single-scale detection images that preserve information in both spatial and wavelength dimensions. Sources are detected in the combined detection images by following the evolution of their segmentation masks across all spatial scales. Measurements of the source properties are done in the original background-subtracted images at each wavelength; the background is estimated by interpolation under the source footprints and overlapping sources are deblended in an iterative procedure. In addition to the main catalog of sources, various catalogs and images are produced that aid scientific exploitation of the extraction results. We illustrate the performance of getsources on Herschel images by extracting sources in sub-fields of the Aquila and Rosette star-forming regions. The source extraction code and validation images with a reference extraction catalog are freely available. Comment: 31 pages, 27 figures, to be published in Astronomy & Astrophysics

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

GPUs as Storage System Accelerators

Author: Al-Kiswany Samer
Gharaibeh Abdullah
Ripeanu Matei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/05/2012

Massively multicore processors, such as Graphics Processing Units (GPUs), provide, at a comparable price, peak performance one order of magnitude higher than that of traditional CPUs. This drop in the cost of computation, like any order-of-magnitude drop in the cost per unit of performance for a class of system components, creates the opportunity to redesign systems and to explore new ways to engineer them to recalibrate the cost-to-performance relation. This project explores the feasibility of harnessing GPUs' computational power to improve the performance, reliability, or security of distributed storage systems. In this context, we present the design of a storage system prototype that uses GPU offloading to accelerate a number of computationally intensive primitives based on hashing, and introduce techniques to efficiently leverage the processing power of GPUs. We evaluate the performance of this prototype under two configurations: as a content addressable storage system that facilitates online similarity detection between successive versions of the same file and as a traditional system that uses hashing to preserve data integrity. Further, we evaluate the impact of offloading to the GPU on competing applications' performance. Our results show that this technique can bring tangible performance gains without negatively impacting the performance of concurrently running applications. Comment: IEEE Transactions on Parallel and Distributed Systems, 201
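
To make the hashing-based similarity detection mentioned in this abstract concrete, here is a minimal CPU-only Python sketch. It assumes fixed-size blocks and SHA-256; the abstract names neither the chunking scheme nor the hash function, and the helper names `block_hashes` and `shared_blocks` are hypothetical. The GPU offloading that the paper studies is not reproduced.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 4):
    """Hash each fixed-size block of data (the block size is an assumption)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_blocks(old: bytes, new: bytes, block_size: int = 4):
    """Return (blocks of `new` already present in `old`, total blocks of `new`)."""
    known = set(block_hashes(old, block_size))
    new_hashes = block_hashes(new, block_size)
    return sum(1 for h in new_hashes if h in known), len(new_hashes)


v1 = b"aaaabbbbccccdddd"
v2 = b"aaaabbbbXXXXdddd"  # only the third block differs
print(shared_blocks(v1, v2))  # (3, 4)
```

In a content addressable store, blocks whose hashes are already known need not be stored or transferred again. The same per-block hashes can be recomputed later to check data integrity.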

arXiv.org e-Print Archive

Crossref

HIDE+: a logic based hardware development environment

Author: Benkrid Abdsamad
Benkrid K.
Publication date: 01/01/2008

Portsmouth University Research Portal (Pure)

Programmability and Performance of Parallel ECS-based Simulation of Multi-Agent Exploration Models

Author: B.R. Preiss
D. Cucuzzo
D. Fox
F. Quaglia
G. Cordasco
K. Popov
M.Y.H. Low
P. Richmond
R.M. Fujimoto
T. Takahashi
W. Marurngsith
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

While the traditional objective of parallel/distributed simulation techniques has mainly been to improve performance and make very large models tractable, more recent research trends have targeted complementary aspects, such as “ease of programming”. Along this line, a recent proposal called Event and Cross State (ECS) synchronization breaks the traditional programming rules of Parallel Discrete Event Simulation (PDES) systems, where the application code processing a specific event is only allowed to access the state (namely the memory image) of the target simulation object. With ECS, the programmer is allowed to write ANSI-C event handlers capable of accessing (in either read or write mode) the state of any simulation object included in the simulation model. Correct concurrent execution of events, e.g., on top of multi-core machines, is guaranteed by ECS with no intervention by the programmer, who is in practice exposed to a sequential-style programming model where events are processed one at a time and can access the current memory image of the whole simulation model, namely the collection of the states of all involved objects. This can strongly simplify the development of specific models, e.g., by avoiding the need to pass state information across concurrent objects in the form of events. In this article we investigate both programmability and performance aspects related to developing and supporting a multi-agent exploration model on top of the ROOT-Sim PDES platform, which supports ECS.

Crossref

ART

Archivio della ricerca- Università di Roma La Sapienza