Spike Events Processing for Vision Systems
In this paper we briefly summarize the fundamental properties of spike-event processing applied to artificial vision systems. This sensing and processing technology is capable of very high-speed throughput, because it does not rely on sensing and processing sequences of frames, and because it allows for complex, hierarchically structured cortical-like layers for sophisticated processing.
The paper includes a few examples that have demonstrated the potential of this technology for high-speed vision processing, such as a multilayer event processing network of 5 sequential cortical-like layers, and a recognition system capable of discriminating propellers of different shapes rotating at 5000 revolutions per second (300000 revolutions per minute).
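The frame-free, event-by-event style of processing described above can be sketched in a few lines. This is a minimal illustrative model, not the paper's implementation: the `Event` fields follow the common address-event convention (timestamp, pixel address, polarity), and the handler names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Event:
    t_us: int      # timestamp in microseconds
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for a brightness increase, -1 for a decrease

def process(events, handlers):
    # Events are handled one by one as they arrive, so latency is bounded
    # by per-event work rather than by a frame period.
    for ev in events:
        for h in handlers:
            h(ev)

counts = {}
def count_by_polarity(ev):
    # Toy handler: tally events by polarity.
    counts[ev.polarity] = counts.get(ev.polarity, 0) + 1

stream = [Event(10, 3, 4, 1), Event(12, 3, 5, -1), Event(15, 7, 1, 1)]
process(stream, [count_by_polarity])
print(counts)  # {1: 2, -1: 1}
```

Chaining several such handlers, each feeding events to the next, is the event-driven analogue of the sequential cortical-like layers mentioned above.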
A bioinspired 128x128 pixel dynamic-vision-sensor
This paper presents a 128x128 dynamic vision
sensor. Each pixel detects temporal changes in the local
illumination. A minimum illumination temporal contrast of
10% can be detected. A compact preamplification stage has been introduced that improves the minimum detectable contrast over previous designs while reducing the pixel area by one third. The pixel responds to illumination changes
in less than 3.6μs. The ability of the sensor to capture very fast
moving objects has been verified experimentally. A frame-based sensor capable of achieving this would require at least 100K frames per second.
Funding: Unión Europea FP7-ICT-2007-1-216777; Ministerio de Educación y Ciencia TEC2006-11730-C03-01 (SAMANTA2); Ministerio de Educación y Ciencia TEC2009-10639-C04-01 (VULCANO); Junta de Andalucía P06-TIC-1417 (Brain System)
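The temporal-contrast principle behind such a pixel can be sketched numerically: an event fires whenever the log-intensity has changed by more than a contrast threshold since the last event. This is a simplified behavioral model, not the chip's circuit; the threshold value and trace are illustrative.

```python
import math

def dvs_events(intensity_trace, theta=0.1):
    """Emit (sample_index, polarity) events whenever the log-intensity
    change since the last event crosses the threshold theta (~10%
    temporal contrast). Behavioral sketch of a DVS-style pixel."""
    events = []
    ref = math.log(intensity_trace[0])  # reference level at last event
    for i, intensity in enumerate(intensity_trace[1:], start=1):
        d = math.log(intensity) - ref
        while abs(d) >= theta:
            pol = 1 if d > 0 else -1
            events.append((i, pol))
            ref += pol * theta  # reset reference by one threshold step
            d = math.log(intensity) - ref
    return events

# A ~15% brightening then a drop produces one ON and one OFF event.
print(dvs_events([100, 115, 115, 95]))  # [(1, 1), (3, -1)]
```

Because the comparison happens in the log domain, the same relative change triggers an event regardless of absolute illumination level.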
Programmable 2D image filter for AER vision processing
A VLSI architecture is proposed for the realization of real-time 2D image filtering in an address-event-representation (AER) vision system. The architecture is capable of implementing any convolutional kernel F(x, y) as long as it is decomposable into x-axis and y-axis components, i.e., F(x, y)=H(x)V(y), for some rotated coordinate system {x, y}, and if this product can be approximated safely by a signed minimum operation. The proposed architecture is intended to be used in a complete vision system, known as the boundary-contour-system and feature-contour-system (BCS-FCS) vision model
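The separable-kernel condition and the signed-minimum approximation can be checked numerically. The sketch below is an illustration of the stated decomposition, not the chip's datapath; the component values chosen for `h` and `v` are hypothetical.

```python
import numpy as np

def signed_min_kernel(h, v):
    """Approximate the outer product F[y, x] = V[y] * H[x] by a signed
    minimum: sign(V*H) * min(|V|, |H|), as stated in the abstract."""
    V = np.asarray(v, dtype=float)[:, None]
    H = np.asarray(h, dtype=float)[None, :]
    return np.sign(V * H) * np.minimum(np.abs(V), np.abs(H))

h = np.array([-1.0, 0.0, 1.0])  # x-axis component (illustrative values)
v = np.array([1.0, 2.0, 1.0])   # y-axis component (illustrative values)

F_exact = np.outer(v, h)            # true separable kernel V(y)H(x)
F_approx = signed_min_kernel(h, v)  # signed-minimum approximation
```

The approximation preserves the sign pattern of the exact kernel everywhere, and replaces multiplications by comparisons, which is what makes it attractive for compact VLSI implementation.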
FPGA Implementation of An Event-driven Saliency-based Selective Attention Model
Artificial vision systems of autonomous agents face very difficult
challenges, as their vision sensors are required to transmit vast amounts of
information to the processing stages, and to process it in real-time. One first
approach to reduce data transmission is to use event-based vision sensors,
whose pixels produce events only when there are changes in the input. However,
even for event-based vision, transmission and processing of visual data can be
quite onerous. Currently, these challenges are solved by using high-speed
communication links and powerful machine vision processing hardware. But if
resources are limited, instead of processing all the sensory information in
parallel, an effective strategy is to divide the visual field into several
small sub-regions, choose the region of highest saliency, process it, and shift
serially the focus of attention to regions of decreasing saliency. This
strategy, also commonly used by the visual systems of many animals, is typically
referred to as ``selective attention''. Here we present a digital architecture
implementing a saliency-based selective visual attention model for processing
asynchronous event-based sensory information received from a DVS. For ease of
prototyping, we use a standard digital design flow and map the architecture on
an FPGA. We describe the architecture block diagram, highlighting the efficient use of the available hardware resources, demonstrated through experimental results obtained with a hardware setup in which the FPGA is interfaced with the DVS camera.
Comment: 5 pages, 5 figures
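The divide-then-rank strategy described above can be sketched in software: accumulate events into a coarse saliency map, one bin per sub-region, then visit regions in order of decreasing saliency. This is a minimal software analogue of the idea, not the FPGA architecture itself; the grid and region sizes are illustrative.

```python
import numpy as np

def attention_order(events, grid=(4, 4), region=16):
    """Bin (x, y) events into a coarse saliency map covering a
    64x64 field (4x4 sub-regions of 16x16 pixels), then return the
    active sub-regions sorted by decreasing saliency (event count)."""
    sal = np.zeros(grid, dtype=int)
    for x, y in events:
        sal[y // region, x // region] += 1
    order = np.argsort(sal, axis=None)[::-1]  # flat indices, most salient first
    return [tuple(np.unravel_index(i, grid)) for i in order if sal.flat[i] > 0]

events = [(5, 5), (6, 5), (40, 40), (33, 35), (34, 36), (60, 2)]
print(attention_order(events))  # [(2, 2), (0, 0), (0, 3)]
```

Processing only the top-ranked region at a time is what keeps the downstream compute budget small when resources are limited.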
Yak: An Asynchronous Bundled Data Pipeline Description Language
The design of asynchronous circuits typically requires a judicious definition
of signals and modules, combined with a proper specification of their timing
constraints, which can be a complex and error-prone process, using standard
Hardware Description Languages (HDLs). In this paper we introduce Yak, a new
dataflow description language for asynchronous bundled data circuits. Yak
allows designers to generate Verilog and timing constraints automatically, from
a textual description of bundled data control flow structures and combinational
logic blocks. The timing constraints are generated using the Local Clock Set
methodology and can be consumed by standard industry tools. Yak includes
ergonomic language features such as structured bindings of channels undergoing
fork and join operations, named value scope propagation along channels, and
channel typing. Here we present Yak's language front-end and compare the
automated synthesis and layout results of an example circuit with a manual
constraint specification approach.
Real-time cortical simulations: energy and interconnect scaling on distributed systems
We profile the impact of computation and inter-processor communication on the
energy consumption and on the scaling of cortical simulations approaching the
real-time regime on distributed computing platforms. Also, the speed and energy
consumption of processor architectures typical of standard HPC and embedded
platforms are compared. We demonstrate the importance of the design of
low-latency interconnect for speed and energy consumption. The cost of cortical
simulations is quantified using the Joule per synaptic event metric on both
architectures. Reaching efficient real-time on large scale cortical simulations
is of increasing relevance for both future bio-inspired artificial intelligence
applications and for understanding the cognitive functions of the brain, a
scientific quest that will require embedding large-scale simulations into highly
complex virtual or real worlds. This work stands at the crossroads between the
WaveScalES experiment in the Human Brain Project (HBP), which includes the
objective of large scale thalamo-cortical simulations of brain states and their
transitions, and the ExaNeSt and EuroExa projects, that investigate the design
of an ARM-based, low-power High Performance Computing (HPC) architecture with a dedicated interconnect scalable to millions of cores; simulations of deep-sleep Slow Wave Activity (SWA) and Asynchronous aWake (AW) regimes expressed by thalamo-cortical models are among their benchmarks.
Comment: 8 pages, 8 figures, 4 tables, submitted after final publication in the PDP2019 proceedings, corrected final DOI. arXiv admin note: text overlap with arXiv:1812.04974, arXiv:1804.0344
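The Joule-per-synaptic-event figure of merit mentioned above is simply total energy divided by total synaptic events delivered. A minimal sketch, using illustrative (not measured) numbers:

```python
def joule_per_synaptic_event(avg_power_w, wall_time_s, synaptic_events):
    """Energy cost per synaptic event: total energy consumed during the
    run divided by the total number of synaptic events processed."""
    return avg_power_w * wall_time_s / synaptic_events

# Hypothetical example: a node drawing 200 W for 10 s of wall-clock time
# while delivering 2e9 synaptic events per second.
cost = joule_per_synaptic_event(200.0, 10.0, 2e9 * 10.0)
print(cost)  # 1e-07 J per synaptic event
```

Because the metric normalizes by workload rather than by time, it allows a fair comparison between the HPC and embedded processor architectures profiled in the paper.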
Gaussian and exponential lateral connectivity on distributed spiking neural network simulation
We measured the impact of long-range exponentially decaying intra-areal
lateral connectivity on the scaling and memory occupation of a distributed
spiking neural network simulator compared to that of short-range Gaussian
decays. While previous studies adopted short-range connectivity, recent
experimental neurosciences studies are pointing out the role of longer-range
intra-areal connectivity with implications on neural simulation platforms.
Two-dimensional grids of cortical columns composed of up to 11 million point-like spiking neurons with spike-frequency adaptation were connected by up to 30 billion synapses using short- and long-range connectivity models. The MPI processes composing the distributed simulator were run on up to 1024 hardware cores, hosted on a 64-node server platform. The hardware platform was a cluster of IBM NX360 M5 16-core compute nodes, each containing two Intel Xeon Haswell 8-core E5-2630 v3 processors clocked at 2.40 GHz, interconnected through an InfiniBand network equipped with 4x QDR switches.
Comment: 9 pages, 9 figures, added reference to the final peer-reviewed conference version and DOI
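The two connectivity profiles compared above differ sharply in their tails, which is what drives the memory-occupation difference. A minimal sketch of the two connection-probability kernels (the length scales below are illustrative, not the paper's parameters):

```python
import math

def gaussian_p(d, sigma):
    """Short-range Gaussian decay of connection probability with distance d."""
    return math.exp(-d * d / (2.0 * sigma * sigma))

def exponential_p(d, lam):
    """Longer-range exponential decay of connection probability."""
    return math.exp(-d / lam)

# At short range the Gaussian dominates, but its tail collapses much
# faster, so the exponential kernel reaches far more distant targets.
for d in (1.0, 3.0, 6.0):
    print(d, gaussian_p(d, 1.0), exponential_p(d, 1.0))
```

With matched length scales, each neuron under the exponential kernel keeps non-negligible connection probability at distances where the Gaussian is effectively zero, so per-process synapse lists and inter-process message fan-out both grow.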
A spatial contrast retina with on-chip calibration for neuromorphic spike-based AER vision systems
We present a 32×32 pixel contrast retina microchip that provides its output as an address event representation (AER) stream. Spatial contrast is computed as the ratio between the pixel photocurrent and a local average over neighboring pixels obtained with a diffuser network. This current-based computation produces a significant amount of mismatch between neighboring pixels, because the currents can be as low as a few picoamperes. Consequently, compact calibration circuitry has been included to trim each pixel. Measurements show a reduction in mismatch standard deviation from 57% to 6.6% (indoor light). The paper describes the design of the pixel with its spatial contrast computation and calibration sections. About one third of the pixel area is used for a 5-bit calibration circuit. The pixel area is 58 μm × 56 μm, and its current consumption is about 20 nA at a 1-kHz event rate. Extensive experimental results are provided for a prototype fabricated in a standard 0.35-μm CMOS process.
Funding: Gobierno de España TIC2003-08164-C03-01, TEC2006-11730-C03-01; European Union IST-2001-3412
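The ratio-based spatial contrast computation described above can be mimicked numerically: divide each pixel's value by the local average of its neighborhood, which stands in for the diffuser network. This is an illustrative model, not the chip's analog circuit; the neighborhood size and the test pattern are assumptions.

```python
import numpy as np

def spatial_contrast(photocurrent, kernel_size=3):
    """Per-pixel spatial contrast: photocurrent divided by the local
    neighborhood average (software stand-in for the diffuser network)."""
    img = np.asarray(photocurrent, dtype=float)
    pad = kernel_size // 2
    padded = np.pad(img, pad, mode="edge")  # replicate borders
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            local = padded[y:y + kernel_size, x:x + kernel_size]
            out[y, x] = img[y, x] / local.mean()
    return out

# A bright pixel on a uniform background yields contrast > 1 at its
# location and ~1 in flat regions.
I = np.array([[10.0, 10.0, 10.0],
              [10.0, 40.0, 10.0],
              [10.0, 10.0, 10.0]])
print(spatial_contrast(I))
```

Uniform regions map to a contrast near 1 regardless of absolute brightness, which is why the ratio formulation rejects global illumination level, and also why tiny photocurrents make the division so mismatch-sensitive on silicon.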
Scaling of a large-scale simulation of synchronous slow-wave and asynchronous awake-like activity of a cortical model with long-range interconnections
Cortical synapse organization supports a range of dynamic states on multiple
spatial and temporal scales, from synchronous slow wave activity (SWA),
characteristic of deep sleep or anesthesia, to fluctuating, asynchronous
activity during wakefulness (AW). Such dynamic diversity poses a challenge for
producing efficient large-scale simulations that embody realistic metaphors of
short- and long-range synaptic connectivity. In fact, during SWA and AW
different spatial extents of the cortical tissue are active in a given timespan
and at different firing rates, which implies a wide variety of loads of local
computation and communication. A balanced evaluation of simulation performance
and robustness should therefore include tests of a variety of cortical dynamic
states. Here, we demonstrate performance scaling of our proprietary Distributed
and Plastic Spiking Neural Networks (DPSNN) simulation engine in both SWA and
AW for bidimensional grids of neural populations, which reflects the modular
organization of the cortex. We explored networks of up to 192x192 modules, each
composed of 1250 integrate-and-fire neurons with spike-frequency adaptation,
and exponentially decaying inter-modular synaptic connectivity with varying
spatial decay constant. For the largest networks the total number of synapses
was over 70 billion. The execution platform included up to 64 dual-socket nodes, each socket hosting an 8-core Intel Xeon Haswell processor clocked at 2.40 GHz. Network initialization time, memory usage, and execution time
showed good scaling performances from 1 to 1024 processes, implemented using
the standard Message Passing Interface (MPI) protocol. We achieved simulation
speeds of between 2.3x10^9 and 4.1x10^9 synaptic events per second for both
cortical states in the explored range of inter-modular interconnections.
Comment: 22 pages, 9 figures, 4 tables
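The reported throughput can be related to the real-time regime discussed earlier: the network generates synapses × mean firing rate synaptic events per biological second, and dividing that by the simulator throughput gives the slowdown factor. The 3 Hz mean firing rate below is an assumed illustrative value, not a figure from the paper.

```python
def realtime_factor(n_synapses, mean_rate_hz, sim_events_per_s):
    """Ratio of biological time to wall-clock time: synaptic events
    generated per biological second divided by simulator throughput.
    A value of 1.0 means the simulation runs in real time."""
    bio_events_per_s = n_synapses * mean_rate_hz
    return bio_events_per_s / sim_events_per_s

# 70 billion synapses at an assumed 3 Hz mean rate, simulated at the
# reported peak of 4.1e9 synaptic events per second.
print(realtime_factor(70e9, 3.0, 4.1e9))  # ~51x slower than real time
```

Estimates like this make explicit how far a given configuration sits from the real-time target, and how the gap scales with firing rate and network size.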