Abstract - This paper presents a hardware implementation for high-speed, event-based data processing. A full-custom Address-Event (AER) processing system (GAEP) features a sensor data interface with 10 ns resolution and a peak/sustained event rate of 33M/5.125M events·s⁻¹ for precision time-stamping of asynchronous sensor data. It implements hardware-accelerated event pre-processing, including rate-dependent IRQ generation and address masking for ROI/RONI. The pre-processing functions are implemented in dedicated hardware and operate without loading the actual processor device, a SPARC-compatible general-purpose microprocessor. The complete SoC is implemented in 0.18 µm standard CMOS technology. We present a camera system comprising the AER processor and a bio-inspired dynamic vision sensor in an exemplary high-speed vision application related to shape detection and object recognition. Relevant details of the system architecture are presented, together with performance results characterizing the vision system in a real-world machine vision application.
I. INTRODUCTION
Conventional high-speed machine-vision systems that work on standard gray-scale image data face problems of data rate, data volume and processing complexity in applications such as shape detection, object classification, object orientation extraction and measurement tasks. Many of these applications would benefit from the precise but sparse information on the edges of fast-moving objects delivered by a temporal contrast dynamic vision sensor (DVS) [1]-[5]. These sensors use a pixel circuit that operates autonomously and responds with low latency to relative illumination changes. Typically, it is mainly information on the contours of objects that is recorded, at high temporal resolution, and transmitted to a processing device. The common notion, originating in the conventional frame-based style of imaging, that high temporal resolution automatically implies high data rates and volume is disproved by this type of vision device.
Shape detection, object classification and object orientation extraction on event-based DVS data have been presented previously. In these prior implementations, the event processing algorithms were either executed on standard workstation computers or commercial microprocessor systems [6]-[10], or asynchronous event processing was implemented on multiple distributed AER systems, e.g. [11]. The embedded system presented here [12] uses dedicated VLSI hardware for event-based sensor data processing and demonstrates high-performance machine vision in a compact and cost-effective implementation.
II. VISION SYSTEM
The camera hardware is based on a biomimetic asynchronous time-based image sensor [4] and a full-custom SPARC-compatible address-event processor with hardware-accelerated AER data interface and pre-processing [13].
A. Image Sensor
Attempts to mimic the magno-cellular ('transient') pathway have recently been a line of activity in neuromorphic vision and have led to the development of the 'Dynamic Vision Sensor' (DVS) [1]-[5]. This type of visual sensor is sensitive to the dynamic information provided by the scene and disregards the sustained information, automatically suppressing static background. The sensor used in the presented application is a QVGA array DVS with time-based (PWM) imaging functionality [4], [5]. In the context of this vision application, only the DVS events are used and processed.
The ROI/RONI functionality of the sensor allows the selection of arbitrary regions of the array, enabling single or multiple line mode operation. The first (and also the last) line of the array has been displaced by half the pixel pitch with respect to the rest of the array. Selecting the first two pixel lines for line-mode operation yields a double-resolution (608 pixel) line. For the application of detection and classification of moving objects presented here, the sensor has been configured in double-resolution line mode.
B. Address-Event Processor - GAEP
The processing device of the embedded system is a general-purpose SPARC-compatible processor based on a LEON3 core [14]. The GAEP SoC is equipped with a full-custom address-event data interface that conducts asynchronous data acquisition, synchronization and time-stamping along with advanced, hardware-accelerated pre-processing features without loading any processing tasks onto the actual processor core [13]. The main functional blocks of the GAEP sensor interface (SIF) [15] include data transfer and receipt acknowledgement, data rate measurement, data filtering, timestamp assignment and input data buffer management. The asynchronous AER bus is directly connected to the address-event interface (AE IF); the bus width is limited in hardware to 20 bits. The transfer of the address-event data from the SIF to the input data buffer is implemented as a DMA process and does not require any interaction from the processor core.
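The filtering and time-stamping stages listed above can be pictured with a small software model. This is only a sketch under stated assumptions: the mask/match selection scheme, the `invert` flag for RONI, and the function name `preprocess` are hypothetical and are not taken from the GAEP design; only the 20-bit address width comes from the text.

```python
from typing import Iterable, List, Tuple

ADDRESS_BITS = 20                      # AER bus width stated in the text
ADDRESS_MAX = (1 << ADDRESS_BITS) - 1


def preprocess(
    events: Iterable[Tuple[int, int]],  # (arrival time in ns, raw address)
    ts_period_ns: int,                  # time-stamp period, e.g. 10 ns
    mask: int,                          # hypothetical: address bits to compare
    match: int,                         # hypothetical: required value of those bits
    invert: bool = False,               # False: keep the region (ROI); True: drop it (RONI)
) -> List[Tuple[int, int]]:
    """Return (time stamp in ticks, address) for every event that passes the filter."""
    out = []
    for arrival_ns, address in events:
        address &= ADDRESS_MAX                      # truncate to the bus width
        selected = (address & mask) == match        # does the event lie in the region?
        if selected != invert:                      # ROI keeps selected, RONI keeps the rest
            out.append((arrival_ns // ts_period_ns, address))
    return out


events = [(5, 0x00010), (25, 0x10010), (31, 0x00011)]
roi_events = preprocess(events, ts_period_ns=10, mask=0xF0000, match=0x00000)
roni_events = preprocess(events, ts_period_ns=10, mask=0xF0000, match=0x00000, invert=True)
```

In the actual device these steps run in dedicated hardware and the results reach memory by DMA; the model only illustrates the data flow.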
C. Data-Rate Measurement
In the "Data Rate Measurement" block of the SIF, the number of events occurring during a configurable measurement interval T_MI is continuously monitored and evaluated. Fig. 1 shows a block diagram of the unit. The circuit is based on two digital counters and data threshold comparators. Overrun and underrun thresholds are programmable separately. The number of events counted during each T_MI slot is stored as the current data rate value DR_AE and is used to control IRQ generation and the input data buffer. The processing unit has direct access to these data. The measurement interval T_MI is derived from the system clock frequency f_clk and is related to the time-stamp period T_TS through the parameters TM_factor and TS_resolution, each of which is programmable in a range between 1 and (2^16 - 1). For example, given a system clock frequency f_clk of 100 MHz, the minimum time-stamp resolution evaluates to 10 ns. Coupling the time-stamp period T_TS to the factor TM_factor has the advantage of keeping the measurement duration and the time stamps synchronous, thereby avoiding time skew.
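The equation relating the two intervals is not reproduced in this text. One reading that is consistent with the parameter names and with the 10 ns figure, offered as an assumption rather than as the original formula, is:

```latex
T_{TS} = \frac{TS_{resolution}}{f_{clk}},
\qquad
T_{MI} = TM_{factor} \cdot T_{TS} = \frac{TM_{factor} \cdot TS_{resolution}}{f_{clk}}
```

Under this reading, f_clk = 100 MHz and TS_resolution = 1 give T_TS = 10 ns, and a measurement interval of 100 µs would correspond to TM_factor = 10000, which lies within the programmable range.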
In Fig. 2, the data-rate-dependent system control feature is illustrated on the basis of a simple example. Part "A" of the figure shows a sketch of an object moving through the field of view of a temporal-contrast AER vision sensor that responds to moving edges. The data representation corresponds to the temporal AE recording of one line of sensor elements. The data rate DR_AE is computed from the data stream as depicted in part "B" (zero, one and two events per measurement period, respectively). By introducing an upper threshold value "Overrun Threshold" and a lower threshold value "Underrun Threshold" (e.g. 0.5 and 1) for the input data rate, the dynamic contents of the scene can be evaluated. In "C", the generated IRQs are symbolized by arrows and the action taken is noted above each arrow. The processor is controlled based on the sequence of IRQs. In the described application, for example, the sequence UR-IRQ (n * OR-IRQ) UR-IRQ starts the shape detection algorithm with a latency of only one T_MI.
III. SETUP AND DATA ACQUISITION
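The rate-dependent control described above can be mimicked in software as follows. This is a minimal sketch, assuming that an OR-IRQ is raised in every interval whose event count exceeds the overrun threshold and a UR-IRQ in every interval whose count falls below the underrun threshold; the exact trigger conditions of the hardware, the function names and the threshold values used here are illustrative assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def rate_irqs(counts: Sequence[int], overrun: float, underrun: float) -> List[Optional[str]]:
    """Map per-interval event counts to "OR", "UR" or None (no IRQ)."""
    irqs: List[Optional[str]] = []
    for count in counts:
        if count > overrun:
            irqs.append("OR")
        elif count < underrun:
            irqs.append("UR")
        else:
            irqs.append(None)
    return irqs


def find_objects(irqs: Sequence[Optional[str]]) -> List[Tuple[int, int]]:
    """Find UR, (n * OR), UR sequences with n >= 1.

    Returns (index of the first OR interval, index of the closing UR interval).
    """
    objects = []
    armed = False   # True once a UR-IRQ has marked the scene as idle
    start = None    # interval index of the first OR-IRQ of the current candidate
    for i, irq in enumerate(irqs):
        if irq == "UR":
            if armed and start is not None:
                objects.append((start, i))   # closing UR: object has passed
            armed = True
            start = None
        elif irq == "OR" and armed and start is None:
            start = i
    return objects


counts = [0, 0, 1, 2, 2, 1, 0, 0, 2, 0]          # events per measurement interval (made up)
irqs = rate_irqs(counts, overrun=1, underrun=1)   # illustrative thresholds
objects = find_objects(irqs)
```

The closing UR-IRQ is available one measurement interval after the last active interval, which is the latency of one T_MI mentioned in the text.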
To test the presented shape recognition concept, an embedded smart camera [12] comprising the ATIS sensor and the GAEP was mounted above a spinning disc carrying high-contrast shapes printed on paper in different orientations. The size of the shapes was of the order of 2 to 3 cm, and the sensor's field of view was aligned to about twice this size. The disc was rotated at about 3 rotations per second, yielding an object speed of ca. 3.2 m·s⁻¹; translated to the focal plane, the shapes passed over one pixel line in about 10 ms.
In Fig. 3, AE data acquired by the double-resolution pixel line of the ATIS sensor are shown together with the stimulus shapes. The second column in the figure contains the result of the temporal contrast detection as realized by the DVS pixels, plotted as 2D AE data sets. Each dot in the figure represents an AE with its 'y' pixel coordinate on the vertical axis and its time stamp on the horizontal axis.
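As a rough plausibility check of these figures, the crossing time follows directly from shape size and speed. The radius below is inferred from the stated speed and rotation rate and is not a value given in the setup description:

```latex
r \approx \frac{v}{2\pi f_{rot}} = \frac{3.2\,\mathrm{m\,s^{-1}}}{2\pi \cdot 3\,\mathrm{s^{-1}}} \approx 0.17\,\mathrm{m},
\qquad
t_{pass} \approx \frac{d}{v} = \frac{0.03\,\mathrm{m}}{3.2\,\mathrm{m\,s^{-1}}} \approx 9.4\,\mathrm{ms}
```

A 3 cm shape thus crosses the pixel line in roughly 9 ms and a 2 cm shape in roughly 6 ms, which agrees in order of magnitude with the quoted value of about 10 ms.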
IV. SHAPE DETECTION AND CLASSIFICATION
Based on the instantaneous data rate and controlled by the rate-dependent IRQ signals, simple shapes can be detected and classified directly in hardware with a minimum amount of processing done in the actual processor. In addition, a measure of the orientation of a known shape can be derived.
The shape detection is performed, in principle, in four processing steps:
a) Edge detection
b) Object detection
c) Projection of the edge data
d) Classification by template matching
Step a) is performed in hardware by the DVS sensor pixel. In contrast to previous work [6]-[8], where shape detection based on TAE projection data has been shown, steps b) and c) are now also performed in hardware by the data-rate measurement unit in the SIF as described in section II.C. Fig. 3 shows the intermediate results of the individual processing steps. In the last column, the time-dependent data rate (compare Fig. 2) is shown. The histograms are calculated by the SIF hardware from the AE data (second column). The time slots (histogram bins) can be adjusted (e.g. 100 µs in this example). The histograms provide a characteristic 'footprint' of the respective shape and its orientation by projecting AE activity, containing the object's edge information, along the x-axis, thereby realizing processing step c).
Fig. 3. Shapes, DVS-acquired TAE data sets, prototypical templates and measured histograms of data rates. The SAD result for each of the test data sets is displayed as a vector. The correct SAD result for the shape is enclosed in brackets and the winning SAD is highlighted by bold print.
In particular, the number of edges and their angles are contained in these 'histogram' data. The steeper the angle of the edge with respect to the direction of object motion, the higher the instantaneous data rate. This is visualized in the figure by marking the sections of different data rates (letters A, B and C) and showing an idealized 'template' histogram for each of the shapes and orientations. Each step in such a histogram data set represents a change in the corresponding edge angles. For example, in shape 1, which represents a square in a general orientation, we find sections A and C, characterized by the relatively high data rates stemming from the leading and trailing edges of the square, and a section B with lower data rate, stemming from the side edges. The prototypical templates, corresponding to different shapes and orientations, have been generated manually from scratch, and we will see later that they match the measured data well. However, it has to be noted that this concept is not a rotation-invariant shape classification process, because it recognizes features of shape and orientation simultaneously. Furthermore, the histograms are not unique for all shapes in all possible orientations. For example, a square rotated by 45° and an isosceles triangle standing on its base feature a similar constant data rate profile and therefore cannot be distinguished by the proposed method if only relative rates are considered.
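The projection of step c) amounts to counting events per time slot while ignoring the pixel coordinate. A minimal software sketch of this counting is given below; the function name and the sample events are illustrative assumptions, and in the described system the counting is done by the SIF hardware rather than in software.

```python
from typing import List, Sequence, Tuple


def rate_histogram(events: Sequence[Tuple[int, int]], bin_width_us: int = 100) -> List[int]:
    """Count events per time slot.

    events: (time stamp in microseconds, pixel coordinate); the pixel
    coordinate is discarded, which is the projection onto the time axis.
    """
    if not events:
        return []
    timestamps = [t for t, _pixel in events]
    t0 = min(timestamps)
    n_bins = (max(timestamps) - t0) // bin_width_us + 1
    histogram = [0] * n_bins
    for t in timestamps:
        histogram[(t - t0) // bin_width_us] += 1
    return histogram


# Made-up events: two in the first 100 us slot, one in the second, three in the third.
sample = [(0, 10), (40, 12), (120, 300), (250, 5), (260, 7), (299, 9)]
hist = rate_histogram(sample)
```

Steep edges relative to the direction of motion put many events into a single slot, which is why the bin heights encode edge angle.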
The AE data rate histograms are utilized to implement a simple and fast shape classification algorithm.
Step d) of this process, the matching of the measured data rate histogram with the templates, is implemented in software and runs on the embedded GAEP processor. This processing step starts by scaling the measured data rate histogram in magnitude and time to the size of the prototype templates, which are 128 bins wide and have a mean value of about 100 events per bin. In a second step, the sum of absolute differences (SAD) between the scaled histogram and each template is calculated point-wise. The template with the minimum SAD value wins and determines the detected shape.
Information about the orientation of a known shape can be derived from the ratios of the different sections (A, B, C, …) in the AE data rate histograms (e.g. shapes 1 and 2 in Fig. 3). However, this algorithm has not yet been implemented in the GAEP processor and no results are reported here. Fig. 3 shows the typical data sets and SAD matching results for the 6 templates. SAD results are given as vectors, with the correct result enclosed in brackets and the winning SAD highlighted by bold print. Correct SAD results are achieved for all shapes. The difference between the winning SAD and the next nearest match can be taken as a measure of the confidence of the decision, whereas the absolute value of the winning SAD is a measure of matching quality. Winning SAD values above 10000, for example, should not be considered a correct match at all. The quality of the match is also influenced by correct rescaling of the data.
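The two steps of the matching procedure can be sketched as follows. This is an illustration under stated assumptions, not the GAEP software: linear interpolation is assumed for the time rescaling, the magnitude is normalized to a mean of exactly 100, and the two templates are synthetic stand-ins rather than the templates of Fig. 3.

```python
from typing import Dict, List, Sequence, Tuple


def rescale(hist: Sequence[float], n_bins: int = 128, target_mean: float = 100.0) -> List[float]:
    """Resample to n_bins by linear interpolation, then normalize the mean."""
    if len(hist) < 2:
        raise ValueError("need at least two bins")
    resampled = []
    for i in range(n_bins):
        pos = i * (len(hist) - 1) / (n_bins - 1)   # position in the source histogram
        lo = int(pos)
        hi = min(lo + 1, len(hist) - 1)
        frac = pos - lo
        resampled.append(hist[lo] * (1 - frac) + hist[hi] * frac)
    mean = sum(resampled) / n_bins
    if mean == 0:
        raise ValueError("empty histogram")
    return [v * target_mean / mean for v in resampled]


def sad(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences, computed point-wise."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(hist: Sequence[float], templates: Dict[str, Sequence[float]]) -> Tuple[str, Dict[str, float]]:
    """Return the name of the template with minimum SAD and all SAD values."""
    scaled = rescale(hist)
    scores = {name: sad(scaled, template) for name, template in templates.items()}
    return min(scores, key=scores.get), scores


# Synthetic templates for illustration only.
templates = {
    "flat": [100.0] * 128,
    "ramp": rescale(list(range(1, 129))),
}
winner, scores = classify([5] * 40, templates)   # a constant-rate measurement
```

Because both the measured histogram and the templates are normalized to the same width and mean, the SAD compares profile shape rather than absolute event counts or object speed.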
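The confidence and quality measures described above can be expressed as a small decision rule. This is a sketch only: the function name is hypothetical and the SAD values in the example are made up, not taken from Fig. 3; the rejection limit of 10000 is the one quoted in the text.

```python
from typing import Dict, Optional, Tuple


def decide(scores: Dict[str, float], reject_above: float = 10000) -> Tuple[Optional[str], float]:
    """Return (winning shape or None, margin to the next nearest match).

    scores must contain SAD values for at least two templates.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])
    best_name, best_sad = ranked[0]
    second_sad = ranked[1][1]
    margin = second_sad - best_sad          # confidence of the decision
    if best_sad > reject_above:             # poor matching quality: no valid match
        return None, margin
    return best_name, margin


accepted = decide({"square": 1200, "triangle": 4800, "circle": 7600})
rejected = decide({"square": 12000, "triangle": 15000})
```

A small margin signals an ambiguous decision even when the winning SAD itself is low, so both values are worth reporting.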
V. RESULTS AND DISCUSSION
The shape detection algorithm implementing processing step d) is software-coded and runs on the GAEP LEON3 processor core. In the current (non-optimized) software implementation, a shape detection process is completed within about 4 ms. Owing to the described hardware acceleration, processing steps a) to c) run concurrently with step d) and do not load the processor core, so that the 4 ms of step d) alone determine the throughput, yielding a processing/detection rate of about 250 shapes per second.
VI. CONCLUSION
An embedded vision device based on a biomimetic event-driven dynamic vision sensor and a full-custom address-event processor SoC achieves very competitive performance figures by implementing simple vision tasks in dedicated hardware. Shape detection and classification has been demonstrated at hundreds of objects per second on a compact, low-power embedded system. Orientation extraction for known shapes can be achieved at similar speeds. Target application areas are e.g. high-speed/high-temporal-resolution machine vision for industrial applications and robotics.
VII. ACKNOWLEDGEMENT
This work was supported by the "eMorph" project, sponsored by the European Commission under Grant Agreement No. 231467.
