Abstract-Image processing for intelligent cameras like those used in video surveillance applications implies computationally demanding algorithms activated as a function of unpredictable events, such as the content of the image or user requests. For such applications, hardwired acceleration must be restricted to a minimum subset of kernels, due to the increasing NREs when application updates become necessary. Embedded reconfigurable processors, coupling in the same computing engine a general-purpose embedded processor and field-programmable fabrics, provide an appealing trade-off point between pure software and dedicated hardware acceleration. As a case study, this paper presents the implementation of a set of image processing operators utilized for motion detection on the DREAM adaptive DSP. With respect to pure software solutions, the proposed implementation achieves a performance improvement of 2-3 orders of magnitude, while retaining the same degree of programmability and the same economical perspectives from the end-user point of view.

[...] hardware acceleration. In this case, the operator would often be inactive while occupying a large share of the technology resource. A relevant feature of reconfigurable processors such as the one presented in this paper is to provide very fast and efficient dynamic reconfiguration, allowing the user to extensively exploit time multiplexing over a given set of silicon resources. For intelligent cameras, reconfigurable processors provide an appealing alternative. They allow intensive run-time re-use of the acceleration logic, properly configured for the specific (and possibly event-driven) required task.

In this paper, a motion detection algorithm requiring more than 10 different basic operation kernels is considered as a case study in order to show the possibilities offered by reconfigurable computing, and to analyze the different design trade-offs. A key point of this work is that impressive performance improvements with respect to a software solution (2-3 orders of magnitude) were achieved after (roughly) 1 month of work, mostly performed by an inexperienced MSc student.

The work presented in this paper is done within the MORPHEUS project (IST FP6, project [...]

[...] Figure 1, which highlights the three basic concepts of DREAM:
* a reconfigurable datapath in charge of speeding up the computationally intensive parts of the application;
* a programmable high-bandwidth memory architecture [...]
[...] on the main processor of the SoC.

In the case of DREAM, the reconfigurable datapath is implemented by the Pipelined Configurable Gate Array (PiCoGA), a 4-context array of 16x24 reconfigurable logic cells (RLC). The array performs computation according to a dataflow paradigm. It is programmed using a simplified C syntax termed Griffy-C [3], which describes Data Flow Graphs while hiding technology and implementation details from the user. The aim of this design choice is to allow software programmers to develop applications on the device without specific hardware design skills. [...] in the test case presented here.

III. MOTION DETECTION ALGORITHM

The motion detection algorithm provides the capability to detect a human, a vehicle or an object in movement with respect to a static background. This could be useful for [...] the intelligent camera.

[...] absolute pixel-to-pixel difference between the current frame [...] Even if this differentiation allows the object under analysis (if any) to be isolated, too many details are still present, since the complete grayscale is retained. For that reason, binarization is applied to the frame. Given a threshold conventionally fixed to 0.3 x the maximum value of the pixels, binarization returns a TopValue if the current pixel is greater than the threshold, and a BottomValue otherwise. The resulting image could still be affected by noise (spurious pixels) or, on the contrary, by some holes. This "cleaning" task is accomplished by the opening phase, implemented by two operators: [...]

[...] where E(x, y) is the pixel under elaboration, p(h, k) represents the pixel in the 3x3 matrix centered in (x, y), and K is the Sobel matrix, for horizontal and vertical edge detection, defined as: [...]

The gradient is then computed approximating the combination of the two components, as in the following formula: [...]

The implementation, or more generally the acceleration, of the above described application on the reconfigurable platform is driven by two main factors: [...]

[...] high throughput we need to be able to properly pipeline computation on PiCoGA. We concurrently compute on three different rows at a time for all the computations, since most of the operations require pixels from 3 adjacent rows:
* erosion, dilatation and edge detection read data from 3 adjacent rows and provide the result for the row in the middle. In this case, since each pixel is represented by [...]

Since each pixel can be represented by 1 bit, the result of horizontal and vertical Sobel convolution is in the range [...]

* the erosion phase requires the search of the minimum within the pixels in a 3x3 matrix, but is implemented on DREAM by a single 9-bit input 1-bit output AND;
* the dilatation phase requires the search of the maximum within the pixels in a 3x3 matrix, but is implemented on DREAM by a single 9-bit input 1-bit output OR;
* edge detection can be strongly simplified as explained in [...]
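The reduction of erosion and dilatation to a 9-input AND and a 9-input OR on 1-bit pixels can be illustrated with a small software model. This is only a sketch in plain Python, not the Griffy-C code used on PiCoGA. The function names, the 0/1 encoding of BottomValue/TopValue, the use of a background frame as the second operand of the difference, and the border handling (out-of-image pixels read as the neutral element of each operator) are all assumptions made for illustration.

```python
# Illustrative software model only; names, the 0/1 encoding and the border
# handling are assumptions, not taken from the DREAM implementation.

TOP, BOTTOM = 1, 0  # assumed 1-bit encoding of TopValue / BottomValue


def abs_diff(current, background):
    """Absolute pixel-to-pixel difference between two equally sized frames."""
    return [[abs(c - b) for c, b in zip(row_c, row_b)]
            for row_c, row_b in zip(current, background)]


def binarize(frame, ratio=0.3):
    """Return TOP where the pixel exceeds ratio * max pixel, BOTTOM elsewhere."""
    threshold = ratio * max(max(row) for row in frame)
    return [[TOP if p > threshold else BOTTOM for p in row] for row in frame]


def window3x3(img, y, x, pad):
    """List the 9 pixels centered on (y, x); pixels outside the image read as pad."""
    h, w = len(img), len(img[0])
    return [
        img[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else pad
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ]


def erode(img):
    """3x3 minimum; on 1-bit pixels this is the AND of the 9 inputs."""
    return [[int(all(window3x3(img, y, x, pad=1))) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    """3x3 maximum; on 1-bit pixels this is the OR of the 9 inputs."""
    return [[int(any(window3x3(img, y, x, pad=0))) for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img):
    """Opening = erosion followed by dilatation; removes isolated pixels."""
    return dilate(erode(img))
```

In this model each output pixel of `erode` and `dilate` depends only on pixels from three adjacent rows, which is the access pattern that the row-parallel pipelining relies on.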

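For reference, a textbook form of the Sobel operator written with the symbols used in the text (E, p, K) is sketched below. The index convention p(x+i, y+j), the assignment of the two kernels to the horizontal and vertical directions, and the sum-of-absolute-values approximation of the gradient are assumptions based on the standard operator and may differ from the formulation adopted in this work.

```latex
E_{d}(x, y) = \sum_{i=-1}^{1} \sum_{j=-1}^{1} K_{d}(i, j)\, p(x+i,\, y+j),
\qquad d \in \{h, v\}

K_{h} = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix},
\qquad
K_{v} = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}

|G(x, y)| \approx |E_{h}(x, y)| + |E_{v}(x, y)|
```

Under these assumptions, with 1-bit pixels each component is bounded by the sum of the positive kernel weights, so -4 <= E_d(x, y) <= 4 and a 4-bit signed value is sufficient for each convolution result.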