3,460 research outputs found
High volume colour image processing with massively parallel embedded processors
Currently Oc´e uses FPGA technology for implementing colour image processing for their high volume colour printers. Although FPGA technology provides enough performance it, however, has a rather tedious development process. This paper describes the research conducted on an alternative implementation technology: software defined massively parallel processing. It is shown that this technology not only leads to a reduction in development time but also adds flexibility to the design
FPGA applications in signal and image processing
The increasing demand for real-time and smart digital signal processing (DSP) systems, calls for a better platform for their implementation. Most of these systems (e.g. digital image processing) are highly parallelisable, memory and processor hungry; such that the increasing performance of today�s general-purpose microprocessors are no longer able to handle them. A highly parallel hardware architecture, which offers enough memory resources, offers an alternative for such DSP implementations
A fully parameterized virtual coarse grained reconfigurable array for high performance computing applications
Field Programmable Gate Arrays (FPGAs) have proven their potential in accelerating High Performance Computing (HPC) Applications. Conventionally such accelerators predominantly use, FPGAs that contain fine-grained elements such as LookUp Tables (LUTs), Switch Blocks (SB) and Connection Blocks (CB) as basic programmable logic blocks.
However, the conventional implementation suffers from high reconfiguration and development costs. In order to solve this problem, programmable logic components are defined at a virtual higher abstraction level. These components are called Processing Elements (PEs) and the group of PEs along with the inter-connection network form an architecture called a Virtual Coarse-Grained Reconfigurable Array (VCGRA). The abstraction helps to reconfigure the PEs faster at the intermediate level than at the lower-level of an FPGA.
Conventional VCGRA implementations (built on top of the lower levels of the FPGA) use functional resources such as LUTs to establish required connections (intra-connect) within a PE. In this paper, we propose to use the parameterized reconfiguration technique to implement the intra-connections of each PE with the aim to reduce the FPGA resource utilization (LUTs). The technique is used to parameterize the intra-connections with parameters that only change their value infrequently (whenever a new VCGRA function has to be reconfigured) and that are implemented as constants. Since the design is optimized for these constants at every moment in time, this reduces the resource utilization. Further, interconnections (network between the multiple PEs) of the VCGRA grid can also be parameterized so that both the inter- and intraconnect network of the VCGRA grid can be mapped onto the physical switch blocks of the FPGA. For every change in parameter values a specialized bitstream is generated on the fly and the FPGA is reconfigured using the parameterized run-time reconfiguration technique. Our results show a drastic reduction in FPGA LUT resource utilization in the PE by at least 30% and in the intra-network of the PE by 31% when implementing an HPC application
Deep Learning-Based Multiple Object Visual Tracking on Embedded System for IoT and Mobile Edge Computing Applications
Compute and memory demands of state-of-the-art deep learning methods are
still a shortcoming that must be addressed to make them useful at IoT
end-nodes. In particular, recent results depict a hopeful prospect for image
processing using Convolutional Neural Netwoks, CNNs, but the gap between
software and hardware implementations is already considerable for IoT and
mobile edge computing applications due to their high power consumption. This
proposal performs low-power and real time deep learning-based multiple object
visual tracking implemented on an NVIDIA Jetson TX2 development kit. It
includes a camera and wireless connection capability and it is battery powered
for mobile and outdoor applications. A collection of representative sequences
captured with the on-board camera, dETRUSC video dataset, is used to exemplify
the performance of the proposed algorithm and to facilitate benchmarking. The
results in terms of power consumption and frame rate demonstrate the
feasibility of deep learning algorithms on embedded platforms although more
effort to joint algorithm and hardware design of CNNs is needed.Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessibl
A single-chip FPGA implementation of real-time adaptive background model
This paper demonstrates the use of a single-chip
FPGA for the extraction of highly accurate background
models in real-time. The models are based
on 24-bit RGB values and 8-bit grayscale intensity
values. Three background models are presented, all
using a camcorder, single FPGA chip, four blocks
of RAM and a display unit. The architectures have
been implemented and tested using a Panasonic NVDS60B
digital video camera connected to a Celoxica
RC300 Prototyping Platform with a Xilinx Virtex
II XC2v6000 FPGA and 4 banks of onboard RAM.
The novel FPGA architecture presented has the advantages
of minimizing latency and the movement of
large datasets, by conducting time critical processes
on BlockRAM. The systems operate at clock rates
ranging from 57MHz to 65MHz and are capable
of performing pre-processing functions like temporal
low-pass filtering on standard frame size of 640X480
pixels at up to 210 frames per second
Real-time human action recognition on an embedded, reconfigurable video processing architecture
Copyright @ 2008 Springer-Verlag.In recent years, automatic human motion recognition has been widely researched within the computer vision and image processing communities. Here we propose a real-time embedded vision solution for human motion recognition implemented on a ubiquitous device. There are three main contributions in this paper. Firstly, we have developed a fast human motion recognition system with simple motion features and a linear Support Vector Machine (SVM) classifier. The method has been tested on a large, public human action dataset and achieved competitive performance for the temporal template (eg. “motion history image”) class of approaches. Secondly, we have developed a reconfigurable, FPGA based video processing architecture. One advantage of this architecture is that the system processing performance can be reconfiured for a particular application, with the addition of new or replicated processing cores. Finally, we have successfully implemented a human motion recognition system on this reconfigurable architecture. With a small number of human actions (hand gestures), this stand-alone system is performing reliably, with an 80% average recognition rate using limited training data. This type of system has applications in security systems, man-machine communications and intelligent environments.DTI and Broadcom Ltd
- …