17,748 research outputs found
Time-Shared Execution of Realtime Computer Vision Pipelines by Dynamic Partial Reconfiguration
This paper presents an FPGA runtime framework that demonstrates the
feasibility of using dynamic partial reconfiguration (DPR) for time-sharing an
FPGA by multiple realtime computer vision pipelines. The presented time-sharing
runtime framework manages an FPGA fabric that can be round-robin time-shared by
different pipelines at the time scale of individual frames. In this new
use-case, the challenge is to achieve useful performance despite high
reconfiguration time. The paper describes the basic runtime support as well as
four optimizations necessary to achieve realtime performance given the
limitations of DPR on today's FPGAs. The paper provides a characterization of a
working runtime framework prototype on a Xilinx ZC706 development board. The
paper also reports the performance of realtime computer vision pipelines when
time-shared
FASTCUDA: Open Source FPGA Accelerator & Hardware-Software Codesign Toolset for CUDA Kernels
Using FPGAs as hardware accelerators that communicate with a central CPU is becoming a common practice in the embedded design world but there is no standard methodology and toolset to facilitate this path yet. On the other hand, languages such as CUDA and OpenCL provide standard development environments for Graphical Processing Unit (GPU) programming. FASTCUDA is a platform that provides the necessary software toolset, hardware architecture, and design methodology to efficiently adapt the CUDA approach into a new FPGA design flow. With FASTCUDA, the CUDA kernels of a CUDA-based application are partitioned into two groups with minimal user intervention: those that are compiled and executed in parallel software, and those that are synthesized and implemented in hardware. A modern low power FPGA can provide the processing power (via numerous embedded micro-CPUs) and the logic capacity for both the software and hardware implementations of the CUDA kernels. This paper describes the system requirements and the architectural decisions behind the FASTCUDA approach
An FPGA-based infant monitoring system
We have designed an automated visual surveillance system for monitoring sleeping infants. The low-level image
processing is implemented on an embedded Xilinx’s Virtex
II XC2v6000 FPGA and quantifies the level of scene activity using a specially designed background subtraction algorithm. We present our algorithm and show how we have
optimised it for this platform
- …