294 research outputs found
A Future for Integrated Diagnostic Helping
International audienceMedical systems used for exploration or diagnostic helping impose high applicative constraints such as real time image acquisition and displaying. A large part of computing requirement of these systems is devoted to image processing. This chapter provides clues to transfer consumers computing architecture approaches to the benefit of medical applications. The goal is to obtain fully integrated devices from diagnostic helping to autonomous lab on chip while taking into account medical domain specific constraints.This expertise is structured as follows: the first part analyzes vision based medical applications in order to extract essentials processing blocks and to show the similarities between consumerâs and medical vision based applications. The second part is devoted to the determination of elementary operators which are mostly needed in both domains. Computing capacities that are required by these operators and applications are compared to the state-of-the-art architectures in order to define an efficient algorithm-architecture adequation. Finally this part demonstrates that it's possible to use highly constrained computing architectures designed for consumers handled devices in application to medical domain. This is based on the example of a high definition (HD) video processing architecture designed to be integrated into smart phone or highly embedded components. This expertise paves the way for the industrialisation of intergraded autonomous diagnostichelping devices, by showing the feasibility of such systems. Their future use would also free the medical staff from many logistical constraints due the deployment of todayâs cumbersome systems
The Design of a System Architecture for Mobile Multimedia Computers
This chapter discusses the system architecture of a portable computer, called Mobile Digital Companion, which provides support for handling multimedia applications energy efficiently. Because battery life is limited and battery weight is an important factor for the size and the weight of the Mobile Digital Companion, energy management plays a crucial role in the architecture. As the Companion must remain usable in a variety of environments, it has to be flexible and adaptable to various operating conditions. The Mobile Digital Companion has an unconventional architecture that saves energy by using system decomposition at different levels of the architecture and exploits locality of reference with dedicated, optimised modules. The approach is based on dedicated functionality and the extensive use of energy reduction techniques at all levels of system design. The system has an architecture with a general-purpose processor accompanied by a set of heterogeneous autonomous programmable modules, each providing an energy efficient implementation of dedicated tasks. A reconfigurable internal communication network switch exploits locality of reference and eliminates wasteful data copies
FPGA-Based Processor Acceleration for Image Processing Applications
FPGA-based embedded image processing systems offer considerable computing resources but present programming challenges when compared to software systems. The paper describes an approach based on an FPGA-based soft processor called Image Processing Processor (IPPro) which can operate up to 337 MHz on a high-end Xilinx FPGA family and gives details of the dataflow-based programming environment. The approach is demonstrated for a k-means clustering operation and a traffic sign recognition application, both of which have been prototyped on an Avnet Zedboard that has Xilinx Zynq-7000 system-on-chip (SoC). A number of parallel dataflow mapping options were explored giving a speed-up of 8 times for the k-means clustering using 16 IPPro cores, and a speed-up of 9.6 times for the morphology filter operation of the traffic sign recognition using 16 IPPro cores compared to their equivalent ARM-based software implementations. We show that for k-means clustering, the 16 IPPro cores implementation is 57, 28 and 1.7 times more power efficient (fps/W) than ARM Cortex-A7 CPU, nVIDIA GeForce GTX980 GPU and ARM Mali-T628 embedded GPU respectively
Towards the development of flexible, reliable, reconfigurable, and high-performance imaging systems
Current FPGAs can implement large systems because of the high density of
reconfigurable logic resources in a single chip. FPGAs are comprehensive devices
that combine flexibility and high performance in the same platform compared to
other platform such as General-Purpose Processors (GPPs) and Application Specific
Integrated Circuits (ASICs). The flexibility of modern FPGAs is further enhanced by
introducing Dynamic Partial Reconfiguration (DPR) feature, which allows for
changing the functionality of part of the system while other parts are functioning.
FPGAs became an important platform for digital image processing applications
because of the aforementioned features. They can fulfil the need of efficient and
flexible platforms that execute imaging tasks efficiently as well as the reliably with
low power, high performance and high flexibility. The use of FPGAs as accelerators
for image processing outperforms most of the current solutions. Current FPGA
solutions can to load part of the imaging application that needs high computational
power on dedicated reconfigurable hardware accelerators while other parts are
working on the traditional solution to increase the system performance. Moreover,
the use of the DPR feature enhances the flexibility of image processing further by
swapping accelerators in and out at run-time. The use of fault mitigation techniques
in FPGAs enables imaging applications to operate in harsh environments following
the fact that FPGAs are sensitive to radiation and extreme conditions.
The aim of this thesis is to present a platform for efficient implementations of
imaging tasks. The research uses FPGAs as the key component of this platform and
uses the concept of DPR to increase the performance, flexibility, to reduce the power
dissipation and to expand the cycle of possible imaging applications. In this context,
it proposes the use of FPGAs to accelerate the Image Processing Pipeline (IPP)
stages, the core part of most imaging devices. The thesis has a number of novel
concepts. The first novel concept is the use of FPGA hardware environment and
DPR feature to increase the parallelism and achieve high flexibility. The concept also
increases the performance and reduces the power consumption and area utilisation.
Based on this concept, the following implementations are presented in this thesis: An
implementation of Adams Hamilton Demosaicing algorithm for camera colour
interpolation, which exploits the FPGA parallelism to outperform other equivalents.
In addition, an implementation of Automatic White Balance (AWB), another IPP
stage that employs DPR feature to prove the mentioned novelty aspects. Another
novel concept in this thesis is presented in chapter 6, which uses DPR feature to
develop a novel flexible imaging system that requires less logic and can be
implemented in small FPGAs. The system can be employed as a template for any
imaging application with no limitation. Moreover, discussed in this thesis is a novel
reliable version of the imaging system that adopts novel techniques including
scrubbing, Built-In Self Test (BIST), and Triple Modular Redundancy (TMR) to
detect and correct errors using the Internal Configuration Access Port (ICAP)
primitive. These techniques exploit the datapath-based nature of the implemented
imaging system to improve the system's overall reliability. The thesis presents a
proposal for integrating the imaging system with the Robust Reliable Reconfigurable
Real-Time Heterogeneous Operating System (R4THOS) to get the best out of the
system. The proposal shows the suitability of the proposed DPR imaging system to
be used as part of the core system of autonomous cars because of its unbounded
flexibility. These novel works are presented in a number of publications as shown in section
1.3 later in this thesis
Design and management of image processing pipelines within CPS : Acquired experience towards the end of the FitOptiVis ECSEL Project
Cyber-Physical Systems (CPSs) are dynamic and reactive systems interacting with processes, environment and, sometimes, humans. They are often distributed with sensors and actuators, characterized for being smart, adaptive, predictive and react in real-time. Indeed, image- and video-processing pipelines are a prime source for environmental information for systems allowing them to take better decisions according to what they see. Therefore, in FitOptiVis, we are developing novel methods and tools to integrate complex image- and video-processing pipelines. FitOptiVis aims to deliver a reference architecture for describing and optimizing quality and resource management for imaging and video pipelines in CPSs both at design- and run-time. The architecture is concretized in low-power, high-performance, smart components, and in methods and tools for combined design-time and run-time multi-objective optimization and adaptation within system and environment constraints.Peer reviewe
Dynamically reconfigurable architecture for embedded computer vision systems
The objective of this research work is to design, develop and implement a new architecture which integrates on the same chip all the processing levels of a complete Computer Vision system, so that the execution is efficient without compromising the power consumption while keeping a reduced cost. For this purpose, an analysis and classification of different mathematical operations and algorithms commonly used in Computer Vision are carried out, as well as a in-depth review of the image processing capabilities of current-generation hardware devices. This permits to determine the requirements and the key aspects for an efficient architecture. A representative set of algorithms is employed as benchmark to evaluate the proposed architecture, which is implemented on an FPGA-based system-on-chip. Finally, the prototype is compared to other related approaches in order to determine its advantages and weaknesses
Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)
ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability
Memory hierarchy and data communication in heterogeneous reconfigurable SoCs
The miniaturization race in the hardware industry aiming at continuous increasing
of transistor density on a die does not bring respective application performance
improvements any more. One of the most promising alternatives is to
exploit a heterogeneous nature of common applications in hardware. Supported by
reconfigurable computation, which has already proved its efficiency in accelerating
data intensive applications, this concept promises a breakthrough in contemporary
technology development.
Memory organization in such heterogeneous reconfigurable architectures becomes
very critical. Two primary aspects introduce a sophisticated trade-off. On
the one hand, a memory subsystem should provide well organized distributed data
structure and guarantee the required data bandwidth. On the other hand, it should
hide the heterogeneous hardware structure from the end-user, in order to support
feasible high-level programmability of the system.
This thesis work explores the heterogeneous reconfigurable hardware architectures
and presents possible solutions to cope the problem of memory organization
and data structure. By the example of the MORPHEUS heterogeneous platform,
the discussion follows the complete design cycle, starting from decision making
and justification, until hardware realization. Particular emphasis is made on the
methods to support high system performance, meet application requirements, and
provide a user-friendly programmer interface.
As a result, the research introduces a complete heterogeneous platform enhanced
with a hierarchical memory organization, which copes with its task by
means of separating computation from communication, providing reconfigurable
engines with computation and configuration data, and unification of heterogeneous
computational devices using local storage buffers. It is distinguished from the
related solutions by distributed data-flow organization, specifically engineered
mechanisms to operate with data on local domains, particular communication infrastructure
based on Network-on-Chip, and thorough methods to prevent computation
and communication stalls. In addition, a novel advanced technique to accelerate
memory access was developed and implemented
SYSTEM-ON-A-CHIP (SOC)-BASED HARDWARE ACCELERATION FOR HUMAN ACTION RECOGNITION WITH CORE COMPONENTS
Today, the implementation of machine vision algorithms on embedded platforms or in portable systems is growing rapidly due to the demand for machine vision in daily human life. Among the applications of machine vision, human action and activity recognition has become an active research area, and market demand for providing integrated smart security systems is growing rapidly. Among the available approaches, embedded vision is in the top tier; however, current embedded platforms may not be able to fully exploit the potential performance of machine vision algorithms, especially in terms of low power consumption. Complex algorithms can impose immense computation and communication demands, especially action recognition algorithms, which require various stages of preprocessing, processing and machine learning blocks that need to operate concurrently. The market demands embedded platforms that operate with a power consumption of only a few watts. Attempts have been mad to improve the performance of traditional embedded approaches by adding more powerful processors; this solution may solve the computation problem but increases the power consumption. System-on-a-chip eld-programmable gate arrays (SoC-FPGAs) have emerged as a major architecture approach for improving power eciency while increasing computational performance. In a SoC-FPGA, an embedded processor and an FPGA serving as an accelerator are fabricated in the same die to simultaneously improve power consumption and performance. Still, current SoC-FPGA-based vision implementations either shy away from supporting complex and adaptive vision algorithms or operate at very limited resolutions due to the immense communication and computation demands. The aim of this research is to develop a SoC-based hardware acceleration workflow for the realization of advanced vision algorithms. Hardware acceleration can improve performance for highly complex mathematical calculations or repeated functions. The performance of a SoC system can thus be improved by using hardware acceleration method to accelerate the element that incurs the highest performance overhead. The outcome of this research could be used for the implementation of various vision algorithms, such as face recognition, object detection or object tracking, on embedded platforms. The contributions of SoC-based hardware acceleration for hardware-software codesign platforms include the following: (1) development of frameworks for complex human action recognition in both 2D and 3D; (2) realization of a framework with four main implemented IPs, namely, foreground and background subtraction (foreground probability), human detection, 2D/3D point-of-interest detection and feature extraction, and OS-ELM as a machine learning algorithm for action identication; (3) use of an FPGA-based hardware acceleration method to resolve system bottlenecks and improve system performance; and (4) measurement and analysis of system specications, such as the acceleration factor, power consumption, and resource utilization. Experimental results show that the proposed SoC-based hardware acceleration approach provides better performance in terms of the acceleration factor, resource utilization and power consumption among all recent works. In addition, a comparison of the accuracy of the framework that runs on the proposed embedded platform (SoCFPGA) with the accuracy of other PC-based frameworks shows that the proposed approach outperforms most other approaches
- âŠ