
163 research outputs found

Embedded electronic systems driven by run-time reconfigurable hardware

Author: Fons Lluís Francisco
Publication venue: 'Universitat Rovira I Virgili'
Publication date: 01/01/2012

Abstract: This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology (available through SRAM-based FPGA/SoC devices), with the aim of improving human quality of life. It investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration, so that a given application can be synthesized, by means of hardware/software co-design, as a set of processing tasks multiplexed in time and space. This optimizes the physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and power consumption) in comparison with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated by prototyping several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for industrial exploitation.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tesis Doctorals en Xarxa

Repositori Institucional URV

Dynamically reconfigurable architecture for embedded computer vision systems

Author: Nieto Lareo Alejandro Manuel
Publication date: 01/01/2013

The objective of this research work is to design, develop and implement a new architecture that integrates on a single chip all the processing levels of a complete Computer Vision system, so that execution is efficient without compromising power consumption, while keeping cost low. To this end, the mathematical operations and algorithms commonly used in Computer Vision are analyzed and classified, and the image processing capabilities of current-generation hardware devices are reviewed in depth. This makes it possible to determine the requirements and key aspects of an efficient architecture. A representative set of algorithms is used as a benchmark to evaluate the proposed architecture, which is implemented on an FPGA-based system-on-chip. Finally, the prototype is compared with related approaches in order to determine its advantages and weaknesses.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional da Universidade de Santiago de Compostela

A Kaon Trigger for FOPI : Development and Evaluation of a Trigger System for Strange Particles

Author: Brosch Oliver
Publication date: 01/01/2004

During collisions of heavy ions at high energies at SIS/GSI, nuclear matter can be exposed to high densities and temperatures. The FOPI experiment detects the traces of the charged particles produced in the nuclear reaction. Their analysis, in particular with respect to the strange K mesons, can extend our knowledge of the structure of nuclear matter and of the processes during the evolution of neutron stars and black holes. However, kaons are rarely observed, so the derived physics results suffer from large uncertainties. To significantly enhance the kaon yield, a trigger that prevents the time-consuming readout of the complete detector data for uninteresting events was developed within the scope of this work. For that purpose, an algorithm based on the Hough transform was created. It reconstructs particle tracks from a small fraction of the data of the drift chamber CDC. Geometrical matching to the information from the new high-precision time-of-flight detector GRPC allows the species of the found particles to be determined. Meeting the requirements on data bandwidth and computing intensity calls for special-purpose processors. About 5 to 6 of the FPGA-based MPRACE boards developed at Mannheim University can provide this performance cost-effectively. Consistent parallelization of the individual program steps makes it possible to exploit the full power of MPRACE and thus to reach a processing time of less than 100 µs per event. Detailed simulations of the trigger system show that, in experiments with light nuclei such as nickel at beam energies of 1.93 GeV/u, the K^+ yield can be enhanced by a factor of 6 and the K^- yield by a factor of 11.
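As a rough illustration of the voting idea behind a Hough transform, the following minimal Python sketch finds the dominant straight line in a set of 2D hits. It is not the trigger algorithm of the thesis: the straight-line parametrization rho = x cos(theta) + y sin(theta), the bin sizes, the function names and the hit coordinates are all assumptions chosen for illustration, and real drift-chamber tracks would need a parametrization suited to the detector geometry.

```python
import math
from collections import Counter

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Let every hit vote for all (theta, rho) bins of lines passing through it."""
    acc = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(i, round(rho / rho_step))] += 1
    return acc

def best_line(points, n_theta=180, rho_step=1.0):
    """Return (theta, rho, votes) of the most populated accumulator bin."""
    acc = hough_lines(points, n_theta, rho_step)
    (i, r), votes = acc.most_common(1)[0]
    return math.pi * i / n_theta, r * rho_step, votes

# Hypothetical hits: ten on the line y = x, plus two unrelated noise hits.
hits = [(10.0 * k, 10.0 * k) for k in range(10)] + [(30.0, 75.0), (80.0, 10.0)]
theta, rho, votes = best_line(hits)
print(math.degrees(theta), rho, votes)  # strongest bin: theta = 135 degrees, rho = 0, 10 votes
```

Each hit votes independently of the others, which is the property that makes this kind of accumulator a natural candidate for parallel hardware.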

Heidelberger Dokumentenserver

GSI Repository

Features extraction for low-power face verification

Author: Stadelmann Patrick
Publication venue: 'University of Neuchatel'
Publication date: 19/06/2013

Mobile communication devices now available on the market, such as so-called smartphones, are far more advanced than the first cellular phones that became very popular one decade ago. In addition to their historical purpose, namely enabling wireless vocal communications to be established nearly everywhere, they now provide most of the functionalities offered by computers. As such, they hold an ever-increasing amount of personal information and confidential data. However, the authentication method employed to prevent unauthorized access to the device is still based on the same PIN code mechanism, which is often set to an easy-to-guess combination of digits, or even altogether disabled. Stronger security can be achieved by resorting to biometrics, which verifies the identity of a person based on intrinsic physical or behavioral characteristics. Since most mobile phones are now equipped with an image sensor to provide digital camera functionality, biometric authentication based on the face modality is very interesting as it does not require a dedicated sensor, unlike e.g. fingerprint verification. Its perceived intrusiveness is furthermore very low, and it is generally well accepted by users. The deployment of face verification on mobile devices however requires overcoming two major challenges, which are the main issues addressed in this PhD thesis. Firstly, images acquired by a handheld device in an uncontrolled environment exhibit strong variations in illumination conditions. The extracted features on which biometric identification is based must therefore be robust to such perturbations. Secondly, the amount of energy available on battery-powered mobile devices is tightly constrained, calling for algorithms with low computational complexity, and for highly optimized implementations. 
To reduce the dependency on illumination conditions, a low-complexity normalization technique for feature extraction based on mathematical morphology is introduced in this thesis and evaluated in conjunction with the Elastic Graph Matching (EGM) algorithm. Robustness to other perturbations, such as occlusions or geometric transformations, is also assessed, and several improvements are proposed. To minimize power consumption, the hardware architecture of a coprocessor dedicated to feature extraction is proposed and described in VHDL. This component is designed to be integrated into a System-on-Chip (SoC) implementing the complete face verification process, including image acquisition, thereby enabling biometric face authentication to be performed entirely on the mobile device. Comparison of the proposed solution with state-of-the-art academic results and recently disclosed commercial products shows that the chosen approach is indeed much more energy-efficient.

Infoscience - École polytechnique fédérale de Lausanne

Using reconfigurable computing technology to accelerate matrix decomposition and applications

Author: Wang Xinying
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2016

Matrix decomposition plays an increasingly significant role in many scientific and engineering applications. Among numerous techniques, Singular Value Decomposition (SVD) and Eigenvalue Decomposition (EVD) are widely used as factorization tools to perform Principal Component Analysis for dimensionality reduction and pattern recognition in image processing, text mining and wireless communications, while QR Decomposition (QRD) and sparse LU Decomposition (LUD) are employed to solve dense or sparse linear systems of equations in bioinformatics, power systems and computer vision. Matrix decompositions are computationally expensive, and their sequential implementations often fail to meet the requirements of many time-sensitive applications. The emergence of reconfigurable computing has provided a flexible and low-cost opportunity to pursue high-performance parallel designs, and the use of FPGAs has shown promise in accelerating this class of computation. In this research, we have proposed and implemented several highly parallel FPGA-based architectures to accelerate matrix decompositions and their applications in data mining and signal processing. Specifically, in this dissertation we describe the following contributions:
• We propose an efficient FPGA-based double-precision floating-point architecture for EVD, which can efficiently analyze large-scale matrices.
• We implement a floating-point Hestenes-Jacobi architecture for SVD, which is capable of analyzing arbitrarily sized matrices.
• We introduce a novel deeply pipelined reconfigurable architecture for QRD, which can be dynamically configured to perform either Householder transformation or Givens rotation in a manner that takes advantage of the strengths of each.
• We design a configurable architecture for sparse LUD that supports both symmetric and asymmetric sparse matrices with arbitrary sparsity patterns.
• By further extending the proposed hardware solution for SVD, we parallelize a popular text mining tool, Latent Semantic Indexing, with an FPGA-based architecture.
• We present a configurable architecture to accelerate Homotopy l1-minimization, in which a modification of the proposed FPGA architecture for sparse LUD is used at its core to parallelize both Cholesky decomposition and rank-1 update.
Our experimental results using an FPGA-based acceleration system indicate the efficiency of our proposed novel architectures, with application- and dimension-dependent speedups over an optimized software implementation that range from 1.5× to 43.6× in terms of computation time.
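For readers unfamiliar with the Givens rotation mentioned among the QRD contributions, the following minimal software sketch shows how a sequence of plane rotations reduces a matrix to upper-triangular form. It is plain sequential Python for illustration only; it says nothing about the pipelined FPGA architecture of the dissertation, and the function name and example matrix are assumptions.

```python
import math

def givens_qr(a):
    """Factor a (a list of rows) into q and r with a = q r, using Givens rotations."""
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Zero the entries below the diagonal of column j, from the bottom up.
        for i in range(m - 1, j, -1):
            if r[i][j] == 0.0:
                continue
            # Rotation in the (i-1, i) plane chosen so that r[i][j] becomes zero.
            h = math.hypot(r[i - 1][j], r[i][j])
            c, s = r[i - 1][j] / h, r[i][j] / h
            for k in range(n):  # rotate rows i-1 and i of r
                top, bot = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * top + s * bot
                r[i][k] = -s * top + c * bot
            for k in range(m):  # accumulate the transposed rotation into q
                left, right = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * left + s * right
                q[k][i] = -s * left + c * right
    return q, r

# Hypothetical 3x3 example matrix.
a = [[6.0, 5.0, 0.0], [5.0, 1.0, 4.0], [0.0, 4.0, 3.0]]
q, r = givens_qr(a)
```

Each rotation touches only two rows, so rotations acting on disjoint row pairs are independent of one another, which is one reason the method is often considered for parallel and pipelined implementations.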

Digital Repository @ Iowa State University (ISU)

Acceleration Techniques for Sparse Recovery Based Plane-wave Decomposition of a Sound Field

Author: Samarawickrama Mahendra
Publication venue: Faculty of Engineering and Information Technologies, School of Electrical and Information Engineering
Publication date: 28/02/2017

Plane-wave decomposition by sparse recovery is a reliable and accurate technique that can be used for source localization, beamforming, etc. In this work, we introduce techniques to accelerate it. The method consists of two main algorithms: the spherical Fourier transformation (SFT) and sparse recovery. Of the two, sparse recovery is the more computationally intensive. We therefore implement the SFT on an FPGA and the sparse recovery on a multithreaded computing platform, so that the multithreaded platform can be fully utilized for the sparse recovery. Implementing the SFT on an FPGA also helps to integrate the microphones flexibly and improves the portability of the microphone array. To implement the SFT on an FPGA, we develop a scalable FPGA design model that enables the quick design of the SFT architecture. The model takes into account the number of microphones, the number of SFT channels and the cost of the FPGA, and outputs the design of a resource-optimized and cost-effective FPGA architecture. We then investigate the performance of the sparse recovery algorithm executed on various multithreaded computing platforms (i.e., chip-multiprocessor, multiprocessor, GPU, manycore). Finally, we investigate how modifying the dictionary size influences the computational performance and the accuracy of the sparse recovery algorithms. We introduce novel sparse-recovery techniques that use non-uniform dictionaries to improve the performance of the sparse recovery on a parallel architecture.

Sydney eScholarship