346 research outputs found

    Study of optical techniques for the Ames unitary wind tunnel: Digital image processing, part 6

    A survey of digital image processing techniques and processing systems for aerodynamic images has been conducted. These images covered many types of flows and were generated by many types of flow diagnostics, including laser vapor screens, infrared cameras, laser holographic interferometry, Schlieren, and luminescent paints. Some general digital image processing systems, imaging networks, optical sensors, and image computing chips were briefly reviewed. Possible digital imaging network systems for the Ames Unitary Wind Tunnel were explored.

    Generic programming methods for the real time implementation of a MRF based motion detection algorithm on a multi-processor DSP with multidimensional DMA

    This paper addresses a twofold problem. First, we highlight the need for generic programming methods for the real-time (RT) implementation of complex low-level image processing algorithms on parallel DSP architectures built on multiprocessors that exploit instruction-level parallelism and multidimensional DMA. Second, we introduce the need for an RT implementation of a motion detection algorithm on architectures compatible with low-cost embedded systems. To meet these needs, we show how a DMA-based methodology for managing synchronous data flows, designed to be dynamic and generic with respect to processing configurations (depending on the nature of the processing chains, the image size, and the number of processors involved), can be used to implement a Markovian motion detection method on the advanced parallel architecture of the TMS320C80. This case study demonstrates the suitability of our method and yields a speedup factor of 4 over previously published processing times for the targeted algorithm. Moreover, we estimate that RT processing is possible on 256 × 256 images with an optimal C80 system.

    Machine Learning for Microcontroller-Class Hardware -- A Review

    Advances in machine learning have opened a new opportunity to bring intelligence to low-end Internet-of-Things nodes such as microcontrollers. Conventional machine learning deployment has a high memory and compute footprint, which hinders direct deployment on ultra-resource-constrained microcontrollers. This paper highlights the unique requirements of enabling onboard machine learning for microcontroller-class devices. Researchers use a specialized model development workflow for resource-limited applications to ensure the compute and latency budget is within the device limits while still maintaining the desired performance. We characterize a closed-loop, widely applicable workflow of machine learning model development for microcontroller-class devices and show that several classes of applications adopt a specific instance of it. We present both qualitative and numerical insights into different stages of model development by showcasing several use cases. Finally, we identify the open research challenges and unsolved questions demanding careful consideration moving forward. Comment: Accepted for publication at IEEE Sensors Journal

    Evolvable hardware system for automatic optical inspection

    Dynamically reconfigurable architecture for embedded computer vision systems

    The objective of this research work is to design, develop and implement a new architecture which integrates on the same chip all the processing levels of a complete Computer Vision system, so that execution is efficient without compromising power consumption while keeping cost low. For this purpose, an analysis and classification of the mathematical operations and algorithms commonly used in Computer Vision are carried out, as well as an in-depth review of the image processing capabilities of current-generation hardware devices. This makes it possible to determine the requirements and the key aspects of an efficient architecture. A representative set of algorithms is employed as a benchmark to evaluate the proposed architecture, which is implemented on an FPGA-based system-on-chip. Finally, the prototype is compared to other related approaches in order to determine its advantages and weaknesses.

    Computer vision algorithms on reconfigurable logic arrays

    On the design of multimedia architectures : proceedings of a one-day workshop, Eindhoven, December 18, 2003

    Hardware and software optimization of fourier transform infrared spectrometry on hybrid-FPGAs

    With the increasing complexity of today’s spacecraft, there is a concern that the on-board flight computer may be overburdened with various processing tasks. Currently available processors used by NASA are struggling to meet the requirements of scientific experiments [1, 2]. A new computational platform will soon be needed to contend with the increasing demands of future space missions. Recently developed hybrid field-programmable gate arrays (FPGA) offer the versatility of running diverse software applications on embedded processors while at the same time taking advantage of reconfigurable hardware resources, all in the same chip package. These tightly coupled HW/SW systems consume less power than general-purpose single-board computers (SBC) and promise breakthrough performance previously impossible with traditional processors and reconfigurable devices. This thesis takes an existing floating-point intensive data processing algorithm, used for on-board spacecraft Fourier transform infrared (FTIR) spectrometry, ports it to the embedded PowerPC 405 (PPC405) processor, and evaluates system performance after applying different hardware and software optimizations and architectural configurations of the hybrid-FPGA. The hardware optimizations include Xilinx’s floating-point unit (FPU) for efficient single-precision floating-point calculations and a dedicated single-precision dot-product co-processor assembled from basic floating-point operator cores. The software optimizations include utilizing a non-ANSI single-precision math library as well as IBM’s PowerPC performance libraries recompiled for double-precision arithmetic only. The outcome of this thesis is a fully functional, optimized FTIR spectrometry algorithm implemented on a hybrid-FPGA. The computational and power performance of this system is evaluated and compared to a general-purpose SBC currently used for spacecraft data processing. Suggestions for future work, including a dual-processor concept, are given.

    TinyVers: A Tiny Versatile System-on-chip with State-Retentive eMRAM for ML Inference at the Extreme Edge

    Extreme edge devices or Internet-of-Things nodes require both ultra-low-power always-on processing and the ability to do on-demand sampling and processing. Moreover, support for IoT applications like voice recognition, machine monitoring, etc., requires the ability to execute a wide range of ML workloads. This brings challenges in hardware design to build flexible processors operating in the ultra-low-power regime. This paper presents TinyVers, a tiny versatile ultra-low-power ML system-on-chip to enable enhanced intelligence at the Extreme Edge. TinyVers exploits dataflow reconfiguration to enable multi-modal support and aggressive on-chip power management for duty-cycling to enable smart sensing applications. The SoC combines a RISC-V host processor, a 17 TOPS/W dataflow reconfigurable ML accelerator, a 1.7 μW deep sleep wake-up controller, and an eMRAM for boot code and ML parameter retention. The SoC can perform up to 17.6 GOPS while achieving a power consumption range from 1.7 μW to 20 mW. Multiple ML workloads aimed at diverse applications are mapped on the SoC to showcase its flexibility and efficiency. All the models achieve 1-2 TOPS/W of energy efficiency with power consumption below 230 μW in continuous operation. In a duty-cycling use case for machine monitoring, this power is reduced to below 10 μW. Comment: Accepted in IEEE Journal of Solid-State Circuits