26 research outputs found

    Design and management of image processing pipelines within CPS : Acquired experience towards the end of the FitOptiVis ECSEL Project

    Get PDF
    Cyber-Physical Systems (CPSs) are dynamic and reactive systems interacting with processes, environment and, sometimes, humans. They are often distributed with sensors and actuators, characterized for being smart, adaptive, predictive and react in real-time. Indeed, image- and video-processing pipelines are a prime source for environmental information for systems allowing them to take better decisions according to what they see. Therefore, in FitOptiVis, we are developing novel methods and tools to integrate complex image- and video-processing pipelines. FitOptiVis aims to deliver a reference architecture for describing and optimizing quality and resource management for imaging and video pipelines in CPSs both at design- and run-time. The architecture is concretized in low-power, high-performance, smart components, and in methods and tools for combined design-time and run-time multi-objective optimization and adaptation within system and environment constraints.Peer reviewe

    Optimization Techniques for Parallel Programming of Embedded Many-Core Computing Platforms

    Get PDF
    Nowadays many-core computing platforms are widely adopted as a viable solution to accelerate compute-intensive workloads at different scales, from low-cost devices to HPC nodes. It is well established that heterogeneous platforms including a general-purpose host processor and a parallel programmable accelerator have the potential to dramatically increase the peak performance/Watt of computing architectures. However the adoption of these platforms further complicates application development, whereas it is widely acknowledged that software development is a critical activity for the platform design. The introduction of parallel architectures raises the need for programming paradigms capable of effectively leveraging an increasing number of processors, from two to thousands. In this scenario the study of optimization techniques to program parallel accelerators is paramount for two main objectives: first, improving performance and energy efficiency of the platform, which are key metrics for both embedded and HPC systems; second, enforcing software engineering practices with the aim to guarantee code quality and reduce software costs. This thesis presents a set of techniques that have been studied and designed to achieve these objectives overcoming the current state-of-the-art. As a first contribution, we discuss the use of OpenMP tasking as a general-purpose programming model to support the execution of diverse workloads, and we introduce a set of runtime-level techniques to support fine-grain tasks on high-end many-core accelerators (devices with a power consumption greater than 10W). Then we focus our attention on embedded computer vision (CV), with the aim to show how to achieve best performance by exploiting the characteristics of a specific application domain. To further reduce the power consumption of parallel accelerators beyond the current technological limits, we describe an approach based on the principles of approximate computing, which implies modification to the program semantics and proper hardware support at the architectural level

    Reconfigurable Computing Systems for Robotics using a Component-Oriented Approach

    Get PDF
    Robotic platforms are becoming more complex due to the wide range of modern applications, including multiple heterogeneous sensors and actuators. In order to comply with real-time and power-consumption constraints, these systems need to process a large amount of heterogeneous data from multiple sensors and take action (via actuators), which represents a problem as the resources of these systems have limitations in memory storage, bandwidth, and computational power. Field Programmable Gate Arrays (FPGAs) are programmable logic devices that offer high-speed parallel processing. FPGAs are particularly well-suited for applications that require real-time processing, high bandwidth, and low latency. One of the fundamental advantages of FPGAs is their flexibility in designing hardware tailored to specific needs, making them adaptable to a wide range of applications. They can be programmed to pre-process data close to sensors, which reduces the amount of data that needs to be transferred to other computing resources, improving overall system efficiency. Additionally, the reprogrammability of FPGAs enables them to be repurposed for different applications, providing a cost-effective solution that needs to adapt quickly to changing demands. FPGAs' performance per watt is close to that of Application-Specific Integrated Circuits (ASICs), with the added advantage of being reprogrammable. Despite all the advantages of FPGAs (e.g., energy efficiency, computing capabilities), the robotics community has not fully included them so far as part of their systems for several reasons. First, designing FPGA-based solutions requires hardware knowledge and longer development times as their programmability is more challenging than Central Processing Units (CPUs) or Graphics Processing Units (GPUs). Second, porting a robotics application (or parts of it) from software to an accelerator requires adequate interfaces between software and FPGAs. Third, the robotics workflow is already complex on its own, combining several fields such as mechanics, electronics, and software. There have been partial contributions in the state-of-the-art for FPGAs as part of robotics systems. However, a study of FPGAs as a whole for robotics systems is missing in the literature, which is the primary goal of this dissertation. Three main objectives have been established to accomplish this. (1) Define all components required for an FPGAs-based system for robotics applications as a whole. (2) Establish how all the defined components are related. (3) With the help of Model-Driven Engineering (MDE) techniques, generate these components, deploy them, and integrate them into existing solutions. The component-oriented approach proposed in this dissertation provides a proper solution for designing and implementing FPGA-based designs for robotics applications. The modular architecture, the tool 'FPGA Interfaces for Robotics Middlewares' (FIRM), and the toolchain 'FPGA Architectures for Robotics' (FAR) provide a set of tools and a comprehensive design process that enables the development of complex FPGA-based designs more straightforwardly and efficiently. The component-oriented approach contributed to the state-of-the-art in FPGA-based designs significantly for robotics applications and helps to promote their wider adoption and use by specialists with little FPGA knowledge

    Heterogeneous Architectures For Parallel Acceleration

    Get PDF
    To enable a new generation of digital computing applications, the greatest challenge is to provide a better level of energy efficiency (intended as the performance that a system can provide within a certain power budget) without giving up a systems's flexibility. This constraint applies to digital system across all scales, starting from ultra-low power implanted devices up to datacenters for high-performance computing and for the "cloud". In this thesis, we show that architectural heterogeneity is the key to provide this efficiency and to respond to many of the challenges of tomorrow's computer architecture - and at the same time we show methodologies to introduce it with little or no loss in terms of flexibility. In particular, we show that heterogeneity can be employed to tackle the "walls" that impede further development of new computing applications: the utilization wall, i.e. the impossibility to keep all transistors on in deeply integrated chips, and the "data deluge", i.e. the amount of data to be processed that is scaling up much faster than the computing performance and efficiency. We introduce a methodology to improve heterogeneous design exploration of tightly coupled clusters; moreover we propose a fractal heterogeneity architecture that is a parallel accelerator for low-power sensor nodes, and is itself internally heterogeneous thanks to an heterogeneous coprocessor for brain-inspired computing. This platform, which is silicon-proven, can lead to more than 100x improvement in terms of energy efficiency with respect to typical computing nodes used within the same domain, enabling the application of complex algorithms, vastly more performance-hungry than the current state-of-the-art in the ULP computing domain

    Methoden und Beschreibungssprachen zur Modellierung und Verifikation vonSchaltungen und Systemen: MBMV 2015 - Tagungsband, Chemnitz, 03. - 04. März 2015

    Get PDF
    Der Workshop Methoden und Beschreibungssprachen zur Modellierung und Verifikation von Schaltungen und Systemen (MBMV 2015) findet nun schon zum 18. mal statt. Ausrichter sind in diesem Jahr die Professur Schaltkreis- und Systementwurf der Technischen Universität Chemnitz und das Steinbeis-Forschungszentrum Systementwurf und Test. Der Workshop hat es sich zum Ziel gesetzt, neueste Trends, Ergebnisse und aktuelle Probleme auf dem Gebiet der Methoden zur Modellierung und Verifikation sowie der Beschreibungssprachen digitaler, analoger und Mixed-Signal-Schaltungen zu diskutieren. Er soll somit ein Forum zum Ideenaustausch sein. Weiterhin bietet der Workshop eine Plattform für den Austausch zwischen Forschung und Industrie sowie zur Pflege bestehender und zur Knüpfung neuer Kontakte. Jungen Wissenschaftlern erlaubt er, ihre Ideen und Ansätze einem breiten Publikum aus Wissenschaft und Wirtschaft zu präsentieren und im Rahmen der Veranstaltung auch fundiert zu diskutieren. Sein langjähriges Bestehen hat ihn zu einer festen Größe in vielen Veranstaltungskalendern gemacht. Traditionell sind auch die Treffen der ITGFachgruppen an den Workshop angegliedert. In diesem Jahr nutzen zwei im Rahmen der InnoProfile-Transfer-Initiative durch das Bundesministerium für Bildung und Forschung geförderte Projekte den Workshop, um in zwei eigenen Tracks ihre Forschungsergebnisse einem breiten Publikum zu präsentieren. Vertreter der Projekte Generische Plattform für Systemzuverlässigkeit und Verifikation (GPZV) und GINKO - Generische Infrastruktur zur nahtlosen energetischen Kopplung von Elektrofahrzeugen stellen Teile ihrer gegenwärtigen Arbeiten vor. Dies bereichert denWorkshop durch zusätzliche Themenschwerpunkte und bietet eine wertvolle Ergänzung zu den Beiträgen der Autoren. [... aus dem Vorwort

    GPUを用いた高解像度画像の実時間ステレオマッチングシステムの研究

    Get PDF
    筑波大学 (University of Tsukuba)201

    A Practical Hardware Implementation of Systemic Computation

    Get PDF
    It is widely accepted that natural computation, such as brain computation, is far superior to typical computational approaches addressing tasks such as learning and parallel processing. As conventional silicon-based technologies are about to reach their physical limits, researchers have drawn inspiration from nature to found new computational paradigms. Such a newly-conceived paradigm is Systemic Computation (SC). SC is a bio-inspired model of computation. It incorporates natural characteristics and defines a massively parallel non-von Neumann computer architecture that can model natural systems efficiently. This thesis investigates the viability and utility of a Systemic Computation hardware implementation, since prior software-based approaches have proved inadequate in terms of performance and flexibility. This is achieved by addressing three main research challenges regarding the level of support for the natural properties of SC, the design of its implied architecture and methods to make the implementation practical and efficient. Various hardware-based approaches to Natural Computation are reviewed and their compatibility and suitability, with respect to the SC paradigm, is investigated. FPGAs are identified as the most appropriate implementation platform through critical evaluation and the first prototype Hardware Architecture of Systemic computation (HAoS) is presented. HAoS is a novel custom digital design, which takes advantage of the inbuilt parallelism of an FPGA and the highly efficient matching capability of a Ternary Content Addressable Memory. It provides basic processing capabilities in order to minimize time-demanding data transfers, while the optional use of a CPU provides high-level processing support. It is optimized and extended to a practical hardware platform accompanied by a software framework to provide an efficient SC programming solution. The suggested platform is evaluated using three bio-inspired models and analysis shows that it satisfies the research challenges and provides an effective solution in terms of efficiency versus flexibility trade-off

    Ubiquitous volume rendering in the web platform

    Get PDF
    176 p.The main thesis hypothesis is that ubiquitous volume rendering can be achieved using WebGL. The thesis enumerates the challenges that should be met to achieve that goal. The results allow web content developers the integration of interactive volume rendering within standard HTML5 web pages. Content developers only need to declare the X3D nodes that provide the rendering characteristics they desire. In contrast to the systems that provide specific GPU programs, the presented architecture creates automatically the GPU code required by the WebGL graphics pipeline. This code is generated directly from the X3D nodes declared in the virtual scene. Therefore, content developers do not need to know about the GPU.The thesis extends previous research on web compatible volume data structures for WebGL, ray-casting hybrid surface and volumetric rendering, progressive volume rendering and some specific problems related to the visualization of medical datasets. Finally, the thesis contributes to the X3D standard with some proposals to extend and improve the volume rendering component. The proposals are in an advance stage towards their acceptance by the Web3D Consortium

    Ubiquitous volume rendering in the web platform

    Get PDF
    176 p.The main thesis hypothesis is that ubiquitous volume rendering can be achieved using WebGL. The thesis enumerates the challenges that should be met to achieve that goal. The results allow web content developers the integration of interactive volume rendering within standard HTML5 web pages. Content developers only need to declare the X3D nodes that provide the rendering characteristics they desire. In contrast to the systems that provide specific GPU programs, the presented architecture creates automatically the GPU code required by the WebGL graphics pipeline. This code is generated directly from the X3D nodes declared in the virtual scene. Therefore, content developers do not need to know about the GPU.The thesis extends previous research on web compatible volume data structures for WebGL, ray-casting hybrid surface and volumetric rendering, progressive volume rendering and some specific problems related to the visualization of medical datasets. Finally, the thesis contributes to the X3D standard with some proposals to extend and improve the volume rendering component. The proposals are in an advance stage towards their acceptance by the Web3D Consortium

    3rd Many-core Applications Research Community (MARC) Symposium. (KIT Scientific Reports ; 7598)

    Get PDF
    This manuscript includes recent scientific work regarding the Intel Single Chip Cloud computer and describes approaches for novel approaches for programming and run-time organization
    corecore