836 research outputs found

    HPC Platform for Railway Safety-Critical Functionalities Based on Artificial Intelligence

    Get PDF
    The automation of railroad operations is a rapidly growing industry. In 2023, a new European standard for the automated Grade of Automation (GoA) 2 over European Train Control System (ETCS) driving is anticipated. Meanwhile, railway stakeholders are already planning their research initiatives for driverless and unattended autonomous driving systems. As a result, the industry is particularly active in research regarding perception technologies based on Computer Vision (CV) and Artificial Intelligence (AI), with outstanding results at the application level. However, executing high-performance and safety-critical applications on embedded systems and in real-time is a challenge. There are not many commercially available solutions, since High-Performance Computing (HPC) platforms are typically seen as being beyond the business of safety-critical systems. This work proposes a novel safety-critical and high-performance computing platform for CV- and AI-enhanced technology execution used for automatic accurate stopping and safe passenger transfer railway functionalities. The resulting computing platform is compatible with the majority of widely-used AI inference methodologies, AI model architectures, and AI model formats thanks to its design, which enables process separation, redundant execution, and HW acceleration in a transparent manner. The proposed technology increases the portability of railway applications into embedded systems, isolates crucial operations, and effectively and securely maintains system resources.The novel approach presented in this work is being developed as a specific railway use case for autonomous train operation into SELENE European research project. This project has received funding from RIA—Research and Innovation action under grant agreement No. 871467

    Towards a Common Software/Hardware Methodology for Future Advanced Driver Assistance Systems

    Get PDF
    The European research project DESERVE (DEvelopment platform for Safe and Efficient dRiVE, 2012-2015) had the aim of designing and developing a platform tool to cope with the continuously increasing complexity and the simultaneous need to reduce cost for future embedded Advanced Driver Assistance Systems (ADAS). For this purpose, the DESERVE platform profits from cross-domain software reuse, standardization of automotive software component interfaces, and easy but safety-compliant integration of heterogeneous modules. This enables the development of a new generation of ADAS applications, which challengingly combine different functions, sensors, actuators, hardware platforms, and Human Machine Interfaces (HMI). This book presents the different results of the DESERVE project concerning the ADAS development platform, test case functions, and validation and evaluation of different approaches. The reader is invited to substantiate the content of this book with the deliverables published during the DESERVE project. Technical topics discussed in this book include:Modern ADAS development platforms;Design space exploration;Driving modelling;Video-based and Radar-based ADAS functions;HMI for ADAS;Vehicle-hardware-in-the-loop validation system

    Efficient and accurate stereo matching for cloth manipulation

    Get PDF
    Due to the recent development of robotic techniques, researching robots that can assist in everyday household tasks, especially robotic cloth manipulation has become popular in recent years. Stereo matching forms a crucial part of the robotic vision and aims to derive depth information from image pairs captured by the stereo cameras. Although stereo robotic vision is widely adopted for cloth manipulation robots in the research community, this remains a challenging research task. Robotic vision requires very accurate depth output in a relatively short timespan in order to successfully perform cloth manipulation in real-time. In this thesis, we mainly aim to develop a robotic stereo matching based vision system that is both efficient and effective for the task of robotic cloth manipulation. Effectiveness refers to the accuracy of the depth map generated from the stereo matching algorithms for the robot to grasp the required details to achieve the given task on cloth materials while efficiency emphasizes the required time for the stereo matching to process the images. With respect to efficiency, firstly, by exploring a variety of different hardware architectures such as multi-core CPU and graphic processors (GPU) to accelerate stereo matching, we demonstrate that the parallelised stereo-matching algorithm can be significantly accelerated, achieving 12X and 176X speed-ups respectively for multi-core CPU and GPU, compared with SISD (Single Instruction, Single Data) single-thread CPU. In terms of effectiveness, due to the fact that there are no cloth based testbeds with depth map ground-truths for evaluating the accuracy of stereo matching performance in this context, we created five different testbeds to facilitate evaluation of stereo matching in the context of cloth manipulation. In addition, we adapted a guided filtering algorithm into a pyramidical stereo matching framework that works directly for unrectified images, and evaluate its accuracy utilizing the created cloth testbeds. We demonstrate that our proposed approach is not only efficient, but also accurate and suits well to the characteristics of the task of cloth manipulations. This also shows that rather than relying on image rectification, directly applying stereo matching to unrectified images is effective and efficient. Finally, we further explore whether we can improve efficiency while maintaining reasonable accuracy for robotic cloth manipulations (i.e.~trading off accuracy for efficiency). We use a foveated matching algorithm, inspired by biological vision systems, and found that it is effective in trading off accuracy for efficiency, achieving almost the same level of accuracy for both cloth grasping and flattening tasks with two to three fold acceleration. We also demonstrate that with the robot we can use machine learning techniques to predict the optimal foveation level in order to accomplish the robotic cloth manipulation tasks successfully and much more efficiently. To summarize, in this thesis, we extensively study stereo matching, contributing to the long-term goal of developing effective ways for efficient whilst accurate robotic stereo matching for cloth manipulation

    Towards a Common Software/Hardware Methodology for Future Advanced Driver Assistance Systems

    Get PDF
    The European research project DESERVE (DEvelopment platform for Safe and Efficient dRiVE, 2012-2015) had the aim of designing and developing a platform tool to cope with the continuously increasing complexity and the simultaneous need to reduce cost for future embedded Advanced Driver Assistance Systems (ADAS). For this purpose, the DESERVE platform profits from cross-domain software reuse, standardization of automotive software component interfaces, and easy but safety-compliant integration of heterogeneous modules. This enables the development of a new generation of ADAS applications, which challengingly combine different functions, sensors, actuators, hardware platforms, and Human Machine Interfaces (HMI). This book presents the different results of the DESERVE project concerning the ADAS development platform, test case functions, and validation and evaluation of different approaches. The reader is invited to substantiate the content of this book with the deliverables published during the DESERVE project. Technical topics discussed in this book include:Modern ADAS development platforms;Design space exploration;Driving modelling;Video-based and Radar-based ADAS functions;HMI for ADAS;Vehicle-hardware-in-the-loop validation system

    Unifying terrain awareness for the visually impaired through real-time semantic segmentation.

    Get PDF
    Navigational assistance aims to help visually-impaired people to ambulate the environment safely and independently. This topic becomes challenging as it requires detecting a wide variety of scenes to provide higher level assistive awareness. Vision-based technologies with monocular detectors or depth sensors have sprung up within several years of research. These separate approaches have achieved remarkable results with relatively low processing time and have improved the mobility of impaired people to a large extent. However, running all detectors jointly increases the latency and burdens the computational resources. In this paper, we put forward seizing pixel-wise semantic segmentation to cover navigation-related perception needs in a unified way. This is critical not only for the terrain awareness regarding traversable areas, sidewalks, stairs and water hazards, but also for the avoidance of short-range obstacles, fast-approaching pedestrians and vehicles. The core of our unification proposal is a deep architecture, aimed at attaining efficient semantic understanding. We have integrated the approach in a wearable navigation system by incorporating robust depth segmentation. A comprehensive set of experiments prove the qualified accuracy over state-of-the-art methods while maintaining real-time speed. We also present a closed-loop field test involving real visually-impaired users, demonstrating the effectivity and versatility of the assistive framework

    Learning from minimally labeled data with accelerated convolutional neural networks

    Get PDF
    The main objective of an Artificial Vision Algorithm is to design a mapping function that takes an image as an input and correctly classifies it into one of the user-determined categories. There are several important properties to be satisfied by the mapping function for visual understanding. First, the function should produce good representations of the visual world, which will be able to recognize images independently of pose, scale and illumination. Furthermore, the designed artificial vision system has to learn these representations by itself. Recent studies on Convolutional Neural Networks (ConvNets) produced promising advancements in visual understanding. These networks attain significant performance upgrades by relying on hierarchical structures inspired by biological vision systems. In my research, I work mainly in two areas: 1) how ConvNets can be programmed to learn the optimal mapping function using the minimum amount of labeled data, and 2) how these networks can be accelerated for practical purposes. In this work, algorithms that learn from unlabeled data are studied. A new framework that exploits unlabeled data is proposed. The proposed framework obtains state-of-the-art performance results in different tasks. Furthermore, this study presents an optimized streaming method for ConvNets’ hardware accelerator on an embedded platform. It is tested on object classification and detection applications using ConvNets. Experimental results indicate high computational efficiency, and significant performance upgrades over all other existing platforms

    Using graphics devices in reverse: GPU-based Image Processing and Computer Vision

    Full text link

    Selected Papers from IEEE ICASI 2019

    Get PDF
    The 5th IEEE International Conference on Applied System Innovation 2019 (IEEE ICASI 2019, https://2019.icasi-conf.net/), which was held in Fukuoka, Japan, on 11–15 April, 2019, provided a unified communication platform for a wide range of topics. This Special Issue entitled “Selected Papers from IEEE ICASI 2019” collected nine excellent papers presented on the applied sciences topic during the conference. Mechanical engineering and design innovations are academic and practical engineering fields that involve systematic technological materialization through scientific principles and engineering designs. Technological innovation by mechanical engineering includes information technology (IT)-based intelligent mechanical systems, mechanics and design innovations, and applied materials in nanoscience and nanotechnology. These new technologies that implant intelligence in machine systems represent an interdisciplinary area that combines conventional mechanical technology and new IT. The main goal of this Special Issue is to provide new scientific knowledge relevant to IT-based intelligent mechanical systems, mechanics and design innovations, and applied materials in nanoscience and nanotechnology

    About the development of visual search algorithms and their hardware implementations

    Get PDF
    2015 - 2016The main goal of my work is to exploit the benefits of a hardware implementation of a 3D visual search pipeline. The term visual search refers to the task of searching objects in the environment starting from the real world representation. Object recognition today is mainly based on scene descriptors, an unique description for special spots in the data structure. This task has been implemented traditionally for years using just plain images: an image descriptor is a feature vector used to describe a position in the images. Matching descriptors present in different viewing of the same scene should allows the same spot to be found from different angles, therefore a good descriptor should be robust with respect to changes in: scene luminosity, camera affine transformations (rotation, scale and translation), camera noise and object affine transformations. Clearly, by using 2D images it is not possible to be robust with respect to the change in the projective space, e.g. if the object is rotated with respect to the up camera axes its 2D projection will dramatically change. For this reason, alongside 2D descriptors, many techniques have been proposed to solve the projective transformation problem using 3D descriptors that allow to map the shape of the objects and consequently the surface real appearance. This category of descriptors relies on 3D Point Cloud and Disparity Map to build a reliable feature vector which is invariant to the projective transformation. More sophisticated techniques are needed to obtain the 3D representation of the scene and, if necessary, the texture of the 3D model and obviously these techniques are also more computationally intensive than the simple image capture. The field of 3D model acquisition is very broad, it is possible to distinguish between two main categories: active and passive methods. In the active methods category we can find special devices able to obtain 3D information projecting special light and. Generally an infrared projector is coupled with a camera: while the infrared light projects a well known and fixed pattern, the camera will receive the information of the patterns reflection on a certain surface and the distortion in the pattern will give the precise depth of every point in the scene. These kind of sensors are of i i “output” — 2017/6/22 — 18:23 — page 3 — #3 i i i i i i 3 course expensive and not very efficient from the power consumption point of view, since a lot of power is wasted projecting light and the use of lasers also imposes eye safety rules on frame rate and transmissed power. Another way to obtain 3D models is to use passive stereo vision techniques, where two (or more) cameras are required which only acquire the scene appearance. Using the two (or more) images as input for a stereo matching algorithm it is possible to reconstruct the 3D world. Since more computational resources will be needed for this task, hardware acceleration can give an impressive performance boost over pure software approach. In this work I will explore the principal steps of a visual search pipeline composed by a 3D vision and a 3D description system. Both systems will take advantage of a parallelized architecture prototyped in RTL and implemented on an FPGA platform. This is a huge research field and in this work I will try to explain the reason for all the choices I made for my implementation, e.g. chosen algorithms, applied heuristics to accelerate the performance and selected device. In the first chapter we explain the Visual Search issues, showing the main components required by a Visual Search pipeline. Then I show the implemented architecture for a stereo vision system based on a Bio-informatics inspired approach, where the final system can process up to 30fps at 1024 × 768 pixels. After that a clever method for boosting the performance of 3D descriptor is presented and as last chapter the final architecture for the SHOT descriptor on FPGA will be presented. [edited by author]L’obiettivo principale di questo lavoro e’ quello di esplorare i benefici di una implementazione hardware per una pipeline di visual search 3D. Il termine visual search si riferisce al problema di ricerca di oggetti nell’ambiente. L’object recognition ai giorni nostri e’ principalmente basato sull’uso di descrittori della scena, una descrizione univoca per i punti salienti. Questo compito e’ stato implementato per anni utilizzando immagini: il descrittore di un punto dell’immagine e’ un semplice vettore di caratteristiche. Accoppiando i descrittori presenti in differenti viste della stessa scena permette di trovare punti nello spazio visibili da entrambe le viste. Chiaramente, utilizzando immagini 2D non e’ possibile avere descrittori che sono robusti a cambiamenti della prospettiva, per questo motivo, molte tecniche sono state proposte per risolvere questo problema utilizzando descrittori 3D. Questa categoria di descrittori si avvale di 3D point cloud e mappe di disparita’. Ovviamente tecniche piu’ sofisticate sono necessarie per ottenere la rappresentazione 3D della scena. Il campo dell’acquisizione 3D e’ molto vasto ed e’ possibile distinguere tra due categorie di sensori: sensori attivi e passivi. Tra i sensori attivi possiamo annoverare dispositivi in grado di proiettare un pattern di luce infrarossa sulla scena, questo pattern noto presenta delle variazioni dovute agli oggetti presenti nella scena. Una camera infrarossi riceve l’immagine distorta del pattern e deduce la geometria della scena. Questo tipo di dispositivi non sono molto efficienti dal punto di vista energetico dato che un sacco di corrente viene consumata per proiettare il pattern. Un altro modo per ottenere un modello 3D e’ quello di usare sensori passivi, una coppia di telecamere puo’ essere utilizzata per ottenere informazioni utilizzando metodi di triangolazione. Questi metodi pero’ richiedono un sacco di potenza computazionale nel caso di applicazioni real time, per questo motivo e’ necessario utilizzare dispositivi ad-hoc quali architetture hardware dedicate implementate mediante l’uso di FPGA e ASIC. In questo lavoro ho esplorato gli step principali di una pipeline per la visual search composta da un sistema di visione 3D e uno per la descrizione di punti. Entrambi i sistemi si avvalgono di achitetture hardware dedicate prototipate in RTL e implementate su FPGA. Questo e’ un grosso campo di lavoro e provo ad esplorare i benefici di una implementazione harwadere per l’accelerazione degli algoritmi stessi e il risparmi di energia elettrica. [a cura dell'autore]XV n.s
    • …
    corecore