Design of Sail-Assisted Unmanned Surface Vehicle Intelligent Control System
To give an unmanned surface vehicle (USV) a sail-assist capability, this work addresses the design of a sail-assisted USV intelligent control system (SUICS) and describes how it is implemented. The SUICS consists of a communication system, a sensor system, a PC platform, and a lower-machine platform. To make full use of wind energy, we propose the sail angle of attack automatic adjustment (Sail_4A) algorithm and present the implementation flow for each subsystem of the SUICS. The design and implementation are carried out systematically on a test boat, and experiments verify the performance and effectiveness of the SUICS. The SUICS significantly improves the intelligent use of sustainable wind energy by sail-assisted USVs and helps meet the energy-saving and emission-reduction requirements for shipping issued by the International Maritime Organization (IMO).
Customized Nios II multi-cycle instructions to accelerate block-matching techniques
This study focuses on accelerating motion estimation algorithms, which are widely used in video coding standards, by combining Altera Custom Instructions with an efficient use of the SDRAM and On-Chip memory of the Nios II processor. First, the code is fully profiled before optimization to find where the motion compensation algorithms lose time. Then, a multi-cycle Custom Instruction is implemented and added to the specific embedded design. The approach optimizes SOC performance through an efficient split between On-Chip memory and SDRAM for the reset vector, exception vector, stack, heap, read/write data (.rwdata), read-only data (.rodata), and program text (.text) in the design. It further enhances these algorithms by incorporating Custom Instructions into the Nios II ISA. Finally, both methods are combined to build the final embedded system. The contribution thus facilitates motion coding on low-cost Soft-Core microprocessors, particularly the RISC architecture of Nios II implemented in FPGA. It enables us to construct an SOC that processes 50×50 @ 180 fps.
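As an illustration of the kind of inner loop that block-matching motion estimation spends its time in, the following is a minimal sketch of full-search block matching with a sum-of-absolute-differences (SAD) cost. The abstract does not state which matching cost or search strategy the authors use, so SAD, exhaustive search, and the names `sad`, `get_block`, and `full_search` are assumptions made here for illustration only.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(cur, ref, top, left, size, search_range):
    """Exhaustively search ref for the best match of a block taken from cur.

    Returns (dy, dx, cost) for the displacement with the lowest SAD, or None
    if no candidate position fits inside the reference frame.
    """
    height, width = len(ref), len(ref[0])
    target = get_block(cur, top, left, size)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall partly outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, get_block(ref, y, x, size))
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best
```

The `sad` call dominates the run time because it is evaluated once per candidate displacement, which is why a kernel of this shape is a natural candidate for hardware acceleration.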
FPGA-Based Multimodal Embedded Sensor System Integrating Low- and Mid-Level Vision
Motion estimation is a low-level vision task that is especially relevant because of its wide range of real-world applications. Many of the best motion estimation algorithms include features found in mammals, which demand huge computational resources and are therefore not usually available in real time. In this paper we present a novel bioinspired sensor based on the synergy between optical flow and orthogonal variant moments. The sensor has been designed for Very Large Scale Integration (VLSI) using properties of the mammalian cortical motion pathway. It combines low-level primitives (optical flow and image moments) to produce a mid-level vision abstraction layer. The results are described through experiments showing the validity of the proposed system, together with an analysis of the computational resources and performance of the applied algorithms.
FPGA-Based Portable Ultrasound Scanning System with Automatic Kidney Detection
Bedside diagnosis using portable ultrasound scanning (PUS) offers comfortable diagnosis with various clinical advantages. In general, however, ultrasound scanners suffer from a poor signal-to-noise ratio, and physicians who operate the device at the point of care may not be adequately trained to perform high-level diagnosis. These problems can be addressed by incorporating ambient intelligence in PUS. In this paper, we propose an architecture for a PUS system whose abilities include automated kidney detection in real time. Automated kidney detection is performed by training the Viola–Jones algorithm with a good set of kidney data covering diverse shapes and sizes. The kidney detection algorithm is observed to deliver very good detection accuracy. The proposed PUS with the kidney detection algorithm is implemented on a single Xilinx Kintex-7 FPGA, integrated with a Raspberry Pi ARM processor running at 900 MHz.
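The Viola–Jones detector evaluates Haar-like rectangle features in constant time by means of an integral image. The sketch below shows that building block only; it is not the authors' FPGA implementation, and the function names `integral_image`, `rect_sum`, and `two_rect_feature` are hypothetical names chosen for this example.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
table = integral_image(image)
print(rect_sum(table, 1, 1, 2, 2))          # 6 + 7 + 10 + 11 = 34
print(two_rect_feature(table, 0, 0, 3, 4))  # 33 - 45 = -12
```

Because every rectangle sum costs four lookups regardless of its size, a detector can evaluate many features per window cheaply.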
Vehicle Routing Problems with Fuel Consumption and Stochastic Travel Speeds
Conventional vehicle routing problems (VRP) assume that the vehicle travel speed on each arc is either fixed or time-dependent. However, because of uncertainty in weather, traffic conditions, and other random factors, it is not appropriate to fix travel speeds as constants in advance. We therefore propose a mathematical model for calculating expected fuel consumption and fixed vehicle cost in which the average speed on each arc is assumed to follow a normal distribution, which is more realistic than the existing model. For small-scale problems, we apply a linear transformation and solve them with the existing solver CPLEX, while for large-scale problems we construct an improved simulated annealing (ISA) algorithm. Finally, instances from real road networks in England are solved with the ISA algorithm. Computational results show that the ISA algorithm performs well in a reasonable amount of time. We also find that when stochastic speeds are taken into consideration, fuel consumption is always larger than under the fixed-speed model.
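The last finding can be illustrated with a small Monte Carlo sketch. The abstract does not give the authors' fuel model, so the per-kilometre fuel function below (`a / v + b * v**2`, convex in speed), its coefficients, the clamping of speeds to a minimum value, and all function names are assumptions made purely for illustration.

```python
import random


def fuel_per_km(speed, a=100.0, b=0.001):
    """Hypothetical convex fuel rate: a time-related term plus a drag-related term."""
    return a / speed + b * speed ** 2


def expected_fuel(distance_km, mean_speed, std_speed, samples=20000, seed=0):
    """Monte Carlo estimate of expected fuel on one arc with normally distributed speed."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Clamp to a small positive speed so the hypothetical model stays defined.
        speed = max(rng.gauss(mean_speed, std_speed), 5.0)
        total += fuel_per_km(speed) * distance_km
    return total / samples


def fixed_speed_fuel(distance_km, speed):
    """Fuel on the same arc when the speed is treated as a known constant."""
    return fuel_per_km(speed) * distance_km


stochastic = expected_fuel(10.0, mean_speed=60.0, std_speed=10.0)
deterministic = fixed_speed_fuel(10.0, speed=60.0)
print(stochastic > deterministic)
```

Under this assumed model the gap follows from Jensen's inequality: for a convex fuel function, the expectation over a random speed is at least the value at the mean speed.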
Analog Signal Processing Solutions and Design of Memristor-CMOS Analog Co-Processor for Acceleration of High-Performance Computing Applications
Emerging applications in the field of machine vision, deep learning and scientific simulation require high computational speed and are run on platforms that are size, weight and power constrained. With the transistor scaling coming to an end, existing digital hardware architectures will not be able to meet these ever-increasing demands. Analog computation with its rich set of primitives and inherent parallel architecture can be faster, more efficient and compact for some of these applications. The major contribution of this work is to show that analog processing can be a viable solution to this problem. This is demonstrated in the three parts of the dissertation.
In the first part of the dissertation, we demonstrate that analog processing can be used to solve the problem of stereo correspondence. Novel modifications to the algorithms are proposed that improve their computational speed and make them efficiently implementable in analog hardware. The analog-domain implementation provides a further speedup in computation and has lower power consumption than a digital implementation.
In the second part of the dissertation, a prototype of an analog processor was developed using commercially available off-the-shelf components. The focus was on providing experimental results that demonstrate functionality and to show that the performance of the prototype for low-level and mid-level image processing tasks is equivalent to a digital implementation. To demonstrate improvement in speed and power consumption, an integrated circuit design of the analog processor was proposed, and it was shown that such an analog processor would be faster than state-of-the-art digital and other analog processors.
In the third part of the dissertation, a memristor-CMOS analog co-processor that can perform floating-point vector matrix multiplication (VMM) is proposed. VMM computation underlies some of the major applications. To demonstrate the operation of the analog co-processor at the system level, a new tool called PSpice Systems Option is used. The analog co-processor is shown to outperform the projected performance of digital and analog processors. Using the new tool, application simulations for image processing and for the solution of partial differential equations are performed on the co-processor model.
Building a Scientific Computing Cluster Based on Low-Cost, Low-Power FPGAs
This work presents the construction of a cluster based on low-power, low-cost FPGAs that can run highly complex programs in the same or less time than a workstation of much higher cost and power consumption. Clusters of this kind already exist, but ours differs in that it uses boards with low-end FPGAs and uses OpenCL as the programming language to accelerate program execution. The boards are Altera DE1-SOC boards. Besides their low cost and power consumption, they can run a UNIX/Linux-based operating system on their hard core, a dual-core ARM Cortex-A9 processor. However, the available UNIX/Linux images, both official and unofficial, have configuration problems or limitations. For this reason, a custom image based on Debian 8 was generated, and the software needed to run code written in OpenCL and compiled with the Intel software development kit for FPGAs was installed on it. This distribution was chosen because it is widely used, robust, and up to date.
In addition, the cluster and a high-performance workstation were compared in terms of execution time, cost, and energy consumption when running a set of 5 benchmarks that we implemented in C and OpenCL. Although in some cases the workstation's execution times were lower than the cluster's, the low consumption and cost of the cluster make its energy efficiency much better than the workstation's and, therefore, make it the better option.