1,187 research outputs found

    Real-time on-board obstacle avoidance for UAVs based on embedded stereo vision

    In order to improve usability and safety, modern unmanned aerial vehicles (UAVs) are equipped with sensors to monitor the environment, such as laser scanners and cameras. One important aspect in this monitoring process is to detect obstacles in the flight path in order to avoid collisions. Since a large number of consumer UAVs suffer from tight weight and power constraints, our work focuses on obstacle avoidance based on a lightweight stereo camera setup. We use disparity maps, which are computed from the camera images, to locate obstacles and to automatically steer the UAV around them. For disparity map computation we optimize the well-known semi-global matching (SGM) approach for deployment on an embedded FPGA. The disparity maps are then converted into simpler representations, the so-called U-/V-Maps, which are used for obstacle detection. Obstacle avoidance is based on a reactive approach which finds the shortest path around the obstacles as soon as they come within a critical distance of the UAV. One of the fundamental goals of our work was the reduction of development costs by closing the gap between application development and hardware optimization. Hence, we aimed at using high-level synthesis (HLS) for porting our algorithms, which are written in C/C++, to the embedded FPGA. We evaluated our implementation of the disparity estimation on the KITTI Stereo 2015 benchmark. The integrity of the overall real-time reactive obstacle avoidance algorithm has been evaluated by using Hardware-in-the-Loop testing in conjunction with two flight simulators. Comment: Accepted in the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
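The conversion from a disparity map to U-/V-Maps mentioned in this abstract can be illustrated with a small software sketch. It is not the authors' FPGA implementation. It assumes the common definition in which the V-Map is a per-row histogram of disparity values and the U-Map is a per-column histogram. The function name `uv_disparity_maps`, the integer disparity range, and the convention that out-of-range values mark invalid pixels are all assumptions made for illustration.

```python
def uv_disparity_maps(disparity, max_disparity):
    """Build U- and V-disparity histograms from an integer disparity map.

    disparity: list of rows, each a list of integer disparities.
    Values outside [0, max_disparity] are treated as invalid and skipped
    (an assumption; real pipelines use their own invalid marker).
    Returns (u_map, v_map):
      u_map[d][c] counts pixels in column c with disparity d.
      v_map[r][d] counts pixels in row r with disparity d.
    """
    rows = len(disparity)
    cols = len(disparity[0]) if rows else 0
    u_map = [[0] * cols for _ in range(max_disparity + 1)]
    v_map = [[0] * (max_disparity + 1) for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            d = disparity[r][c]
            if 0 <= d <= max_disparity:
                u_map[d][c] += 1
                v_map[r][d] += 1
    return u_map, v_map


if __name__ == "__main__":
    # Tiny 2x3 disparity map: the right-hand columns share disparity 2,
    # which shows up as a tall bin in the U-Map (a vertical obstacle).
    demo = [[0, 2, 2],
            [1, 2, 2]]
    u_map, v_map = uv_disparity_maps(demo, max_disparity=2)
    print(u_map)
    print(v_map)
```

In this representation, an upright obstacle at a roughly constant distance tends to concentrate its pixels in a few U-Map bins, which is one reason such histograms are a simpler input for detection than the full disparity image.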

    High throughput spatial convolution filters on FPGAs

    Digital signal processing (DSP) on field-programmable gate arrays (FPGAs) has long been appealing because of the inherent parallelism in these computations, which can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, including additional hard blocks such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility.
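For readers unfamiliar with the operation being accelerated, the following is a minimal software reference model of a 2-D spatial convolution with run-time coefficients. It is only a sketch of the arithmetic, not the FPGA architecture studied in the paper. The function name `convolve2d_valid` and the choice of "valid" border handling (no padding) are assumptions made for illustration.

```python
def convolve2d_valid(image, kernel):
    """Direct 2-D convolution without padding ("valid" output region).

    image, kernel: lists of rows of numbers. The kernel is flipped in both
    axes, which is what distinguishes convolution from correlation.
    Coefficients are ordinary run-time values, so they can be changed
    freely between calls.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            # One multiply-accumulate per kernel tap; each inner row of
            # taps resembles a 1-D filter over a single image line.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
            out_row.append(acc)
        out.append(out_row)
    return out


if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    box = [[1, 1],
           [1, 1]]
    print(convolve2d_valid(img, box))
```

The inner loop makes the cost visible: a k × k filter needs k² multiply-accumulates per output pixel, which is why throughput at 4K resolution depends on how well those operations map onto the available DSP blocks.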

    Real-Time Dense Stereo Matching With ELAS on FPGA Accelerated Embedded Devices

    For many applications in low-power real-time robotics, stereo cameras are the sensors of choice for depth perception as they are typically cheaper and more versatile than their active counterparts. Their biggest drawback, however, is that they do not directly sense depth maps; instead, these must be estimated through data-intensive processes. Therefore, appropriate algorithm selection plays an important role in achieving the desired performance characteristics. Motivated by applications in space and mobile robotics, we implement and evaluate an FPGA-accelerated adaptation of the ELAS algorithm. Despite offering one of the best trade-offs between efficiency and accuracy, ELAS has only been shown to run at 1.5–3 fps on a high-end CPU. Our system preserves all intriguing properties of the original algorithm, such as the slanted plane priors, but can achieve a frame rate of 47 fps whilst consuming under 4 W of power. Unlike previous FPGA-based designs, we take advantage of both components on the CPU/FPGA System-on-Chip to showcase the strategy necessary to accelerate more complex and computationally diverse algorithms for such low-power, real-time systems. Comment: 8 pages, 7 figures, 2 tables

    FPGA-accelerated machine learning inference as a service for particle physics computing

    New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600–700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective. Comment: 16 pages, 14 figures, 2 tables
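The latency figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above (60 ms and 10 ms service latency, speed-up factors of roughly 30 and 175). The helper name `implied_cpu_latency_ms` is hypothetical, and because the abstract gives the factors as approximate, the results are rough implied values rather than measurements reported by the authors.

```python
def implied_cpu_latency_ms(accelerated_ms, speedup_factor):
    """CPU latency implied by an accelerated latency and a speed-up factor."""
    return accelerated_ms * speedup_factor


if __name__ == "__main__":
    # Cloud service: 60 ms at roughly 30x faster than CPU inference.
    print(implied_cpu_latency_ms(60, 30))
    # Edge or on-premises service: 10 ms at roughly 175x faster.
    print(implied_cpu_latency_ms(10, 175))
```

Both cases imply a CPU inference latency of roughly 1.75 to 1.8 seconds, so the two quoted speed-up factors are consistent with a single CPU baseline.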

    On the Resilience of RTL NN Accelerators: Fault Characterization and Mitigation

    Machine Learning (ML) is making a strong resurgence in tune with the massive generation of unstructured data, which in turn requires massive computational resources. Due to the inherently compute- and power-intensive structure of Neural Networks (NNs), hardware accelerators emerge as a promising solution. However, with technology node scaling below 10 nm, hardware accelerators become more susceptible to faults, which in turn can impact the NN accuracy. In this paper, we study the resilience aspects of Register-Transfer Level (RTL) models of NN accelerators, in particular fault characterization and mitigation. By following a High-Level Synthesis (HLS) approach, first, we characterize the vulnerability of various components of RTL NN. We observed that the severity of faults depends on both i) application-level specifications, i.e., NN data (inputs, weights, or intermediate), NN layers, and NN activation functions, and ii) architectural-level specifications, i.e., data representation model and the parallelism degree of the underlying accelerator. Second, motivated by characterization results, we present a low-overhead fault mitigation technique that can efficiently correct bit flips, 47.3% better than state-of-the-art methods. Comment: 8 pages, 6 figures

    A Survey on FPGA-Based Sensor Systems: Towards Intelligent and Reconfigurable Low-Power Sensors for Computer Vision, Control and Signal Processing

    The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution, while at the same time decreasing the size and power consumption. The use of Field Programmable Gate Arrays (FPGAs) provides specific reprogrammable hardware technology that can be properly exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using partial reconfigurability at very low power consumption. For highly demanding tasks FPGAs have been favored due to the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), reconfigurability and superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors is being developed in Spain based on FPGAs. In this paper, a review of these developments is presented, describing as well the FPGA technologies employed by the different research groups and providing an overview of future research within this field. The research leading to these results has received funding from the Spanish Government and European FEDER funds (DPI2012-32390), the Valencia Regional Government (PROMETEO/2013/085) and the University of Alicante (GRE12-17).

    FPGA-based real-time moving target detection system for unmanned aerial vehicle application

    Moving target detection is the most common task for Unmanned Aerial Vehicles (UAVs) to find and track objects of interest from a bird's eye view in mobile aerial surveillance for civilian applications such as search and rescue operations. The complex detection algorithm can be implemented in a real-time embedded system using a Field Programmable Gate Array (FPGA). This paper presents the development of a real-time moving target detection System-on-Chip (SoC) using an FPGA for deployment on a UAV. The detection algorithm uses an area-based image registration technique which includes motion estimation and object segmentation processes. The moving target detection system has been prototyped on a low-cost Terasic DE2-115 board mounted with a TRDB-D5M camera. The system consists of a Nios II processor and stream-oriented dedicated hardware accelerators running at a 100 MHz clock rate, achieving a processing speed of 30 frames per second for 640 × 480 pixel greyscale video.

    AIDI: An adaptive image denoising FPGA-based IP-core for real-time applications

    The presence of noise in images can significantly impact the performance of digital image processing and computer vision algorithms. Thus, it should be removed to improve the robustness of the entire processing flow. The noise estimation in an image is also a key factor, since, to be more effective, algorithms and denoising filters should be tuned to the actual level of noise. Moreover, the complexity of these algorithms brings a new challenge in real-time image processing applications, requiring high computing capacity. In this context, hardware acceleration is crucial, and Field Programmable Gate Arrays (FPGAs) best fit the growing demand for computational capabilities. This paper presents an Adaptive Image Denoising IP-core (AIDI) for real-time applications. The core first estimates the level of noise in the input image, then applies an adaptive Gaussian smoothing filter to remove the estimated noise. The filtering parameters are computed on-the-fly, adapting them to the level of noise in the image, and pixel by pixel, to preserve image information (e.g., edges or corners). The FPGA-based architecture is presented, highlighting its improvements w.r.t. a standard static filtering approach.
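The idea of tuning a Gaussian filter to an estimated noise level can be sketched in a few lines. This is not the AIDI core's actual parameter computation, which the abstract does not specify and which also adapts pixel by pixel. The linear mapping in `sigma_from_noise`, its `gain` and clamp limits, and both function names are hypothetical choices made purely for illustration.

```python
import math


def sigma_from_noise(noise_std, gain=0.5, min_sigma=0.3, max_sigma=3.0):
    """Map an estimated noise level to a Gaussian sigma.

    Hypothetical rule: sigma grows linearly with the noise estimate and is
    clamped to [min_sigma, max_sigma]. All constants are illustrative.
    """
    return max(min_sigma, min(max_sigma, gain * noise_std))


def gaussian_kernel(sigma, radius):
    """Return a normalized (2*radius+1) x (2*radius+1) Gaussian kernel."""
    weights = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-radius, radius + 1)]
        for y in range(-radius, radius + 1)
    ]
    total = sum(sum(row) for row in weights)
    # Normalizing keeps the average image brightness unchanged.
    return [[w / total for w in row] for row in weights]


if __name__ == "__main__":
    # A noisier image yields a wider kernel, so smoothing is stronger.
    for noise in (0.5, 2.0, 5.0):
        sigma = sigma_from_noise(noise)
        kernel = gaussian_kernel(sigma, radius=1)
        print(noise, sigma, round(kernel[1][1], 4))
```

A static filter would fix `sigma` once; recomputing the kernel from the noise estimate is what makes the smoothing adaptive in this simplified picture.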