180,908 research outputs found
Real-Time Human Detection Using Deep Learning on Embedded Platforms: A Review
The detection of an object such as a human is very important for image understanding in the field of computer vision. Human detection in images can provide essential information for a wide variety of applications in intelligent systems. In this paper, human detection is carried out using deep learning, which has developed rapidly and achieved extraordinary success in various object detection implementations. Recently, several embedded systems have emerged as powerful computing boards that provide high processing capabilities using a graphics processing unit (GPU). This paper aims to provide a comprehensive survey of the latest achievements that deep learning techniques have brought to this field on embedded platforms. NVIDIA Jetson was chosen as a low-power system designed to accelerate deep learning applications. This review highlights the performance of human detection models such as PedNet, multiped, SSD MobileNet V1, SSD MobileNet V2, and SSD Inception V2 on edge computing. This survey aims to provide an overview of these methods and compare their performance in accuracy and computation time for real-time applications. The experimental results show that the SSD MobileNet V2 model provides the highest accuracy with the fastest computation time compared to the other models on our video datasets across several scenarios.
A Survey on Generative Diffusion Model
Deep learning shows excellent potential in generation tasks thanks to deep
latent representations. Generative models are a class of models that can
generate observations randomly with respect to certain implied parameters.
Recently, diffusion models have become a rising class of generative models
owing to their powerful generative ability, and great achievements have been
reached. Applications beyond computer vision, speech generation,
bioinformatics, and natural language processing remain to be explored in this
field. However, diffusion models have inherent drawbacks: a slow generation
process, restriction to a single data type, low likelihood, and an inability
to perform dimension reduction. These drawbacks have led to many enhanced
works. This survey summarizes the field of the
diffusion model. We first state the main problem with two landmark works --
DDPM and DSM, and a unified landmark work -- Score SDE. Then, we present
improved techniques for existing problems in the diffusion-based model field,
including model speed-up, data structure diversification, likelihood
optimization, and dimension reduction. Regarding
existing models, we also provide a benchmark of FID score, IS, and NLL
according to specific NFE. Moreover, applications with diffusion models are
introduced including computer vision, sequence modeling, audio, and AI for
science. Finally, we summarize the field together with its limitations and
further directions. A well-classified summary of existing methods is
available on our GitHub:
https://github.com/chq1155/A-Survey-on-Generative-Diffusion-Model
FastDepth: Fast Monocular Depth Estimation on Embedded Systems
Depth sensing is a critical function for robotic tasks such as localization,
mapping and obstacle detection. There has been a significant and growing
interest in depth estimation from a single RGB image, due to the relatively low
cost and size of monocular cameras. However, state-of-the-art single-view depth
estimation algorithms are based on fairly complex deep neural networks that are
too slow for real-time inference on an embedded platform, for instance, mounted
on a micro aerial vehicle. In this paper, we address the problem of fast depth
estimation on embedded systems. We propose an efficient and lightweight
encoder-decoder network architecture and apply network pruning to further
reduce computational complexity and latency. In particular, we focus on the
design of a low-latency decoder. Our methodology demonstrates that it is
possible to achieve similar accuracy as prior work on depth estimation, but at
inference speeds that are an order of magnitude faster. Our proposed network,
FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using
only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves
close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of
the authors' knowledge, this paper demonstrates real-time monocular depth
estimation using a deep neural network with the lowest latency and highest
throughput on an embedded platform that can be carried by a micro aerial
vehicle.
Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 tables
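
The throughput figures quoted in the FastDepth abstract can be turned into per-frame numbers with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes frames are processed one at a time, so that mean frame time is the reciprocal of throughput, and it treats the quoted "under 10 W" as a power ceiling, which yields only an upper bound on energy per frame. The function and constant names are hypothetical.

```python
# Figures quoted in the FastDepth abstract.
GPU_FPS = 178.0      # throughput on the Jetson TX2 GPU
CPU_FPS = 27.0       # throughput using only the TX2 CPU
POWER_CAP_W = 10.0   # abstract reports active power "under 10 W"


def frame_time_ms(fps: float) -> float:
    """Mean time per frame in milliseconds.

    Assumption: frames are processed sequentially, so time = 1 / throughput.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def energy_bound_mj(fps: float, power_w: float = POWER_CAP_W) -> float:
    """Upper bound on energy per frame in millijoules (watts x milliseconds)."""
    return power_w * frame_time_ms(fps)


if __name__ == "__main__":
    for name, fps in (("GPU", GPU_FPS), ("CPU", CPU_FPS)):
        print(f"{name}: {frame_time_ms(fps):.2f} ms/frame, "
              f"< {energy_bound_mj(fps):.1f} mJ/frame")
```

Under these assumptions the GPU figure corresponds to roughly 5.6 ms per frame and the CPU figure to roughly 37 ms per frame. Actual latency on a pipelined or batched deployment may differ from the reciprocal of throughput.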