Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video
Object detection is considered one of the most challenging problems in the
field of computer vision, as it involves the combination of object
classification and object localization within a scene. Recently, deep neural
networks (DNNs) have been demonstrated to achieve superior object detection
performance compared to other approaches, with YOLOv2 (an improved You Only
Look Once model) being one of the state-of-the-art DNN-based object
detection methods in terms of both speed and accuracy. Although YOLOv2 can
achieve real-time performance on a powerful GPU, it remains very
challenging to leverage this approach for real-time object detection in
video on embedded computing devices with limited computational power and
limited memory. In this paper, we propose a new framework called Fast YOLO, a
fast You Only Look Once framework which accelerates YOLOv2 so that it can
perform real-time object detection in video on embedded devices.
First, we leverage the evolutionary deep intelligence framework to evolve the
YOLOv2 network architecture and produce an optimized architecture (referred to
as O-YOLOv2 here) that has 2.8X fewer parameters with just a ~2% IOU drop. To
further reduce power consumption on embedded devices while maintaining
performance, a motion-adaptive inference method is introduced into the proposed
Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv2
based on temporal motion characteristics. Experimental results show that the
proposed Fast YOLO framework can reduce the number of deep inferences by an
average of 38.13% and achieve an average speedup of ~3.3X for object detection in
video compared to the original YOLOv2, allowing Fast YOLO to run at an average of
~18 FPS on an Nvidia Jetson TX1 embedded system.
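
The idea behind motion-adaptive inference, skipping the deep network when little has changed between frames, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify how temporal motion is measured, so the sketch substitutes plain frame differencing. The names `motion_score`, `motion_adaptive_detect`, `detector`, and the `threshold` value are all hypothetical, chosen only for illustration.

```python
import numpy as np


def motion_score(reference, frame):
    """Mean absolute pixel difference between two 8-bit frames, scaled to [0, 1].

    Assumption: simple frame differencing stands in for the paper's
    (unspecified here) temporal motion analysis.
    """
    diff = np.abs(frame.astype(np.float32) - reference.astype(np.float32))
    return float(diff.mean() / 255.0)


def motion_adaptive_detect(frames, detector, threshold=0.02):
    """Run `detector` only when a frame differs enough from the reference frame.

    `detector` is any callable mapping a frame to detections (hypothetical
    stand-in for a deep network such as O-YOLOv2). The reference frame is the
    last frame on which the detector actually ran. Frames below the motion
    threshold reuse the most recent detections.

    Returns (list of detections per frame, number of detector calls).
    """
    results = []
    reference = None
    last_detections = None
    inferences = 0
    for frame in frames:
        if reference is None or motion_score(reference, frame) > threshold:
            last_detections = detector(frame)  # expensive deep inference
            reference = frame
            inferences += 1
        results.append(last_detections)  # cheap reuse when the scene is static
    return results, inferences


if __name__ == "__main__":
    still = np.zeros((4, 4), dtype=np.uint8)
    moved = np.full((4, 4), 255, dtype=np.uint8)
    frames = [still, still, still, moved, moved]
    # Toy detector: returns the mean intensity instead of bounding boxes.
    outputs, n_calls = motion_adaptive_detect(frames, lambda f: int(f.mean()))
    print(outputs, n_calls)
```

In this toy run the detector is called on only two of the five frames. The fraction of skipped calls is the kind of saving the abstract reports as a reduction in deep inferences, although the actual figure depends on the video content and on how motion is measured.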