Deep Learning based Animal Detection and Tracking in Drone Video Footage
In this paper, we propose a multiple animal tracking system for drone footage that is designed and implemented using a Deep Neural Network (DNN) based tracking-by-detection approach. The proposed system consists of two main components: a sub-system for animal detection and a sub-system for animal tracking. In the detection component, we exploit YOLO-V5 to detect individual animals, and in the tracking component, we use a centroid tracking algorithm to associate the locations of detected animals across consecutive video frames. The performance of the proposed system is analyzed on drone video footage containing herds of Arabian Oryx with complex patterns of movement of individual animals. All videos were recorded by a drone flying over known oryx feeding points in the desert areas of the UAE. The experimental results show that our tracking system can detect and track individual oryxes within herds accurately, even when the oryxes are very close to each other, partially occluded, or their walking paths cross each other.
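The centroid tracking idea mentioned above can be illustrated with a short sketch: each existing track keeps the center point of its last detection, and new detections are greedily assigned to the nearest track centroid within a distance threshold. This is a minimal illustration of the general technique, not the paper's exact implementation; the function and parameter names (`associate`, `max_dist`) are hypothetical, and boxes are assumed to be `(x, y, w, h)` tuples.

```python
import math

def centroid(box):
    # box = (x, y, w, h); return the center point of the box
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def associate(tracks, detections, max_dist=50.0):
    """Greedily match existing tracks to new detections.

    tracks: {track_id: (cx, cy)} last known centroid per track
    detections: list of (x, y, w, h) boxes in the current frame
    Returns {track_id: detection_index} for matches within max_dist.
    """
    assignments = {}
    used = set()
    for tid, tc in tracks.items():
        best, best_d = None, max_dist
        for i, det in enumerate(detections):
            if i in used:
                continue
            dc = centroid(det)
            d = math.hypot(tc[0] - dc[0], tc[1] - dc[1])
            if d < best_d:
                best, best_d = i, d
        if best is not None:
            assignments[tid] = best
            used.add(best)
    return assignments
```

Unmatched detections would typically spawn new tracks, and tracks unmatched for several frames would be dropped; a production system would also use an optimal assignment (e.g. Hungarian matching) rather than this greedy loop.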
Continuous Gaze Tracking With Implicit Saliency-Aware Calibration on Mobile Devices
Gaze tracking is a useful human-to-computer interface that plays an
increasingly important role in a range of mobile applications. Gaze
calibration is an indispensable component of gaze tracking: it transforms
eye coordinates into screen coordinates. Existing gaze tracking approaches
either have limited accuracy or require the user's cooperation during
calibration, which in turn hurts the quality of experience. In this paper
we propose vGaze, continuous gaze tracking with implicit saliency-aware
calibration on mobile devices. The design of vGaze stems from our insight
into the temporal and spatial dependence between visual saliency and the
user's gaze. vGaze is implemented as lightweight software that identifies
video frames with "useful" saliency information, senses the user's head
movement, performs opportunistic calibration using only those "useful"
frames, and leverages historical information to accelerate saliency
detection. We implement vGaze on a commercial mobile device and evaluate
its performance in various scenarios. The results show that vGaze works in
real time with video playback applications. The average gaze tracking
error is 1.51 cm (2.884 degrees), which decreases to 0.99 cm (1.891
degrees) with historical information and to 0.57 cm (1.089 degrees) with
an indicator.
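The core calibration step described above, mapping eye coordinates to screen coordinates from (eye position, salient point) correspondences, can be sketched as a least-squares affine fit. This is a minimal illustration of the underlying idea only, not vGaze's actual pipeline; the names `fit_affine` and `map_gaze` are hypothetical, and NumPy is assumed.

```python
import numpy as np

def fit_affine(eye_pts, screen_pts):
    """Fit an affine map screen = [ex, ey, 1] @ A via least squares.

    eye_pts, screen_pts: (N, 2) corresponding points, e.g. estimated
    eye positions paired with salient screen locations the user was
    presumably looking at. Returns A with shape (3, 2).
    """
    eye = np.asarray(eye_pts, dtype=float)
    scr = np.asarray(screen_pts, dtype=float)
    X = np.hstack([eye, np.ones((len(eye), 1))])  # homogeneous coords
    A, *_ = np.linalg.lstsq(X, scr, rcond=None)
    return A

def map_gaze(A, eye_pt):
    # Apply the fitted calibration to a new eye-coordinate sample.
    ex, ey = eye_pt
    return np.array([ex, ey, 1.0]) @ A
```

Because the correspondences come from saliency rather than explicit user input, a real implementation would weight or filter samples by confidence, which is what restricting the fit to "useful" frames accomplishes in the paper's terms.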
Object Detection in Videos with Tubelet Proposal Networks
Object detection in videos has drawn increasing attention recently with the
introduction of the large-scale ImageNet VID dataset. Different from object
detection in static images, temporal information in videos is vital for object
detection. To fully utilize temporal information, state-of-the-art methods are
based on spatiotemporal tubelets, which are essentially sequences of associated
bounding boxes across time. However, the existing methods have major
limitations in generating tubelets in terms of quality and efficiency.
Motion-based methods are able to obtain dense tubelets efficiently, but the
lengths are generally only several frames, which is not optimal for
incorporating long-term temporal information. Appearance-based methods, usually
involving generic object tracking, could generate long tubelets, but are
usually computationally expensive. In this work, we propose a framework for
object detection in videos, which consists of a novel tubelet proposal network
to efficiently generate spatiotemporal proposals, and a Long Short-term Memory
(LSTM) network that incorporates temporal information from tubelet proposals
for achieving high object detection accuracy in videos. Experiments on the
large-scale ImageNet VID dataset demonstrate the effectiveness of the proposed
framework for object detection in videos.
Comment: CVPR 201
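The abstract defines a tubelet as a sequence of associated bounding boxes across time. As a minimal illustration of that data structure (the class and method names here are hypothetical, not from the paper):

```python
from dataclasses import dataclass, field

@dataclass
class Tubelet:
    """A spatiotemporal proposal: one bounding box per frame,
    starting at start_frame and covering consecutive frames."""
    start_frame: int
    boxes: list = field(default_factory=list)  # [(x, y, w, h), ...]

    def extend(self, box):
        # Append the box for the next consecutive frame.
        self.boxes.append(box)

    @property
    def length(self):
        # Temporal extent in frames; motion-based methods tend to
        # produce short tubelets, appearance-based ones longer.
        return len(self.boxes)

    def box_at(self, frame):
        # Box at an absolute frame index, or None if outside the tubelet.
        i = frame - self.start_frame
        return self.boxes[i] if 0 <= i < len(self.boxes) else None
```

In the framework described above, such per-frame box sequences would be the proposals fed to an LSTM that aggregates temporal information for classification.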