40,920 research outputs found
Flow-Guided Feature Aggregation for Video Object Detection
Extending state-of-the-art object detectors from image to video is
challenging. The accuracy of detection suffers from degenerated object
appearances in videos, e.g., motion blur, video defocus, rare poses, etc.
Existing work attempts to exploit temporal information on box level, but such
methods are not trained end-to-end. We present flow-guided feature aggregation,
an accurate and end-to-end learning framework for video object detection. It
leverages temporal coherence on feature level instead. It improves the
per-frame features by aggregation of nearby features along the motion paths,
and thus improves the video recognition accuracy. Our method significantly
improves upon strong single-frame baselines in ImageNet VID, especially for
more challenging fast moving objects. Our framework is principled, and on par
with the best engineered systems winning the ImageNet VID challenges 2016,
without additional bells-and-whistles. The proposed method, together with Deep
Feature Flow, powered the winning entry of ImageNet VID challenges 2017. The
code is available at
https://github.com/msracver/Flow-Guided-Feature-Aggregation
Vision-based Real-Time Aerial Object Localization and Tracking for UAV Sensing System
The paper focuses on the problem of vision-based obstacle detection and
tracking for unmanned aerial vehicle navigation. A real-time object
localization and tracking strategy from monocular image sequences is developed
by effectively integrating the object detection and tracking into a dynamic
Kalman model. At the detection stage, the object of interest is automatically
detected and localized from a saliency map computed via the image background
connectivity cue at each frame; at the tracking stage, a Kalman filter is
employed to provide a coarse prediction of the object state, which is further
refined via a local detector incorporating the saliency map and the temporal
information between two consecutive frames. Compared to existing methods, the
proposed approach does not require any manual initialization for tracking, runs
much faster than the state-of-the-art trackers of its kind, and achieves
competitive tracking performance on a large number of image sequences.
Extensive experiments demonstrate the effectiveness and superior performance of
the proposed approach.Comment: 8 pages, 7 figure
Intelligent Adaptive Motion Control for Ground Wheeled Vehicles
In this paper a new intelligent adaptive control is applied to solve a problem of motion control of ground vehicles with two independent wheels actuated by a differential drive. The major objective of this work is to obtain a motion control system by using a new fuzzy inference mechanism where the Lyapunov’s stability can be assured. In particular the parameters of the kinematical control law are obtained using an intelligent Fuzzy mechanism, where the properties of the Fuzzy maps have been established to have the stability above. Due to the nonlinear map of the intelligent fuzzy inference mechanism (i.e. fuzzy rules and value of the rule), the parameters above are not constant, but, time after time, based on empirical fuzzy rules, they are updated in function of the values of the tracking errors. Since the fuzzy maps are adjusted based on the control performances, the parameters updating assures a robustness and fast convergence of the tracking errors. Also, since the vehicle dynamics and kinematics can be completely unknown, a dynamical and kinematical adaptive control is added. The proposed fuzzy controller has been implemented for a real nonholonomic electrical vehicle. Therefore system robustness and stability performance are verified through simulations and experimental studies
Discriminative Scale Space Tracking
Accurate scale estimation of a target is a challenging research problem in
visual object tracking. Most state-of-the-art methods employ an exhaustive
scale search to estimate the target size. The exhaustive search strategy is
computationally expensive and struggles when encountered with large scale
variations. This paper investigates the problem of accurate and robust scale
estimation in a tracking-by-detection framework. We propose a novel scale
adaptive tracking approach by learning separate discriminative correlation
filters for translation and scale estimation. The explicit scale filter is
learned online using the target appearance sampled at a set of different
scales. Contrary to standard approaches, our method directly learns the
appearance change induced by variations in the target scale. Additionally, we
investigate strategies to reduce the computational cost of our approach.
Extensive experiments are performed on the OTB and the VOT2014 datasets.
Compared to the standard exhaustive scale search, our approach achieves a gain
of 2.5% in average overlap precision on the OTB dataset. Additionally, our
method is computationally efficient, operating at a 50% higher frame rate
compared to the exhaustive scale search. Our method obtains the top rank in
performance by outperforming 19 state-of-the-art trackers on OTB and 37
state-of-the-art trackers on VOT2014.Comment: To appear in TPAMI. This is the journal extension of the
VOT2014-winning DSST tracking metho
Mesh-based video coding for low bit-rate communications
In this paper, a new method for low bit-rate content-adaptive mesh-based video coding is proposed. Intra-frame coding of this method employs feature map extraction for node distribution at specific threshold levels to achieve higher density placement of initial nodes for regions that contain high frequency features and conversely sparse placement of initial nodes for smooth regions. Insignificant nodes are largely removed using a subsequent node elimination scheme. The Hilbert scan is then applied before quantization and entropy coding to reduce amount of transmitted information. For moving images, both node position and color parameters of only a subset of nodes may change from frame to frame. It is sufficient to transmit only these changed parameters. The proposed method is well-suited for video coding at very low bit rates, as processing results demonstrate that it provides good subjective and objective image quality at a lower number of required bits
- …