Online Adaptation of Convolutional Neural Networks for Video Object Segmentation
We tackle the task of semi-supervised video object segmentation, i.e.
segmenting the pixels belonging to an object in the video using the ground
truth pixel mask for the first frame. We build on the recently introduced
one-shot video object segmentation (OSVOS) approach which uses a pretrained
network and fine-tunes it on the first frame. While achieving impressive
performance, at test time OSVOS uses the fine-tuned network in unchanged form
and is not able to adapt to large changes in object appearance. To overcome
this limitation, we propose Online Adaptive Video Object Segmentation (OnAVOS)
which updates the network online using training examples selected based on the
confidence of the network and the spatial configuration. Additionally, we add a
pretraining step based on objectness, which is learned on PASCAL. Our
experiments show that both extensions are highly effective and improve the
state of the art on DAVIS to an intersection-over-union score of 85.7%.
Comment: Accepted at BMVC 2017. This version contains minor changes for the
camera-ready version.
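For readers unfamiliar with the metric quoted in the abstract above, the following is a minimal sketch of intersection-over-union between a predicted and a ground-truth binary mask. It is not taken from the paper: the function name `mask_iou`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration. The DAVIS benchmark averages such per-frame scores over sequences, which this sketch does not do.

```python
def mask_iou(pred, gt):
    """Intersection-over-union of two binary masks.

    Both masks are nested lists of 0/1 values with the same shape
    (an illustrative representation, not the paper's data format).
    """
    inter = 0  # pixels that are foreground in both masks
    union = 0  # pixels that are foreground in at least one mask
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return inter / union


# Hypothetical 2x3 masks: 2 pixels overlap, 4 pixels are covered in total.
pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]
print(mask_iou(pred, gt))  # 0.5
```

A score of 85.7% therefore means that, on average, the overlap between predicted and ground-truth object pixels covers about 86% of their combined area.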
A Study of Exploiting Objectness for Robust Online Object Tracking
Tracking is a fundamental problem in many computer vision applications. Despite the progress of the last decade, many challenges remain, especially when the problem is posed in real-world scenarios (e.g., cluttered backgrounds, occluded objects). Among them, drifting has been widely observed to be a problem common to the class of online tracking algorithms: when challenges such as occlusion or nonlinear deformation of the object occur, the tracker might lose the target completely in subsequent frames of an image sequence. In this work, we propose to exploit objectness to partially alleviate the drifting problem in the class of online object trackers, and we verify the effectiveness of this idea with extensive experimental results. More specifically, a recently developed objectness measure is incorporated into the Incremental Learning for Visual Tracking (IVT) algorithm in a principled way. We also devise a strategy for reinitializing the training samples in the proposed approach to improve the robustness of online tracking. Experimental results show that using the objectness measure helps alleviate the tracker's drift to the background for certain challenging sequences.
Object Recognition in Videos Utilizing Hierarchical and Temporal Objectness with Deep Neural Networks
This dissertation develops a novel system for object recognition in videos. The input of the system is a set of unconstrained videos containing a known set of objects. The output is the location and category of each object in each frame across all videos. Initially, a shot boundary detection algorithm is applied to the videos to divide them into multiple sequences separated by the identified shot boundaries. Since each of these sequences still contains moderate content variation, we further use a cost optimization-based key frame extraction method to select key frames in each sequence and use these key frames to divide the videos into shorter sub-sequences with little content variation. Next, we learn object proposals on the first frame of each sub-sequence. Building upon state-of-the-art object detection algorithms, we develop a tree-based hierarchical model to improve object detection. Using the learned object proposals as the initial object positions in the first frame of each sub-sequence, we apply the SPOT tracker to track the object proposals and re-rank them using the proposed temporal objectness, removing unlikely objects to obtain object proposal tubes. Finally, we employ a deep Convolutional Neural Network (CNN) to classify these tubes. Experiments show that the proposed system significantly improves the object detection rate of the learned proposals compared with some state-of-the-art object detectors. Owing to this improvement in object detection, the proposed system also achieves higher mean average precision at the proposal classification stage than state-of-the-art methods.