Semi-Supervised Video Salient Object Detection Using Pseudo-Labels
Deep learning-based video salient object detection has recently achieved
great success with its performance significantly outperforming any other
unsupervised methods. However, existing data-driven approaches heavily rely on
a large quantity of pixel-wise annotated video frames to deliver such promising
results. In this paper, we address the semi-supervised video salient object
detection task using pseudo-labels. Specifically, we present an effective video
saliency detector that consists of a spatial refinement network and a
spatiotemporal module. Based on the same refinement network and motion
information in terms of optical flow, we further propose a novel method for
generating pixel-level pseudo-labels from sparsely annotated frames. By
utilizing the generated pseudo-labels together with a part of manual
annotations, our video saliency detector learns spatial and temporal cues for
both contrast inference and coherence enhancement, thus producing accurate
saliency maps. Experimental results demonstrate that our proposed
semi-supervised method even greatly outperforms all the state-of-the-art fully
supervised methods across three public benchmarks of VOS, DAVIS, and FBMS.
Comment: ICCV2019, code is available at
https://github.com/Kinpzz/RCRNet-Pytorc
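To make the role of optical flow in pseudo-label generation more concrete, here is a minimal sketch of carrying a sparse annotation to a neighbouring frame. It is not the authors' method: the abstract combines flow with a refinement network, while this sketch only forward-warps a binary mask. The function name `propagate_mask`, the list-of-rows mask layout, and the per-pixel `(dx, dy)` flow format are all assumptions made for illustration.

```python
def propagate_mask(mask, flow):
    """Forward-warp a binary mask along a dense flow field (hypothetical helper).

    mask: list of rows holding 0/1 labels for the annotated frame.
    flow: same shape; flow[y][x] is the (dx, dy) displacement to the next frame.
    """
    height, width = len(mask), len(mask[0])
    warped = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            dx, dy = flow[y][x]
            # Nearest-pixel warping: round the displacement to whole pixels.
            nx, ny = x + round(dx), y + round(dy)
            # Labels that leave the frame are dropped.
            if 0 <= nx < width and 0 <= ny < height:
                warped[ny][nx] = 1
    return warped


# Toy example: a two-pixel object moving one pixel to the right.
annotated = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
flow = [[(1, 0)] * 4 for _ in range(3)]
pseudo_label = propagate_mask(annotated, flow)
```

A warped mask like this is only a coarse pseudo-label; in the setting the abstract describes, a learned refinement step would presumably clean it up before it is used for training.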
Spatiotemporal Saliency Detection: State of Art
Saliency detection has become a prominent research topic in recent years, and many techniques have been proposed for it. This paper reviews saliency detection techniques published from 2000 to 2015, covering almost every technique from that period. Each method is explained briefly, including its advantages and disadvantages. The techniques are compared in a table that lists the authors' names, paper name, year, techniques, algorithms, and challenges. Acceptance rates and accuracy levels are also compared.
Fine-grained action recognition by motion saliency and mid-level patches
Effective extraction of human body parts and operated objects participating in action is the key issue in fine-grained action recognition. However, most existing methods require intensive manual annotation to train the detectors of these interaction components. In this paper, we represent videos by mid-level patches to avoid the manual annotation, where each patch corresponds to an action-related interaction component. In order to capture mid-level patches more accurately and quickly, candidate motion regions are extracted by motion saliency. Firstly, the motion regions containing interaction components are segmented by a threshold adaptively calculated according to the saliency histogram of the motion saliency map. Secondly, we introduce a mid-level patch mining algorithm for interaction component detection, with object proposal generation and mid-level patch detection. The object proposal generation algorithm, inspired by the idea of the Huffman algorithm, is used to obtain multi-granularity object proposals. Based on these object proposals, the mid-level patch detectors are trained by K-means clustering and SVM. Finally, we build a fine-grained action recognition model using a graph structure to describe relationships between the mid-level patches. To recognize actions, the proposed model calculates the appearance and motion features of mid-level patches and the binary motion cooperation relationships between adjacent patches in the graph. Extensive experiments on the MPII cooking database demonstrate that the proposed method achieves better results on fine-grained action recognition.
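The abstract does not state how the adaptive threshold is derived from the saliency histogram, so the sketch below substitutes Otsu's method, a standard histogram-based rule that picks the threshold maximizing between-class variance. The function names `otsu_threshold` and `segment`, and the assumption that saliency values are integers in 0..255, are illustrative and not taken from the paper.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance (Otsu).

    values: flat iterable of integer saliency values in [0, levels).
    Pixels with value > t are treated as the salient (motion) class.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of their values
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no background pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # no foreground pixels left
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def segment(saliency_map, threshold):
    """Binarize a 2-D saliency map (list of rows) with the given threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in saliency_map]


# Toy motion saliency map: a bright moving region on a dark background.
saliency_map = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
]
t = otsu_threshold([v for row in saliency_map for v in row])
motion_mask = segment(saliency_map, t)
```

Because the threshold comes from each map's own histogram, it adapts to videos with different overall motion strength, which is the property the abstract relies on.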