3 research outputs found
Semantic Instance Meets Salient Object: Study on Video Semantic Salient Instance Segmentation
Focusing only on semantic instances that are salient in a scene benefits
robot navigation and self-driving cars more than looking at all objects in
the whole scene. This paper pushes the envelope on salient regions in a
video to decompose them into semantically meaningful components, namely,
semantic salient instances. We provide a baseline for the new task of video
semantic salient instance segmentation (VSSIS): the Semantic Instance -
Salient Object (SISO) framework. The SISO framework is simple yet efficient,
leveraging the advantages of two different segmentation tasks, i.e., semantic
instance segmentation and salient object segmentation, whose results are
eventually fused for the final output. In SISO, we introduce a sequential
fusion that examines overlapping pixels between semantic instances and
salient regions to extract non-overlapping instances one by one. We also
introduce a recurrent instance
propagation to refine the shapes and semantic meanings of instances, and an
identity tracking to maintain both the identity and the semantic meaning of
instances over the entire video. Experimental results demonstrate the
effectiveness of our SISO baseline, which can handle occlusions in videos. In
addition, to tackle the task of VSSIS, we augment the DAVIS-2017 benchmark
dataset by assigning semantic ground-truth for salient instance labels,
obtaining SEmantic Salient Instance Video (SESIV) dataset. Our SESIV dataset
consists of 84 high-quality video sequences with pixel-wise per-frame
ground-truth labels.
Comment: accepted in WACV 201
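The sequential fusion described above could be sketched roughly as follows. This is a minimal, hypothetical illustration only: the overlap threshold, the 0.5 saliency binarization, and the largest-instance-first ordering are all assumptions, not details taken from the paper.

```python
import numpy as np

def sequential_fusion(instance_masks, saliency_map, overlap_thresh=0.5):
    """Keep semantic instances that sufficiently overlap the salient
    region, claiming shared pixels one instance at a time so the kept
    instances are non-overlapping (threshold and ordering are assumed)."""
    salient = saliency_map > 0.5          # assumed binarization rule
    remaining = salient.copy()
    kept = []
    # Process larger instances first so overlaps are resolved one by one.
    for mask in sorted(instance_masks, key=lambda m: -m.sum()):
        overlap = np.logical_and(mask, remaining)
        if overlap.sum() / max(mask.sum(), 1) >= overlap_thresh:
            kept.append(overlap)          # non-overlapping by construction
            remaining &= ~overlap         # claimed pixels are removed
    return kept
```

Because each kept instance is intersected with the not-yet-claimed salient pixels, later instances can never re-use a pixel already assigned to an earlier one.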
Region-Based Multiscale Spatiotemporal Saliency for Video
Detecting salient objects from a video requires exploiting both spatial and
temporal knowledge included in the video. We propose a novel region-based
multiscale spatiotemporal saliency detection method for videos, where static
features and dynamic features computed from the low and middle levels are
combined together. Our method utilizes such combined features spatially over
each frame and, at the same time, temporally across frames using consistency
between consecutive frames. Saliency cues in our method are analyzed through a
multiscale segmentation model and fused across scale levels, enabling
efficient exploration of regions. An adaptive temporal window using motion
information is also developed to combine saliency values of consecutive frames
in order to keep temporal consistency across frames. Performance evaluation on
several popular benchmark datasets validates that our method outperforms
existing state-of-the-art methods.
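The adaptive temporal window idea might be sketched as below. This is only an assumed illustration: the rule that the window shrinks as motion magnitude grows, and plain averaging within the window, are stand-ins for whatever combination the paper actually uses.

```python
import numpy as np

def temporal_smooth(saliency_seq, motion_mag, max_window=5):
    """Average each frame's saliency map with its temporal neighbours,
    using a smaller window when motion is large so that fast-moving
    objects are not blurred away (window rule is an assumption)."""
    T = len(saliency_seq)
    out = []
    for t in range(T):
        # Fewer neighbour frames when motion at frame t is strong.
        w = max(1, int(round(max_window / (1.0 + motion_mag[t]))))
        lo, hi = max(0, t - w), min(T, t + w + 1)
        out.append(np.mean(saliency_seq[lo:hi], axis=0))
    return out
```

With zero motion the window is widest and saliency is smoothed over many frames; with large motion the window collapses to the immediate neighbours, preserving temporal consistency without over-smoothing.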
Video Salient Object Detection Using Spatiotemporal Deep Features
This paper presents a method for detecting salient objects in videos where
temporal information in addition to spatial information is fully taken into
account. Following recent reports on the advantage of deep features over
conventional hand-crafted features, we propose a new set of SpatioTemporal Deep
(STD) features that utilize local and global contexts over frames. We also
propose a new SpatioTemporal Conditional Random Field (STCRF) to compute saliency
from STD features. STCRF is our extension of CRF to the temporal domain and
describes the relationships among neighboring regions both in a frame and over
frames. STCRF leads to temporally consistent saliency maps over frames,
contributing to the accurate detection of salient objects' boundaries and noise
reduction during detection. Our proposed method first segments an input video
into multiple scales and then computes a saliency map at each scale level using
STD features with STCRF. The final saliency map is computed by fusing saliency
maps at different scale levels. Our experiments, using publicly available
benchmark datasets, confirm that the proposed method significantly outperforms
state-of-the-art methods. We also applied our saliency computation to the video
object segmentation task, showing that our method outperforms existing video
object segmentation methods.
Comment: accepted at TI
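The final cross-scale fusion step could be sketched as below. This is a hypothetical simplification: nearest-neighbour upsampling via `np.kron`, equal fusion weights, and the requirement that coarse shapes divide the finest shape evenly are all assumptions made for this sketch.

```python
import numpy as np

def fuse_scales(maps, weights=None):
    """Upsample coarser-scale saliency maps to the finest resolution
    (nearest-neighbour here for simplicity) and take a weighted
    average; the fusion rule itself is an assumption."""
    target = maps[0].shape                # finest scale comes first
    resized = []
    for m in maps:
        # Assumes each coarse shape divides the target shape evenly.
        ry, rx = target[0] // m.shape[0], target[1] // m.shape[1]
        resized.append(np.kron(m, np.ones((ry, rx))))
    if weights is None:
        weights = [1.0 / len(maps)] * len(maps)
    return sum(w * r for w, r in zip(weights, resized))
```

Any interpolation scheme (e.g. bilinear) and learned or saliency-confidence-based weights could replace the placeholders here without changing the overall structure.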