Saliency-aware Stereoscopic Video Retargeting
Stereo video retargeting aims to resize a stereo video to a desired aspect ratio.
The quality of a retargeted video depends heavily on the spatial, temporal, and
disparity coherence of the stereo video, all of which can be degraded
by the retargeting process. Due to the lack of a publicly accessible annotated
dataset, there is little research on deep learning-based methods for stereo
video retargeting. This paper proposes an unsupervised deep learning-based
stereo video retargeting network. Our model first detects the salient objects,
then shifts and warps all objects so as to minimize the distortion of the
salient parts of the stereo frames. We use 1D convolution for shifting the
salient objects and design a stereo video Transformer to assist the retargeting
process. To train the network, we use the parallax attention mechanism to fuse
the left and right views and feed the retargeted frames to a reconstruction
module that reverses the retargeted frames to the input frames. Because the
reconstruction target is the input itself, the network is trained in an
unsupervised manner. Extensive qualitative and quantitative experiments and
ablation studies on the KITTI stereo 2012 and 2015 datasets demonstrate the
efficiency of the proposed method over the existing
state-of-the-art methods. The code is available at
https://github.com/z65451/SVR/.
Comment: 8 pages excluding references. CVPRW conferenc
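The reconstruction-based training signal described in this abstract can be illustrated with a minimal sketch. It is not the authors' network: the learned retargeting and reconstruction modules are replaced here by a plain nearest-neighbour horizontal resampler, and the names `resample_width`, `l1_loss`, and `cycle_loss` are hypothetical. The sketch only shows why no annotations are needed: the loss compares the reconstructed frames with the input frames themselves.

```python
from typing import List

Frame = List[List[float]]  # one grayscale view, indexed as [row][col]


def resample_width(frame: Frame, new_width: int) -> Frame:
    """Nearest-neighbour horizontal resampling.

    Assumption: this stands in for a learned retargeting or
    reconstruction module; it is not the method from the paper.
    """
    old_width = len(frame[0])
    return [
        [row[(x * old_width) // new_width] for x in range(new_width)]
        for row in frame
    ]


def l1_loss(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two frames of equal size."""
    total = sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def cycle_loss(left: Frame, right: Frame, target_width: int) -> float:
    """Retarget each view, map it back, and compare with the input.

    The supervision target is the input frame itself, so no annotated
    retargeted frames are required.
    """
    original_width = len(left[0])
    loss = 0.0
    for view in (left, right):
        retargeted = resample_width(view, target_width)
        reconstructed = resample_width(retargeted, original_width)
        loss += l1_loss(view, reconstructed)
    return loss / 2


if __name__ == "__main__":
    view = [[0.0, 1.0, 2.0, 3.0]]
    print(cycle_loss(view, view, target_width=2))
```

In a real system both mappings would be trainable and the loss would be minimized by gradient descent; the fixed resampler here only makes the data flow concrete.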
RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images
Salient object detection (SOD) for optical remote sensing images (RSIs) aims
at locating and extracting visually distinctive objects/regions from the
optical RSIs. Although some saliency models have been proposed to solve the
intrinsic problems of optical RSIs (such as complex backgrounds and
scale-variant objects), their accuracy and completeness are still
unsatisfactory. To this end, we propose
a relational reasoning network with parallel multi-scale attention for SOD in
optical RSIs in this paper. The relational reasoning module that integrates the
spatial and the channel dimensions is designed to infer the semantic
relationship by utilizing high-level encoder features, thereby promoting the
generation of more complete detection results. The parallel multi-scale
attention module is proposed to effectively restore the detail information and
address the scale variation of salient objects by using the low-level features
refined by multi-scale attention. Extensive experiments on two datasets
demonstrate that our proposed RRNet outperforms the existing state-of-the-art
SOD competitors both qualitatively and quantitatively.
Comment: 11 pages, 9 figures, accepted by IEEE Transactions on Geoscience and
Remote Sensing 2021, project: https://rmcong.github.io/proj_RRNet.htm
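The idea of weighting features along both the channel and the spatial dimensions, which this abstract mentions, can be sketched generically. The code below is an assumption-laden illustration, not RRNet's relational reasoning module: it applies a simple sigmoid gate per channel and per spatial position to a tiny feature map, and the name `channel_spatial_gate` is hypothetical.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_spatial_gate(features: FeatureMap) -> FeatureMap:
    """Reweight a feature map along channel and spatial dimensions.

    Assumption: a generic gating scheme with no learned parameters,
    used only to show how the two kinds of weights combine.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])

    # Channel gate: one weight per channel, from its global average.
    channel_w = [
        sigmoid(sum(sum(row) for row in ch) / (height * width))
        for ch in features
    ]

    # Spatial gate: one weight per position, from the mean across channels.
    spatial_w = [
        [
            sigmoid(sum(features[c][y][x] for c in range(channels)) / channels)
            for x in range(width)
        ]
        for y in range(height)
    ]

    # Each activation is scaled by both its channel and its position weight.
    return [
        [
            [features[c][y][x] * channel_w[c] * spatial_w[y][x]
             for x in range(width)]
            for y in range(height)
        ]
        for c in range(channels)
    ]


if __name__ == "__main__":
    fmap = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
    print(channel_spatial_gate(fmap))
```

The output keeps the input shape, so a gate of this kind can be dropped between encoder and decoder stages; a trained model would compute the weights with learned layers rather than plain averages.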