85 research outputs found

    Saliency-aware Stereoscopic Video Retargeting

    Full text link
    Stereo video retargeting aims to resize an image to a desired aspect ratio. The quality of retargeted videos can be significantly impacted by the stereo videos spatial, temporal, and disparity coherence, all of which can be impacted by the retargeting process. Due to the lack of a publicly accessible annotated dataset, there is little research on deep learning-based methods for stereo video retargeting. This paper proposes an unsupervised deep learning-based stereo video retargeting network. Our model first detects the salient objects and shifts and warps all objects such that it minimizes the distortion of the salient parts of the stereo frames. We use 1D convolution for shifting the salient objects and design a stereo video Transformer to assist the retargeting process. To train the network, we use the parallax attention mechanism to fuse the left and right views and feed the retargeted frames to a reconstruction module that reverses the retargeted frames to the input frames. Therefore, the network is trained in an unsupervised manner. Extensive qualitative and quantitative experiments and ablation studies on KITTI stereo 2012 and 2015 datasets demonstrate the efficiency of the proposed method over the existing state-of-the-art methods. The code is available at https://github.com/z65451/SVR/.Comment: 8 pages excluding references. CVPRW conferenc

    Saliency detection for stereoscopic images

    Get PDF
    International audienceSaliency detection techniques have been widely used in various 2D multimedia processing applications. Currently, the emerging applications of stereoscopic display require new saliency detection models for stereoscopic images. Different from saliency detection for 2D images, depth features have to be taken into account in saliency detection for stereoscopic images. In this paper, we propose a new stereoscopic saliency detection framework based on the feature contrast of color, intensity, texture, and depth. Four types of features including color, luminance, texture, and depth are extracted from DC-T coefficients to represent the energy for image patches. A Gaussian model of the spatial distance between image patches is adopted for the consideration of local and global contrast calculation. A new fusion method is designed to combine the feature maps for computing the final saliency map for stereoscopic images. Experimental results on a recent eye tracking database show the superior performance of the proposed method over other existing ones in saliency estimation for 3D images

    Analysis of Disparity Maps for Detecting Saliency in Stereoscopic Video

    Full text link
    We present a system for automatically detecting salient image regions in stereoscopic videos. This report extends our previous system and provides additional details about its implementation. Our proposed algorithm considers information based on three dimensions: salient colors in individual frames, salient information derived from camera and object motion, and depth saliency. These three components are dynamically combined into one final saliency map based on the reliability of the individual saliency detectors. Such a combination allows using more efficient algorithms even if the quality of one detector degrades. For example, we use a computationally efficient stereo correspondence algorithm that might cause noisy disparity maps for certain scenarios. In this case, however, a more reliable saliency detection algorithm such as the image saliency is preferred. To evaluate the quality of the saliency detection, we created modified versions of stereoscopic videos with the non-salient regions blurred. Having users rate the quality of these videos, the results show that most users do not detect the blurred regions and that the automatic saliency detection is very reliable

    Stereoscopic image quality assessment method based on binocular combination saliency model

    Get PDF
    The objective quality assessment of stereoscopic images plays an important role in three-dimensional (3D) technologies. In this paper, we propose an effective method to evaluate the quality of stereoscopic images that are afflicted by symmetric distortions. The major technical contribution of this paper is that the binocular combination behaviours and human 3D visual saliency characteristics are both considered. In particular, a new 3D saliency map is developed, which not only greatly reduces the computational complexity by avoiding calculation of the depth information, but also assigns appropriate weights to the image contents. Experimental results indicate that the proposed metric not only significantly outperforms conventional 2D quality metrics, but also achieves higher performance than the existing 3D quality assessment models
    corecore