Cross-View Hierarchy Network for Stereo Image Super-Resolution
Stereo image super-resolution aims to reconstruct high-quality high-resolution
stereo image pairs by exploiting complementary information across views. To
attain superior performance, many methods prioritize designing complex
modules to fuse similar information across views while overlooking the
importance of intra-view information for high-resolution reconstruction,
which leads to incorrect textures in the recovered images. To address this
issue, we explore the interdependencies between various hierarchies within
each view and propose a novel method, named Cross-View-Hierarchy Network for
Stereo Image Super-Resolution (CVHSSR). Specifically, we design a
cross-hierarchy information mining block (CHIMB) that leverages channel
attention and large kernel convolution attention to extract both global and
local features within each view, enabling the efficient restoration of
accurate texture details. Additionally, a cross-view interaction module (CVIM)
is proposed to fuse similar features from different views by utilizing
cross-view attention mechanisms, effectively adapting to the binocular scene.
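As an illustrative sketch of the cross-view attention idea behind a module like CVIM (this is not the paper's implementation; the function names and the simplified NumPy formulation are assumptions), matching rows of the two rectified views can attend to each other along the horizontal epipolar axis and the result fused residually:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(feat_left, feat_right):
    """Fuse right-view features into the left view with attention computed
    along the horizontal (epipolar) axis of rectified stereo features.
    feat_left, feat_right: (H, W, C) feature maps. Illustrative only."""
    H, W, C = feat_left.shape
    fused = np.empty_like(feat_left)
    for y in range(H):
        q = feat_left[y]                                 # (W, C) queries, left view
        k = v = feat_right[y]                            # (W, C) keys/values, right view
        attn = softmax(q @ k.T / np.sqrt(C), axis=-1)    # (W, W) row-wise attention
        fused[y] = feat_left[y] + attn @ v               # residual cross-view fusion
    return fused
```

A real module would use learned query/key/value projections and operate on deep features; the sketch only shows why attention restricted to matching rows suits the binocular geometry.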
Extensive experiments demonstrate the effectiveness of our method: CVHSSR
outperforms other state-of-the-art stereo image super-resolution methods
while using fewer parameters. The source code and pre-trained models are
available at https://github.com/AlexZou14/CVHSSR.
Comment: 10 pages, 7 figures, CVPRW, NTIRE202
A New Dataset and Transformer for Stereoscopic Video Super-Resolution
Stereo video super-resolution (SVSR) aims to enhance the spatial resolution
of a low-resolution stereo video by reconstructing its high-resolution counterpart. The
key challenges in SVSR are preserving the stereo-consistency and
temporal-consistency, without which viewers may experience 3D fatigue. There
are several notable works on stereoscopic image super-resolution, but there is
little research on stereo video super-resolution. In this paper, we propose a
novel Transformer-based model for SVSR, namely Trans-SVSR. Trans-SVSR comprises
two key novel components: a spatio-temporal convolutional self-attention layer
and an optical flow-based feed-forward layer that discovers the correlation
across different video frames and aligns the features. The parallax attention
mechanism (PAM), which uses cross-view information to account for
significant disparities, is employed to fuse the stereo views. Due to the lack of a
benchmark dataset suitable for the SVSR task, we collected a new stereoscopic
video dataset, SVSR-Set, containing 71 full high-definition (HD) stereo videos
captured using a professional stereo camera. Extensive experiments on the
collected dataset, along with two other datasets, demonstrate that the
Trans-SVSR can achieve competitive performance compared to the state-of-the-art
methods. Project code and additional results are available at
https://github.com/H-deep/Trans-SVSR/
Comment: Conference on Computer Vision and Pattern Recognition (CVPR 2022
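To give a flavor of joint space-time attention (a minimal sketch, not Trans-SVSR's actual layer, which is convolutional and flow-aligned; the function name and NumPy formulation are assumptions), one can flatten frames and spatial positions into a single token axis so that attention relates positions across frames as well as within a frame:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatio_temporal_self_attention(clip):
    """clip: (T, N, C) — T frames, N spatial tokens per frame, C channels.
    Flattening time and space into one token axis lets every token attend
    to every other token in the clip. Illustrative only."""
    T, N, C = clip.shape
    x = clip.reshape(T * N, C)                       # (T*N, C) joint tokens
    attn = softmax(x @ x.T / np.sqrt(C), axis=-1)    # (T*N, T*N) attention
    return (attn @ x).reshape(T, N, C)
```

This is why such a layer can exploit temporal redundancy: tokens from neighboring frames contribute directly to each reconstructed position, which full-reference per-frame models cannot do.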
Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal
Under stereo settings, the performance of image JPEG artifacts removal can be
further improved by exploiting the additional information provided by a second
view. However, incorporating this information for stereo image JPEG artifacts
removal is a huge challenge, since the existing compression artifacts make
pixel-level view alignment difficult. In this paper, we propose a novel
parallax transformer network (PTNet) to integrate the information from stereo
image pairs for stereo image JPEG artifacts removal. Specifically, a
well-designed symmetric bi-directional parallax transformer module is proposed
to match features with similar textures between different views instead of
pixel-level view alignment. Due to the issues of occlusions and boundaries, a
confidence-based cross-view fusion module is proposed to achieve better feature
fusion for both views, where the cross-view features are weighted with
confidence maps. In particular, we adopt a coarse-to-fine design for the
cross-view interaction, leading to better performance. Comprehensive
experimental results demonstrate that our PTNet effectively removes
compression artifacts and outperforms other state-of-the-art methods.
Comment: 11 pages, 12 figures, ACM MM202
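The confidence-weighted fusion idea can be sketched as follows (a minimal illustration, not PTNet's implementation; the function name and the simple convex-blend formulation are assumptions): each view blends its own features with features warped from the other view, gated per pixel by a confidence map so that occluded or boundary pixels fall back to the view's own features:

```python
import numpy as np

def confidence_fusion(feat_self, feat_warped, conf):
    """Blend a view's own features with features warped from the other view,
    gated by a per-pixel confidence map in [0, 1]. Low confidence (occlusion,
    boundary) keeps feat_self; high confidence trusts the other view.
    feat_self, feat_warped: (H, W, C); conf: (H, W). Illustrative only."""
    c = np.clip(conf, 0.0, 1.0)[..., None]          # (H, W, 1), broadcast over C
    return (1.0 - c) * feat_self + c * feat_warped  # per-pixel convex blend
```

In practice the confidence maps would themselves be predicted by the network rather than supplied externally.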
Geometry based Three-Dimensional Image Processing Method for Electronic Cluster Eye
In recent years, much attention has been paid to the electronic cluster eye (eCley), a new type of artificial compound eye, because of its small size, wide field of view (FOV) and sensitivity to moving objects. An eCley is composed of a number of optical channels organized as an array, each spanning a small, fixed FOV. To obtain a complete image with the full FOV, the images from all optical channels must be fused together. The parallax between non-parallel neighboring optical channels in an eCley can lead to blurring in the reconstructed image and incorrectly estimated depth. To solve this problem, this paper proposes a geometry-based three-dimensional image processing method (G3D) for eCley to obtain a complete focused image and a dense depth map. In G3D, we derive the geometric relationship of the optical channels in eCley to obtain the mathematical relation between parallax and depth among non-parallel neighboring optical channels. Based on this relationship, all of the optical channels are used to estimate the depth map and reconstruct a focused image. Subsequently, by using an edge-aware interpolation method, we further obtain a sharply focused image and a refined depth map. The effectiveness of the proposed method is verified by the experimental results.
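For intuition about the parallax-depth relation the method builds on (the paper derives a more general form for non-parallel channels; the sketch below only covers the standard parallel-baseline triangulation case, and the function name and units are assumptions), depth follows from disparity as z = f·B/d:

```python
import numpy as np

def depth_from_parallax(disparity_px, focal_px, baseline_mm):
    """Classic parallel-channel triangulation: depth = focal * baseline / disparity.
    disparity_px: parallax in pixels (scalar or array); focal_px: focal length
    in pixels; baseline_mm: channel spacing in mm. Zero disparity maps to
    infinite depth. Illustrative only — eCley's non-parallel channels need
    the paper's generalized geometric relation."""
    d = np.asarray(disparity_px, dtype=float)
    return np.where(d > 0,
                    focal_px * baseline_mm / np.maximum(d, 1e-9),
                    np.inf)
```

Inverting this relation per channel pair is what lets all channels jointly vote for a dense depth map before the edge-aware refinement step.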
Perceptual modelling for 2D and 3D
Deliverable D1.1 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170); specifically, it corresponds to deliverable D1.1 of the project.