Cross-View Hierarchy Network for Stereo Image Super-Resolution
Stereo image super-resolution aims to reconstruct high-quality high-resolution
stereo image pairs by exploiting complementary information across views. To
attain superior performance, many methods prioritize designing complex
modules to fuse similar information across views while overlooking the
importance of intra-view information for high-resolution reconstruction,
which leads to incorrect textures in the recovered images. To address this
issue, we explore the interdependencies between various hierarchies within
each view and propose a novel method, named Cross-View-Hierarchy Network for
Stereo Image Super-Resolution (CVHSSR). Specifically, we design a
cross-hierarchy information mining block (CHIMB) that leverages channel
attention and large kernel convolution attention to extract both global and
local features within each view, enabling the efficient restoration of
accurate texture details. Additionally, a cross-view interaction module (CVIM)
is proposed to fuse similar features from different views by utilizing
cross-view attention mechanisms, effectively adapting to the binocular scene.
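As an illustrative sketch of the cross-view attention idea behind a module like CVIM (this is not the paper's implementation; the function names and the simplified NumPy formulation are assumptions), matching rows of the two rectified views can attend to each other along the horizontal epipolar axis and the result fused residually:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(feat_left, feat_right):
    """Fuse right-view features into the left view with attention computed
    along the horizontal (epipolar) axis of rectified stereo features.
    feat_left, feat_right: (H, W, C) feature maps. Illustrative only."""
    H, W, C = feat_left.shape
    fused = np.empty_like(feat_left)
    for y in range(H):
        q = feat_left[y]                                 # (W, C) queries, left view
        k = v = feat_right[y]                            # (W, C) keys/values, right view
        attn = softmax(q @ k.T / np.sqrt(C), axis=-1)    # (W, W) row-wise attention
        fused[y] = feat_left[y] + attn @ v               # residual cross-view fusion
    return fused
```

A real module would use learned query/key/value projections and operate on deep features; the sketch only shows why attention restricted to matching rows suits the binocular geometry.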
Extensive experiments demonstrate the effectiveness of our method: CVHSSR
outperforms other state-of-the-art stereo image super-resolution methods
while using fewer parameters. The source code and pre-trained models are
available at https://github.com/AlexZou14/CVHSSR.
Comment: 10 pages, 7 figures, CVPRW, NTIRE202
A New Dataset and Transformer for Stereoscopic Video Super-Resolution
Stereo video super-resolution (SVSR) aims to enhance the spatial resolution
of a low-resolution stereo video by reconstructing its high-resolution counterpart. The
key challenges in SVSR are preserving the stereo-consistency and
temporal-consistency, without which viewers may experience 3D fatigue. There
are several notable works on stereoscopic image super-resolution, but there is
little research on stereo video super-resolution. In this paper, we propose a
novel Transformer-based model for SVSR, namely Trans-SVSR. Trans-SVSR comprises
two key novel components: a spatio-temporal convolutional self-attention layer
and an optical flow-based feed-forward layer that discovers the correlation
across different video frames and aligns the features. The parallax attention
mechanism (PAM), which uses cross-view information to account for
significant disparities, is employed to fuse the stereo views. Due to the lack of a
benchmark dataset suitable for the SVSR task, we collected a new stereoscopic
video dataset, SVSR-Set, containing 71 full high-definition (HD) stereo videos
captured using a professional stereo camera. Extensive experiments on the
collected dataset, along with two other datasets, demonstrate that the
Trans-SVSR can achieve competitive performance compared to the state-of-the-art
methods. Project code and additional results are available at
https://github.com/H-deep/Trans-SVSR/
Comment: Conference on Computer Vision and Pattern Recognition (CVPR 2022
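To give a flavor of joint space-time attention (a minimal sketch, not Trans-SVSR's actual layer, which is convolutional and flow-aligned; the function name and NumPy formulation are assumptions), one can flatten frames and spatial positions into a single token axis so that attention relates positions across frames as well as within a frame:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatio_temporal_self_attention(clip):
    """clip: (T, N, C) — T frames, N spatial tokens per frame, C channels.
    Flattening time and space into one token axis lets every token attend
    to every other token in the clip. Illustrative only."""
    T, N, C = clip.shape
    x = clip.reshape(T * N, C)                       # (T*N, C) joint tokens
    attn = softmax(x @ x.T / np.sqrt(C), axis=-1)    # (T*N, T*N) attention
    return (attn @ x).reshape(T, N, C)
```

This is why such a layer can exploit temporal redundancy: tokens from neighboring frames contribute directly to each reconstructed position, which full-reference per-frame models cannot do.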
Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal
Under stereo settings, the performance of image JPEG artifacts removal can be
further improved by exploiting the additional information provided by a second
view. However, incorporating this information for stereo image JPEG artifacts
removal is a huge challenge, since the existing compression artifacts make
pixel-level view alignment difficult. In this paper, we propose a novel
parallax transformer network (PTNet) to integrate the information from stereo
image pairs for stereo image JPEG artifacts removal. Specifically, a
well-designed symmetric bi-directional parallax transformer module is proposed
to match features with similar textures between different views instead of
pixel-level view alignment. Due to the issues of occlusions and boundaries, a
confidence-based cross-view fusion module is proposed to achieve better feature
fusion for both views, where the cross-view features are weighted with
confidence maps. In particular, we adopt a coarse-to-fine design for the
cross-view interaction, leading to better performance. Comprehensive
experimental results demonstrate that our PTNet effectively removes
compression artifacts and outperforms other state-of-the-art methods.
Comment: 11 pages, 12 figures, ACM MM202
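The confidence-weighted fusion idea can be sketched as follows (a minimal illustration, not PTNet's implementation; the function name and the simple convex-blend formulation are assumptions): each view blends its own features with features warped from the other view, gated per pixel by a confidence map so that occluded or boundary pixels fall back to the view's own features:

```python
import numpy as np

def confidence_fusion(feat_self, feat_warped, conf):
    """Blend a view's own features with features warped from the other view,
    gated by a per-pixel confidence map in [0, 1]. Low confidence (occlusion,
    boundary) keeps feat_self; high confidence trusts the other view.
    feat_self, feat_warped: (H, W, C); conf: (H, W). Illustrative only."""
    c = np.clip(conf, 0.0, 1.0)[..., None]          # (H, W, 1), broadcast over C
    return (1.0 - c) * feat_self + c * feat_warped  # per-pixel convex blend
```

In practice the confidence maps would themselves be predicted by the network rather than supplied externally.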
Geometry based Three-Dimensional Image Processing Method for Electronic Cluster Eye
In recent years, much attention has been paid to the electronic cluster eye (eCley), a new type of artificial compound eye, because of its small size, wide field of view (FOV) and sensitivity to moving objects. An eCley is composed of a number of optical channels organized as an array, each spanning a small, fixed FOV. To obtain a complete image with the full FOV, the images from all optical channels must be fused together. The parallax between non-parallel neighboring optical channels in an eCley can lead to blurring in the reconstructed image and incorrectly estimated depth. To solve this problem, this paper proposes a geometry-based three-dimensional image processing method (G3D) for eCley to obtain a complete focused image and a dense depth map. In G3D, we derive the geometric relationship of the optical channels in eCley to obtain the mathematical relation between parallax and depth among non-parallel neighboring optical channels. Based on this relationship, all of the optical channels are used to estimate the depth map and reconstruct a focused image. Subsequently, by using an edge-aware interpolation method, we further obtain a sharply focused image and a refined depth map. The effectiveness of the proposed method is verified by the experimental results.
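For intuition about the parallax-depth relation the method builds on (the paper derives a more general form for non-parallel channels; the sketch below only covers the standard parallel-baseline triangulation case, and the function name and units are assumptions), depth follows from disparity as z = f·B/d:

```python
import numpy as np

def depth_from_parallax(disparity_px, focal_px, baseline_mm):
    """Classic parallel-channel triangulation: depth = focal * baseline / disparity.
    disparity_px: parallax in pixels (scalar or array); focal_px: focal length
    in pixels; baseline_mm: channel spacing in mm. Zero disparity maps to
    infinite depth. Illustrative only — eCley's non-parallel channels need
    the paper's generalized geometric relation."""
    d = np.asarray(disparity_px, dtype=float)
    return np.where(d > 0,
                    focal_px * baseline_mm / np.maximum(d, 1e-9),
                    np.inf)
```

Inverting this relation per channel pair is what lets all channels jointly vote for a dense depth map before the edge-aware refinement step.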
Perceptual modelling for 2D and 3D
Deliverable D1.1 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170); specifically, it corresponds to deliverable D1.1 of the project.