21 research outputs found

Deep Mean-Shift Priors for Image Restoration

Authors: Bigdeli Siavash Arjomand; Favaro Paolo; Jin Meiguang; Zwicker Matthias
Publication date: 01/01/2017

In this paper, we introduce a natural image prior that directly represents a Gaussian-smoothed version of the natural image distribution. We include our prior in a formulation of image restoration as a Bayes estimator that also allows us to solve noise-blind image restoration problems. We show that the gradient of our prior corresponds to the mean-shift vector on the natural image distribution. In addition, we learn the mean-shift vector field using denoising autoencoders, and use it in a gradient descent approach to perform Bayes risk minimization. We demonstrate competitive results for noise-blind deblurring, super-resolution, and demosaicing.

Comment: NIPS 201

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Bern Open Repository and Information System (BORIS)
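
The Deep Mean-Shift Priors abstract above states that the gradient of the Gaussian-smoothed prior corresponds to the mean-shift vector. The following is a minimal 1D sketch of that relation, assuming a small set of scalar samples stands in for the natural image distribution and computing the mean-shift vector in closed form rather than with a learned denoising autoencoder. The function names and the sample data are hypothetical and are not taken from the paper.

```python
import math

def smoothed_density(x, samples, sigma):
    """Gaussian-smoothed empirical density p_sigma(x) in 1D."""
    norm = 1.0 / (len(samples) * sigma * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / sigma) ** 2) for s in samples)

def mean_shift(x, samples, sigma):
    """Mean-shift vector: kernel-weighted mean of the samples minus x."""
    weights = [math.exp(-0.5 * ((x - s) / sigma) ** 2) for s in samples]
    weighted_mean = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
    return weighted_mean - x

def grad_log_density(x, samples, sigma, eps=1e-5):
    """Central finite-difference estimate of d/dx log p_sigma(x)."""
    upper = math.log(smoothed_density(x + eps, samples, sigma))
    lower = math.log(smoothed_density(x - eps, samples, sigma))
    return (upper - lower) / (2.0 * eps)

def ascend(x, samples, sigma, steps=100):
    """Repeatedly move x by its mean-shift vector.

    Each step equals gradient ascent on log p_sigma with step size sigma**2.
    """
    for _ in range(steps):
        x += mean_shift(x, samples, sigma)
    return x

# Hypothetical data: two clusters of scalar "images".
samples = [-2.2, -2.0, -1.8, 2.9, 3.0, 3.1]
sigma = 0.5
x0 = 0.7
shift = mean_shift(x0, samples, sigma)
scaled_grad = sigma ** 2 * grad_log_density(x0, samples, sigma)
mode = ascend(2.0, samples, sigma)
```

For a Gaussian kernel, `shift` and `scaled_grad` agree up to finite-difference error, which is the identity the abstract refers to: the mean-shift vector equals sigma squared times the gradient of the log of the smoothed density. Iterating the shift from a starting point of 2.0 moves toward the nearby cluster around 3.0.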

Masked Vision-Language Transformers for Scene Text Recognition

Authors: Peng Ying; Qi Weigang; Wu Jie; Zhang Jian; Zhang Shengming
Publication date: 09/11/2022

Scene text recognition (STR) enables computers to recognize and read the text in various real-world scenes. Recent STR models benefit from taking linguistic information into consideration in addition to visual cues. We propose novel Masked Vision-Language Transformers (MVLT) to capture both the explicit and the implicit linguistic information. Our encoder is a Vision Transformer, and our decoder is a multi-modal Transformer. MVLT is trained in two stages: in the first stage, we design an STR-tailored pretraining method based on a masking strategy; in the second stage, we fine-tune our model and adopt an iterative correction method to improve the performance. MVLT attains superior results compared to state-of-the-art STR models on several benchmarks. Our code and model are available at https://github.com/onealwj/MVLT.

Comment: The paper is accepted by the 33rd British Machine Vision Conference (BMVC 2022)

arXiv.org e-Print Archive

CPO: Change Robust Panorama to Point Cloud Localization

Authors: Choi Changwoon; Jang Hojun; Kim Junho; Kim Young Min
Publication date: 12/07/2022

We present CPO, a fast and robust algorithm that localizes a 2D panorama with respect to a 3D point cloud of a scene possibly containing changes. To robustly handle scene changes, our approach deviates from conventional feature point matching, and focuses on the spatial context provided from panorama images. Specifically, we propose efficient color histogram generation and subsequent robust localization using score maps. By utilizing the unique equivariance of spherical projections, we propose very fast color histogram generation for a large number of camera poses without explicitly rendering images for all candidate poses. We accumulate the regional consistency of the panorama and point cloud as 2D/3D score maps, and use them to weigh the input color values to further increase robustness. The weighted color distribution quickly finds good initial poses and achieves stable convergence for gradient-based optimization. CPO is lightweight and achieves effective localization in all tested scenarios, showing stable performance despite scene changes, repetitive structures, or featureless regions, which are typical challenges for visual localization with perspective cameras.

Comment: Accepted to ECCV 202

arXiv.org e-Print Archive

RELLISUR: A Real Low-Light Image Super-Resolution Dataset

Authors: Aakerberg Andreas; Moeslund Thomas B.; Nasrollahi Kamal
Publication date: 20/08/2021

The RELLISUR dataset contains real low-light low-resolution images paired with normal-light high-resolution reference image counterparts. This dataset aims to fill the gap between low-light image enhancement and low-resolution image enhancement (Super-Resolution (SR)), which are currently only addressed separately in the literature, even though the visibility of real-world images is often limited by both low light and low resolution. The dataset contains 12750 paired images of different resolutions and degrees of low-light illumination, to facilitate learning of deep-learning-based models that can perform a direct mapping from degraded images with low visibility to high-quality, detail-rich images of high resolution.

VBN

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY