Deep Convolutional Neural Networks for Estimating Lens Distortion Parameters
In this paper we present a convolutional neural network (CNN) to predict multiple lens distortion parameters from a single input image. Unlike other methods, our network is suitable for producing high-resolution output, as it directly estimates the parameters from the image, which can then be used to rectify even very high-resolution input images. As our method is fully automatic, it is suitable for both casual creatives and professional artists. Our results show that our network accurately predicts the lens distortion parameters of high-resolution images and corrects the distortions satisfactorily.
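A minimal sketch of how such estimated parameters could be used downstream to rectify an image, assuming a simple two-coefficient radial polynomial model. The names `k1` and `k2` are stand-ins for whatever parameterization a network like this actually predicts, and nearest-neighbour sampling is used only to keep the example dependency-free:

```python
import numpy as np

def rectify_radial(img, k1, k2):
    """Undistort an image given radial polynomial coefficients (k1, k2).

    Backward mapping: for every pixel of the rectified output, compute
    where it lies in the distorted input under
    r_d = r * (1 + k1*r^2 + k2*r^4), with coordinates normalized to
    [-1, 1], then sample with nearest neighbour.
    """
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # normalize pixel coordinates to [-1, 1] around the image centre
    xn = 2 * xs / (w - 1) - 1
    yn = 2 * ys / (h - 1) - 1
    r2 = xn ** 2 + yn ** 2
    scale = 1 + k1 * r2 + k2 * r2 ** 2
    xd, yd = xn * scale, yn * scale
    # back to pixel coordinates, clamped to the valid range
    xi = np.clip(((xd + 1) * (w - 1) / 2).round().astype(int), 0, w - 1)
    yi = np.clip(((yd + 1) * (h - 1) / 2).round().astype(int), 0, h - 1)
    return img[yi, xi]
```

Because the mapping is computed per output pixel, the same estimated coefficients can rectify an input of any resolution, which is the property the abstract emphasizes.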
RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning
The wide-angle lens shows appealing applications in VR technologies, but it
introduces severe radial distortion into its captured image. To recover the
realistic scene, previous works devote to rectifying the content of the
wide-angle image. However, such a rectification solution inevitably distorts
the image boundary, which changes related geometric distributions and misleads
the current vision perception models. In this work, we explore constructing a
win-win representation on both content and boundary by contributing a new
learning model, i.e., Rectangling Rectification Network (RecRecNet). In
particular, we propose a thin-plate spline (TPS) module to formulate the
non-linear and non-rigid transformation for rectangling images. By learning the
control points on the rectified image, our model can flexibly warp the source
structure to the target domain and achieves an end-to-end unsupervised
deformation. To relieve the complexity of structure approximation, we then
inspire our RecRecNet to learn the gradual deformation rules with a DoF (Degree
of Freedom)-based curriculum learning. By increasing the DoF in each curriculum
stage, namely, from similarity transformation (4-DoF) to homography
transformation (8-DoF), the network is capable of investigating more detailed
deformations, offering fast convergence on the final rectangling task.
Experiments show the superiority of our solution over the compared methods on
both quantitative and qualitative evaluations. The code and dataset are
available at https://github.com/KangLiao929/RecRecNet.
Comment: Accepted to ICCV 2023
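The DoF-based curriculum above can be illustrated with plain matrix transforms. The sketch below is an assumption about the transform families named in the abstract, not RecRecNet's code: a 4-DoF similarity (scale, rotation, translation) is strictly extended to an 8-DoF homography by adding perspective entries, which is why training can start with the simpler family and grow:

```python
import numpy as np

def similarity_matrix(s, theta, tx, ty):
    """4-DoF similarity transform: isotropic scale, rotation, translation."""
    c, si = s * np.cos(theta), s * np.sin(theta)
    return np.array([[c, -si, tx],
                     [si,  c, ty],
                     [0.0, 0.0, 1.0]])

def apply_homography(H, pts):
    """Apply a 3x3 projective transform to an (N, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]

# an 8-DoF homography: the similarity plus two perspective entries,
# the extra degrees of freedom added in the later curriculum stage
H8 = similarity_matrix(1.0, 0.0, 0.0, 0.0)
H8[2, 0], H8[2, 1] = 1e-3, -2e-3
```

A TPS warp generalizes this further with per-control-point offsets, which is what lets the network model the non-rigid boundary deformation.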
Let's Enhance: A Deep Learning Approach to Extreme Deblurring of Text Images
This work presents a novel deep-learning-based pipeline for the inverse
problem of image deblurring, leveraging augmentation and pre-training with
synthetic data. Our results build on our winning submission to the recent
Helsinki Deblur Challenge 2021, whose goal was to explore the limits of
state-of-the-art deblurring algorithms in a real-world data setting. The task
of the challenge was to deblur out-of-focus images of random text, thereby
maximizing, in a downstream task, an optical-character-recognition-based score
function. A key step of our solution is the data-driven estimation of the
physical forward model describing the blur process. This enables a stream of
synthetic data, generating pairs of ground-truth and blurry images on-the-fly,
which is used for an extensive augmentation of the small amount of challenge
data provided. The actual deblurring pipeline consists of an approximate
inversion of the radial lens distortion (determined by the estimated forward
model) and a U-Net architecture, which is trained end-to-end. Our algorithm was
the only one passing the hardest challenge level, achieving over
character recognition accuracy. Our findings are well in line with the paradigm
of data-centric machine learning, and we demonstrate its effectiveness in the
context of inverse problems. Apart from a detailed presentation of our
methodology, we also analyze the importance of several design choices in a
series of ablation studies. The code of our challenge submission is available
under https://github.com/theophil-trippe/HDC_TUBerlin_version_1.
Comment: This article has been published in a revised form in Inverse Problems
and Imaging
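The on-the-fly synthetic-pair generation described above can be sketched as follows. A disk point-spread function stands in for the data-driven forward model estimated in the paper, and circular FFT convolution keeps the example dependency-free; the kernel choice and function names are illustrative assumptions:

```python
import numpy as np

def disk_kernel(radius, size):
    """Disk PSF approximating out-of-focus blur (a stand-in for the
    estimated physical forward model)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = (xx ** 2 + yy ** 2 <= radius ** 2).astype(np.float64)
    return k / k.sum()

def synth_pair(sharp, kernel):
    """On-the-fly training pair: (sharp ground truth, blurred copy)."""
    kh, kw = kernel.shape
    pad = np.zeros_like(sharp)
    pad[:kh, :kw] = kernel
    # centre the kernel so the blur does not shift the image
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) * np.fft.fft2(pad)))
    return sharp, blurred
```

Streaming pairs like these lets a U-Net see far more training data than the handful of real challenge images, which is the data-centric point the abstract makes.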
SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning
In fisheye images, rich distinct distortion patterns are regularly
distributed in the image plane. These distortion patterns are independent of
the visual content and provide informative cues for rectification. To make the
best of such rectification cues, we introduce SimFIR, a simple framework for
fisheye image rectification based on self-supervised representation learning.
Technically, we first split a fisheye image into multiple patches and extract
their representations with a Vision Transformer (ViT). To learn fine-grained
distortion representations, we then associate different image patches with
their specific distortion patterns based on the fisheye model, and further
subtly design an innovative unified distortion-aware pretext task for their
learning. The transfer performance on the downstream rectification task is
remarkably boosted, which verifies the effectiveness of the learned
representations. Extensive experiments are conducted, and the quantitative and
qualitative results demonstrate the superiority of our method over the
state-of-the-art algorithms as well as its strong generalization ability on
real-world fisheye images.
Comment: Accepted to ICCV 2023
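The position-derived distortion cues driving such a pretext task can be sketched as follows. The grid size, binning, and pure radial-distance labeling are simplifying assumptions, not the paper's exact fisheye-model association:

```python
import numpy as np

def patch_distortion_labels(h, w, patch, n_bins):
    """Pseudo-labels for a distortion-aware pretext task (sketch).

    In a fisheye image the distortion strength grows with distance from
    the optical centre, so each patch can be tagged with a coarse
    'distortion level' derived purely from its position in the grid --
    no manual annotation is needed.
    """
    gy, gx = h // patch, w // patch
    ys, xs = np.mgrid[0:gy, 0:gx]
    # patch-centre coordinates, normalized to [-1, 1]
    cy = (ys + 0.5) * patch / h * 2 - 1
    cx = (xs + 0.5) * patch / w * 2 - 1
    r = np.sqrt(cx ** 2 + cy ** 2)
    # quantize radial distance into n_bins distortion levels
    return np.minimum((r / r.max() * n_bins).astype(int), n_bins - 1)
```

A ViT encoder trained to predict these labels from patch appearance alone is forced to pick up the distortion patterns themselves, which is the self-supervised signal the abstract describes.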