RIBBONS: Rapid Inpainting Based on Browsing of Neighborhood Statistics
Image inpainting refers to filling missing places in images using neighboring
pixels. It also has many applications in different tasks of image processing.
Most of these applications enhance the image quality by significant unwanted
changes or even elimination of some existing pixels. These changes require
considerable computation, which in turn results in long processing
times. In this paper we propose a fast inpainting algorithm called
RIBBONS, based on the selection of patches around each missing pixel. This
would speed up execution and improve the capability for online frame
inpainting in video. The applied cost function is a combination of statistical and spatial
features in all neighboring pixels. We evaluate some candidate patches using
the proposed cost function and minimize it to achieve the final patch.
Experimental results show that RIBBONS is faster than previous methods while
remaining comparable in terms of PSNR and SSIM on the images in the MISC
dataset.
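
The patch-selection idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual RIBBONS cost function, so everything here is an assumption made for illustration: the function name `fill_pixel`, the 3x3 candidate patches, the weights `alpha` and `beta`, and the cost itself (variance of the known pixels in a patch as the statistical term, distance from the patch centre to the missing pixel as the spatial term).

```python
import math
from statistics import mean, pvariance


def fill_pixel(img, mask, r, c, alpha=1.0, beta=0.1):
    """Fill the missing pixel (r, c) from the lowest-cost 3x3 candidate patch.

    img  : list of rows of numbers (grayscale image)
    mask : same shape as img, True where a pixel is missing
    Hypothetical cost = alpha * variance of the known pixels in the patch
                      + beta  * distance from the patch centre to (r, c)
    Returns the mean of the known pixels in the best patch, or None if no
    candidate patch contains at least two known pixels.
    """
    h, w = len(img), len(img[0])
    best = None  # (cost, fill_value) of the best candidate seen so far
    # Candidate patch centres: the missing pixel and its 8 neighbours.
    for cr in range(max(0, r - 1), min(h, r + 2)):
        for cc in range(max(0, c - 1), min(w, c + 2)):
            # Known pixel values inside the 3x3 window around the centre.
            vals = [
                img[y][x]
                for y in range(max(0, cr - 1), min(h, cr + 2))
                for x in range(max(0, cc - 1), min(w, cc + 2))
                if not mask[y][x]
            ]
            if len(vals) < 2:
                continue
            cost = alpha * pvariance(vals) + beta * math.hypot(cr - r, cc - c)
            if best is None or cost < best[0]:
                best = (cost, mean(vals))
    return None if best is None else best[1]


# A flat region (value 10) next to a bright region (value 200).
img = [[10, 10, 10, 200, 200] for _ in range(5)]
mask = [[False] * 5 for _ in range(5)]
mask[2][1] = True  # one missing pixel inside the flat region
filled = fill_pixel(img, mask, 2, 1)
```

Because the variance term penalises patches that straddle the edge, the sketch fills a pixel near the boundary from the homogeneous side rather than averaging across it.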
A deep learning framework for quality assessment and restoration in video endoscopy
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. Artifacts such as motion blur, bubbles,
specular reflections, floating objects and pixel saturation impede the visual
interpretation and the automated analysis of endoscopy videos. Given the
widespread use of endoscopy in different clinical applications, we contend that
the robust and reliable identification of such artifacts and the automated
restoration of corrupted video frames is a fundamental medical imaging problem.
Existing state-of-the-art methods only deal with the detection and restoration
of selected artifacts. However, endoscopy videos typically contain numerous
artifacts, which motivates a comprehensive solution.
We propose a fully automatic framework that can: 1) detect and classify six
different primary artifacts, 2) provide a quality score for each frame and 3)
restore mildly corrupted frames. To detect different artifacts, our framework
exploits a fast multi-scale, single-stage convolutional neural network detector.
We introduce a quality metric to assess frame quality and predict image
restoration success. Generative adversarial networks with carefully chosen
regularization are finally used to restore corrupted frames.
Our detector yields the highest mean average precision (mAP at 5% threshold)
of 49.0 and the lowest computational time of 88 ms allowing for accurate
real-time processing. Our restoration models for blind deblurring, saturation
correction and inpainting demonstrate significant improvements over previous
methods. On a set of 10 test videos we show that our approach preserves an
average of 68.7% of frames, which is 25% more frames than are retained from the
raw videos. Comment: 14 pages
Structure Preserving Large Imagery Reconstruction
With the explosive growth of web-based cameras and mobile devices, billions
of photographs are uploaded to the internet. We can trivially collect a huge
number of photo streams for various goals, such as image clustering, 3D scene
reconstruction, and other big data applications. However, such tasks are not
easy because the retrieved photos can vary widely in their
view perspectives, resolutions, lighting, noise, and distortions.
Furthermore, with occlusion by unexpected objects such as people and vehicles,
it is even more challenging to find feature correspondences and reconstruct
realistic scenes. In this paper, we propose a structure-based image completion
algorithm for object removal that produces visually plausible content with
consistent structure and scene texture. We use an edge matching technique to
infer the potential structure of the unknown region. Driven by the estimated
structure, texture synthesis is performed automatically along the estimated
curves. We evaluate the proposed method on different types of images: from
highly structured indoor environment to natural scenes. Our experimental
results demonstrate satisfactory performance that can be potentially used for
subsequent big data processing, such as image localization, object retrieval,
and scene reconstruction. Our experiments show that this approach achieves
favorable results that outperform existing state-of-the-art techniques.
Marker hiding methods: Applications in augmented reality
© 2015 Taylor & Francis Group, LLC. In augmented reality, markers are conspicuous because of their simple design, a rectangular image with black and white areas, which disturbs the realism of the overall view. As markerless techniques are usually not robust enough, hiding the markers is valuable, and many researchers have focused on it. Categorizing the marker hiding methods is the main motivation of this study, which explains each of them in detail and discusses the advantages and shortcomings of each. The main ideas, enhancements, and future work of the well-known techniques are also comprehensively summarized and analyzed in depth. The main goal of this study is to provide researchers who are interested in markerless or marker-hiding methods with an easier way of choosing the method that is best suited to their aims. This work reviews the different methods that hide the augmented reality marker by using information from its surrounding area. These methods differ considerably in how smoothly they continue the textures that hide the marker area and in their ability to hide the augmented reality marker in real time. It is also hoped that our analysis helps researchers find solutions to the drawbacks of each method.
Optimising Spatial and Tonal Data for PDE-based Inpainting
Some recent methods for lossy signal and image compression store only a few
selected pixels and fill in the missing structures by inpainting with a partial
differential equation (PDE). Suitable operators include the Laplacian, the
biharmonic operator, and edge-enhancing anisotropic diffusion (EED). The
quality of such approaches depends substantially on the selection of the data
that is kept. Optimising this data in the domain and codomain gives rise to
challenging mathematical problems that shall be addressed in our work.
In the 1D case, we prove results that provide insights into the difficulty of
this problem, and we give evidence that splitting it into spatial and tonal
(i.e. function value) optimisation hardly deteriorates the results. In the
2D setting, we present generic algorithms that achieve a high reconstruction
quality even if the specified data is very sparse. To optimise the spatial
data, we use a probabilistic sparsification, followed by a nonlocal pixel
exchange that avoids getting trapped in bad local optima. After this spatial
optimisation we perform a tonal optimisation that modifies the function values
in order to reduce the global reconstruction error. For homogeneous diffusion
inpainting, this comes down to a least squares problem for which we prove that
it has a unique solution. We demonstrate that it can be found efficiently with
a gradient descent approach that is accelerated with fast explicit diffusion
(FED) cycles. Our framework allows the desired density of the
inpainting mask to be specified a priori. Moreover, it is more generic than other data
optimisation approaches for the sparse inpainting problem, since it can also be
extended to nonlinear inpainting operators such as EED. This is exploited to
achieve reconstructions with state-of-the-art quality.
We also give an extensive literature survey on PDE-based image compression
methods.
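
To make the notion of homogeneous diffusion inpainting concrete, here is a minimal 1D sketch. It is not the authors' implementation: the function name `diffusion_inpaint_1d`, the plain Gauss-Seidel iteration (used here instead of the FED-accelerated scheme mentioned in the abstract), and the mirrored boundary handling are all assumptions chosen for brevity. Kept samples act as fixed data, and every other sample is repeatedly replaced by the average of its two neighbours, which solves the discrete Laplace equation between the kept samples.

```python
def diffusion_inpaint_1d(values, known, iterations=500):
    """Reconstruct a 1D signal from sparse samples by homogeneous diffusion.

    values : list of numbers; only entries with known[i] == True are used
    known  : list of booleans marking the kept samples (the inpainting mask)
    Assumes len(values) >= 2 and at least one kept sample.
    """
    n = len(values)
    # Start from zero at unknown positions; kept samples never change.
    u = [float(values[i]) if known[i] else 0.0 for i in range(n)]
    for _ in range(iterations):
        for i in range(n):
            if known[i]:
                continue
            # Mirror the signal at the ends (reflecting boundary).
            left = u[i - 1] if i > 0 else u[i + 1]
            right = u[i + 1] if i < n - 1 else u[i - 1]
            # Discrete Laplace equation: each unknown is its neighbours' mean.
            u[i] = 0.5 * (left + right)
    return u


# Keep only the two end samples of a five-sample signal.
signal = [0.0, 0.0, 0.0, 0.0, 8.0]
mask = [True, False, False, False, True]
reconstruction = diffusion_inpaint_1d(signal, mask)
```

In 1D the result is simply a linear interpolation between neighbouring kept samples, which is why the choice of where to keep samples (spatial optimisation) and which values to store there (tonal optimisation) fully determines the reconstruction quality.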
From Augmentation to Inpainting: Improving Visual SLAM with Signal Enhancement Techniques and GAN-based Image Inpainting
This paper undertakes a comprehensive investigation that surpasses the conventional examination of signal enhancement techniques and their effects on visual Simultaneous Localization and Mapping (vSLAM) performance across diverse scenarios. Going beyond the conventional scope, the study extends its focus towards the seamless integration of signal enhancement techniques, aiming to achieve a substantial enhancement in the overall vSLAM performance. The research not only delves into the assessment of existing methods but also actively contributes to the field by proposing innovative denoising techniques that can play a pivotal role in refining the accuracy and reliability of vSLAM systems. This multifaceted approach encompasses a thorough exploration of the intricate relationships between signal enhancement, denoising strategies, their cumulative impact on the performance of vSLAM in real-world applications, and the innovative use of Generative Adversarial Networks (GANs) for image inpainting. The GANs effectively fill in missing spaces following object detection and removal, presenting a novel state-of-the-art approach that significantly enhances the overall accuracy and execution speed of vSLAM. This paper aims to contribute to the advancement of vSLAM algorithms in real-world scenarios, demonstrating improved accuracy, robustness, and computational efficiency through the amalgamation of signal enhancement and advanced denoising techniques.
EnsNet: Ensconce Text in the Wild
A new method is proposed for removing text from natural images. The challenge
is to first accurately localize text on the stroke-level and then replace it
with a visually plausible background. Unlike previous methods that require
image patches to erase scene text, our method, namely ensconce network
(EnsNet), can operate end-to-end on a single image without any prior knowledge.
The overall structure is an end-to-end trainable FCN-ResNet-18 network with a
conditional generative adversarial network (cGAN). The feature of the former is
first enhanced by a novel lateral connection structure and then refined by four
carefully designed losses: multiscale regression loss and content loss, which
capture the global discrepancy of different level features; texture loss and
total variation loss, which primarily target filling the text region and
preserving the reality of the background. The latter is a novel local-sensitive
GAN, which attentively assesses the local consistency of the text-erased
regions. Both qualitative and quantitative sensitivity experiments on synthetic
images and the ICDAR 2013 dataset demonstrate that each component of the EnsNet
is essential to achieve a good performance. Moreover, our EnsNet can
significantly outperform previous state-of-the-art methods in terms of all
metrics. In addition, a qualitative experiment conducted on the SMBNet dataset
further demonstrates that the proposed method can also perform well on general
object (such as pedestrians) removal tasks. EnsNet is extremely fast and can
run at 333 fps on an i5-8600 CPU device. Comment: 8 pages, 8 figures, 2 tables, accepted to appear in AAAI 201
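
Of the four losses named in this abstract, total variation loss is the simplest to illustrate. The sketch below uses a common anisotropic definition (the sum of absolute differences between horizontally and vertically adjacent pixels) on a single-channel image. The abstract does not state the exact formulation or weighting used in EnsNet, so the function name `total_variation` and this particular definition are assumptions.

```python
def total_variation(img):
    """Anisotropic total variation of a 2D grayscale image.

    img : list of rows of numbers
    Returns the sum of absolute differences between each pixel and its
    right-hand neighbour plus each pixel and the neighbour below it.
    """
    h, w = len(img), len(img[0])
    tv = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal neighbour
                tv += abs(img[y][x + 1] - img[y][x])
            if y + 1 < h:  # vertical neighbour
                tv += abs(img[y + 1][x] - img[y][x])
    return tv


flat = [[3, 3], [3, 3]]          # perfectly smooth patch
checker = [[0, 1], [1, 0]]       # maximally alternating patch
tv_flat = total_variation(flat)
tv_checker = total_variation(checker)
```

A smooth patch scores zero and a noisy one scores high, which is why penalising this quantity tends to suppress speckle in a filled-in region.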