
    RIBBONS: Rapid Inpainting Based on Browsing of Neighborhood Statistics

    Image inpainting refers to filling missing regions in images using neighboring pixels, and it has many applications in image processing. Most of these applications enhance image quality by significantly changing, or even eliminating, some existing pixels. These changes are computationally complex, which results in long processing times. In this paper we propose a fast inpainting algorithm called RIBBONS, based on the selection of patches around each missing pixel. This accelerates execution and makes online frame inpainting in video feasible. The applied cost function combines statistical and spatial features of all neighboring pixels. We evaluate candidate patches using the proposed cost function and minimize it to obtain the final patch. Experimental results show that RIBBONS is faster than previous methods while being comparable in terms of PSNR and SSIM for the images in the MISC dataset.
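The abstract above reports quality in terms of PSNR. As a minimal sketch of that metric (not the authors' evaluation code), PSNR can be computed from the mean squared error between a reference image and a restored one. The function name `psnr`, the nested-list image representation, and the default peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Both images are given as lists of rows of numbers (an assumed representation).
    """
    flat_ref = [p for row in reference for p in row]
    flat_res = [p for row in restored for p in row]
    if len(flat_ref) != len(flat_res) or not flat_ref:
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_res)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a restored image closer to the reference; SSIM, the other metric named above, is a separate structural measure and is not covered by this sketch.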

    A deep learning framework for quality assessment and restoration in video endoscopy

    Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. Artifacts such as motion blur, bubbles, specular reflections, floating objects and pixel saturation impede the visual interpretation and the automated analysis of endoscopy videos. Given the widespread use of endoscopy in different clinical applications, we contend that the robust and reliable identification of such artifacts and the automated restoration of corrupted video frames is a fundamental medical imaging problem. Existing state-of-the-art methods only deal with the detection and restoration of selected artifacts. However, endoscopy videos typically contain numerous artifacts, which motivates a comprehensive solution. We propose a fully automatic framework that can: 1) detect and classify six different primary artifacts, 2) provide a quality score for each frame and 3) restore mildly corrupted frames. To detect different artifacts, our framework exploits a fast multi-scale, single-stage convolutional neural network detector. We introduce a quality metric to assess frame quality and predict image restoration success. Generative adversarial networks with carefully chosen regularization are finally used to restore corrupted frames. Our detector yields the highest mean average precision (mAP at 5% threshold) of 49.0 and the lowest computational time of 88 ms, allowing for accurate real-time processing. Our restoration models for blind deblurring, saturation correction and inpainting demonstrate significant improvements over previous methods. On a set of 10 test videos we show that our approach preserves an average of 68.7% of frames, which is 25% more frames than are retained from the raw videos. Comment: 14 pages

    Structure Preserving Large Imagery Reconstruction

    With the explosive growth of web-based cameras and mobile devices, billions of photographs are uploaded to the internet. We can trivially collect a huge number of photo streams for various goals, such as image clustering, 3D scene reconstruction, and other big data applications. However, such tasks are not easy because the retrieved photos can vary widely in view perspective, resolution, lighting, noise, and distortion. Furthermore, with occlusion by unexpected objects such as people and vehicles, it is even more challenging to find feature correspondences and reconstruct realistic scenes. In this paper, we propose a structure-based image completion algorithm for object removal that produces visually plausible content with consistent structure and scene texture. We use an edge matching technique to infer the potential structure of the unknown region. Driven by the estimated structure, texture synthesis is performed automatically along the estimated curves. We evaluate the proposed method on different types of images, from highly structured indoor environments to natural scenes. Our experimental results demonstrate satisfactory performance that can potentially be used for subsequent big data processing, such as image localization, object retrieval, and scene reconstruction. Our experiments show that this approach achieves favorable results that outperform existing state-of-the-art techniques.

    Marker hiding methods: Applications in augmented reality

    © 2015 Taylor & Francis Group, LLC. In augmented reality, the markers are noticeable because of their simple design, a rectangular image with black and white areas, which disturbs the reality of the overall view. As markerless techniques are usually not robust enough, hiding the markers is valuable, and many researchers have focused on it. Categorizing the marker hiding methods is the main motivation of this study, which explains each of them in detail and discusses the advantages and shortcomings of each. The main ideas, enhancements, and future works of the well-known techniques are also comprehensively summarized and analyzed in depth. The main goal of this study is to give researchers who are interested in markerless or marker-hiding methods an easier way to choose the method best suited to their aims. This work reviews the different methods that hide the augmented reality marker by using information from its surrounding area. These methods differ considerably in how smoothly they continue the textures that hide the marker area and in their ability to hide the augmented reality marker in real time. It is also hoped that our analysis helps researchers find solutions to the drawbacks of each method.

    Optimising Spatial and Tonal Data for PDE-based Inpainting

    Some recent methods for lossy signal and image compression store only a few selected pixels and fill in the missing structures by inpainting with a partial differential equation (PDE). Suitable operators include the Laplacian, the biharmonic operator, and edge-enhancing anisotropic diffusion (EED). The quality of such approaches depends substantially on the selection of the data that is kept. Optimising this data in the domain and codomain gives rise to challenging mathematical problems that we address in our work. In the 1D case, we prove results that provide insights into the difficulty of this problem, and we give evidence that splitting it into spatial and tonal (i.e. function value) optimisation hardly deteriorates the results. In the 2D setting, we present generic algorithms that achieve a high reconstruction quality even if the specified data is very sparse. To optimise the spatial data, we use a probabilistic sparsification, followed by a nonlocal pixel exchange that avoids getting trapped in bad local optima. After this spatial optimisation we perform a tonal optimisation that modifies the function values in order to reduce the global reconstruction error. For homogeneous diffusion inpainting, this comes down to a least squares problem for which we prove that it has a unique solution. We demonstrate that it can be found efficiently with a gradient descent approach that is accelerated with fast explicit diffusion (FED) cycles. Our framework allows one to specify the desired density of the inpainting mask a priori. Moreover, it is more generic than other data optimisation approaches for the sparse inpainting problem, since it can also be extended to nonlinear inpainting operators such as EED. This is exploited to achieve reconstructions with state-of-the-art quality. We also give an extensive literature survey on PDE-based image compression methods.
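To make the idea of homogeneous diffusion inpainting concrete, here is a minimal sketch, assuming a grayscale image stored as nested lists and a boolean mask marking the stored pixels. It keeps the stored pixels fixed and repeatedly replaces every other pixel with the average of its four neighbours, which is a plain Jacobi iteration for the discrete Laplace equation with reflecting image boundaries. The function name, the data layout, and the choice of Jacobi iteration are illustrative assumptions; the paper's own solvers and its spatial and tonal optimisation steps are not reproduced here.

```python
def homogeneous_diffusion_inpaint(image, mask, iterations=2000):
    """Fill unknown pixels by solving the discrete Laplace equation.

    image: list of rows of numbers; values at unknown pixels are ignored.
    mask:  list of rows of booleans, True where the pixel value is stored.
    """
    rows, cols = len(image), len(image[0])
    known = [image[r][c] for r in range(rows) for c in range(cols) if mask[r][c]]
    if not known:
        raise ValueError("mask must contain at least one stored pixel")

    # Start unknown pixels at the mean of the stored values.
    fill = sum(known) / len(known)
    u = [[float(image[r][c]) if mask[r][c] else fill for c in range(cols)]
         for r in range(rows)]

    for _ in range(iterations):
        nxt = [row[:] for row in u]
        for r in range(rows):
            for c in range(cols):
                if mask[r][c]:
                    continue  # stored pixels act as fixed (Dirichlet) data
                # Reflecting boundary: out-of-range neighbours mirror the pixel itself.
                up = u[max(r - 1, 0)][c]
                down = u[min(r + 1, rows - 1)][c]
                left = u[r][max(c - 1, 0)]
                right = u[r][min(c + 1, cols - 1)]
                nxt[r][c] = (up + down + left + right) / 4.0
        u = nxt
    return u
```

With only the left and right columns of a horizontal ramp stored, this sketch recovers the ramp in between, which illustrates why the choice of stored pixels matters so much: the operator can only interpolate smoothly between whatever data is kept.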

    From Augmentation to Inpainting: Improving Visual SLAM with Signal Enhancement Techniques and GAN-based Image Inpainting

    This paper undertakes a comprehensive investigation that goes beyond the conventional examination of signal enhancement techniques and their effects on visual Simultaneous Localization and Mapping (vSLAM) performance across diverse scenarios. The study extends its focus to the integration of signal enhancement techniques, aiming to achieve a substantial improvement in overall vSLAM performance. The research not only assesses existing methods but also contributes to the field by proposing denoising techniques that can refine the accuracy and reliability of vSLAM systems. This multifaceted approach explores the relationships between signal enhancement and denoising strategies, their cumulative impact on the performance of vSLAM in real-world applications, and the use of Generative Adversarial Networks (GANs) for image inpainting. The GANs fill in missing regions following object detection and removal, presenting a novel approach that significantly enhances the overall accuracy and execution speed of vSLAM. This paper aims to contribute to the advancement of vSLAM algorithms in real-world scenarios, demonstrating improved accuracy, robustness, and computational efficiency through the combination of signal enhancement and advanced denoising techniques.

    EnsNet: Ensconce Text in the Wild

    A new method is proposed for removing text from natural images. The challenge is to first accurately localize text at the stroke level and then replace it with a visually plausible background. Unlike previous methods that require image patches to erase scene text, our method, namely ensconce network (EnsNet), can operate end-to-end on a single image without any prior knowledge. The overall structure is an end-to-end trainable FCN-ResNet-18 network with a conditional generative adversarial network (cGAN). The feature of the former is first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of different level features; texture loss and total variation loss, which primarily target filling the text region and preserving the reality of the background. The latter is a novel local-sensitive GAN, which attentively assesses the local consistency of the text-erased regions. Both qualitative and quantitative sensitivity experiments on synthetic images and the ICDAR 2013 dataset demonstrate that each component of the EnsNet is essential to achieve good performance. Moreover, our EnsNet can significantly outperform previous state-of-the-art methods in terms of all metrics. In addition, a qualitative experiment conducted on the SMBNet dataset further demonstrates that the proposed method can also perform well on general object (such as pedestrians) removal tasks. EnsNet is extremely fast and can run at 333 fps on an i5-8600 CPU device. Comment: 8 pages, 8 figures, 2 tables, accepted to appear in AAAI 201