3,921 research outputs found
Structure Preserving Large Imagery Reconstruction
With the explosive growth of web-based cameras and mobile devices, billions
of photographs are uploaded to the internet. We can trivially collect a huge
number of photo streams for various goals, such as image clustering, 3D scene
reconstruction, and other big data applications. However, such tasks are not
easy due to the fact the retrieved photos can have large variations in their
view perspectives, resolutions, lighting, noises, and distortions.
Fur-thermore, with the occlusion of unexpected objects like people, vehicles,
it is even more challenging to find feature correspondences and reconstruct
re-alistic scenes. In this paper, we propose a structure-based image completion
algorithm for object removal that produces visually plausible content with
consistent structure and scene texture. We use an edge matching technique to
infer the potential structure of the unknown region. Driven by the estimated
structure, texture synthesis is performed automatically along the estimated
curves. We evaluate the proposed method on different types of images: from
highly structured indoor environment to natural scenes. Our experimental
results demonstrate satisfactory performance that can be potentially used for
subsequent big data processing, such as image localization, object retrieval,
and scene reconstruction. Our experiments show that this approach achieves
favorable results that outperform existing state-of-the-art techniques
DIGITAL INPAINTING ALGORITHMS AND EVALUATION
Digital inpainting is the technique of filling in the missing regions of an image or a video using information from surrounding area. This technique has found widespread use in applications such as restoration, error recovery, multimedia editing, and video privacy protection. This dissertation addresses three significant challenges associated with the existing and emerging inpainting algorithms and applications. The three key areas of impact are 1) Structure completion for image inpainting algorithms, 2) Fast and efficient object based video inpainting framework and 3) Perceptual evaluation of large area image inpainting algorithms.
One of the main approach of existing image inpainting algorithms in completing the missing information is to follow a two stage process. A structure completion step, to complete the boundaries of regions in the hole area, followed by texture completion process using advanced texture synthesis methods. While the texture synthesis stage is important, it can be argued that structure completion aspect is a vital component in improving the perceptual image inpainting quality. To this end, we introduce a global structure completion algorithm for completion of missing boundaries using symmetry as the key feature. While existing methods for symmetry completion require a-priori information, our method takes a non-parametric approach by utilizing the invariant nature of curvature to complete missing boundaries. Turning our attention from image to video inpainting, we readily observe that existing video inpainting techniques have evolved as an extension of image inpainting techniques. As a result, they suffer from various shortcoming including, among others, inability to handle large missing spatio-temporal regions, significantly slow execution time making it impractical for interactive use and presence of temporal and spatial artifacts. To address these major challenges, we propose a fundamentally different method based on object based framework for improving the performance of video inpainting algorithms. We introduce a modular inpainting scheme in which we first segment the video into constituent objects by using acquired background models followed by inpainting of static background regions and dynamic foreground regions. For static background region inpainting, we use a simple background replacement and occasional image inpainting. To inpaint dynamic moving foreground regions, we introduce a novel sliding-window based dissimilarity measure in a dynamic programming framework. This technique can effectively inpaint large regions of occlusions, inpaint objects that are completely missing for several frames, change in size and pose and has minimal blurring and motion artifacts. Finally we direct our focus on experimental studies related to perceptual quality evaluation of large area image inpainting algorithms. The perceptual quality of large area inpainting technique is inherently a subjective process and yet no previous research has been carried out by taking the subjective nature of the Human Visual System (HVS). We perform subjective experiments using eye-tracking device involving 24 subjects to analyze the effect of inpainting on human gaze. We experimentally show that the presence of inpainting artifacts directly impacts the gaze of an unbiased observer and this in effect has a direct bearing on the subjective rating of the observer. Specifically, we show that the gaze energy in the hole regions of an inpainted image show marked deviations from normal behavior when the inpainting artifacts are readily apparent
Automatic Objects Removal for Scene Completion
With the explosive growth of web-based cameras and mobile devices, billions
of photographs are uploaded to the internet. We can trivially collect a huge
number of photo streams for various goals, such as 3D scene reconstruction and
other big data applications. However, this is not an easy task due to the fact
the retrieved photos are neither aligned nor calibrated. Furthermore, with the
occlusion of unexpected foreground objects like people, vehicles, it is even
more challenging to find feature correspondences and reconstruct realistic
scenes. In this paper, we propose a structure based image completion algorithm
for object removal that produces visually plausible content with consistent
structure and scene texture. We use an edge matching technique to infer the
potential structure of the unknown region. Driven by the estimated structure,
texture synthesis is performed automatically along the estimated curves. We
evaluate the proposed method on different types of images: from highly
structured indoor environment to the natural scenes. Our experimental results
demonstrate satisfactory performance that can be potentially used for
subsequent big data processing: 3D scene reconstruction and location
recognition.Comment: 6 pages, IEEE International Conference on Computer Communications
(INFOCOM 14), Workshop on Security and Privacy in Big Data, Toronto, Canada,
201
INTERMEDIATE VIEW RECONSTRUCTION FOR MULTISCOPIC 3D DISPLAY
This thesis focuses on Intermediate View Reconstruction (IVR) which generates additional images from the available stereo images. The main application of IVR is to generate the content of multiscopic 3D displays, and it can be applied to generate different viewpoints to Free-viewpoint TV (FTV). Although IVR is considered a good approach to generate additional images, there are some problems with the reconstruction process, such as detecting and handling the occlusion areas, preserving the discontinuity at edges, and reducing image artifices through formation of the texture of the intermediate image. The occlusion area is defined as the visibility of such an area in one image and its disappearance in the other one. Solving IVR problems is considered a significant challenge for researchers.
In this thesis, several novel algorithms have been specifically designed to solve IVR challenges by employing them in a highly robust intermediate view reconstruction
algorithm. Computer simulation and experimental results confirm the importance of occluded areas in IVR. Therefore, we propose a novel occlusion detection algorithm and another novel algorithm to Inpaint those areas. Then, these proposed algorithms are employed in a novel occlusion-aware intermediate view reconstruction that finds an intermediate image with a given disparity between two input images. This novelty is addressed by adding occlusion awareness to the reconstruction algorithm and proposing three quality improvement techniques to reduce image artifices: filling the re-sampling holes, removing ghost contours, and handling the disocclusion area.
We compared the proposed algorithms to the previously well-known algorithms on each field qualitatively and quantitatively. The obtained results show that our algorithms are superior to the previous well-known algorithms. The performance of the proposed reconstruction algorithm is tested under 13 real images and 13 synthetic images. Moreover, analysis of a human-trial experiment conducted with 21 participants confirmed that the reconstructed images from our proposed algorithm have very high quality compared with the reconstructed images from the other existing algorithms
Pareto Optimized Large Mask Approach for Efficient and Background Humanoid Shape Removal
The purpose of automated video object removal is to not only detect and remove the object of interest automatically, but also to utilize background context to inpaint the foreground area. Video inpainting requires to fill spatiotemporal gaps in a video with convincing material, necessitating both temporal and spatial consistency; the inpainted part must seamlessly integrate into the background in a variety of scenes, and it must maintain a consistent appearance in subsequent frames even if its surroundings change noticeably. We introduce deep learning-based methodology for removing unwanted human-like shapes in videos. The method uses Pareto-optimized Generative Adversarial Networks (GANs) technology, which is a novel contribution. The system automatically selects the Region of Interest (ROI) for each humanoid shape and uses a skeleton detection module to determine which humanoid shape to retain. The semantic masks of human like shapes are created using a semantic-aware occlusion-robust model that has four primary components: feature extraction, and local, global, and semantic branches. The global branch encodes occlusion-aware information to make the extracted features resistant to occlusion, while the local branch retrieves fine-grained local characteristics. A modified big mask inpainting approach is employed to eliminate a person from the image, leveraging Fast Fourier convolutions and utilizing polygonal chains and rectangles with unpredictable aspect ratios. The inpainter network takes the input image and the mask to create an output image excluding the background humanoid shapes. The generator uses an encoder-decoder structure with included skip connections to recover spatial information and dilated convolution and squeeze and excitation blocks to make the regions behind the humanoid shapes consistent with their surroundings. The discriminator avoids dissimilar structure at the patch scale, and the refiner network catches features around the boundaries of each background humanoid shape. The efficiency was assessed using the Structural Learned Perceptual Image Patch Similarity, Frechet Inception Distance, and Similarity Index Measure metrics and showed promising results in fully automated background person removal task. The method is evaluated on two video object segmentation datasets (DAVIS indicating respective values of 0.02, FID of 5.01 and SSIM of 0.79 and YouTube-VOS, resulting in 0.03, 6.22, 0.78 respectively) as well a database of 66 distinct video sequences of people behind a desk in an office environment (0.02, 4.01, and 0.78 respectively).publishedVersio
Recommended from our members
A Novel Inpainting Framework for Virtual View Synthesis
Multi-view imaging has stimulated significant research to enhance the user experience of free viewpoint video, allowing interactive navigation between views and the freedom to select a desired view to watch. This usually involves transmitting both textural and depth information captured from different viewpoints to the receiver, to enable the synthesis of an arbitrary view. In rendering these virtual views, perceptual holes can appear due to certain regions, hidden in the original view by a closer object, becoming visible in the virtual view. To provide a high quality experience these holes must be filled in a visually plausible way, in a process known as inpainting. This is challenging because the missing information is generally unknown and the hole-regions can be large. Recently depth-based inpainting techniques have been proposed to address this challenge and while these generally perform better than non-depth assisted methods, they are not very robust and can produce perceptual artefacts.
This thesis presents a new inpainting framework that innovatively exploits depth and textural self-similarity characteristics to construct subjectively enhanced virtual viewpoints. The framework makes three significant contributions to the field: i) the exploitation of view information to jointly inpaint textural and depth hole regions; ii) the introduction of the novel concept of self-similarity characterisation which is combined with relevant depth information; and iii) an advanced self-similarity characterising scheme that automatically determines key spatial transform parameters for effective and flexible inpainting.
The presented inpainting framework has been critically analysed and shown to provide superior performance both perceptually and numerically compared to existing techniques, especially in terms of lower visual artefacts. It provides a flexible robust framework to develop new inpainting strategies for the next generation of interactive multi-view technologies
- …