262 research outputs found
Spatial and Angular Resolution Enhancement of Light Fields Using Convolutional Neural Networks
Light field imaging extends traditional photography by capturing both
spatial and angular distribution of light, which enables new capabilities,
including post-capture refocusing, post-capture aperture control, and depth
estimation from a single shot. Micro-lens array (MLA) based light field cameras
offer a cost-effective approach to capturing light fields. A major drawback of MLA
based light field cameras is low spatial resolution, which is due to the fact
that a single image sensor is shared to capture both spatial and angular
information. In this paper, we present a learning based light field enhancement
approach. Both the spatial and angular resolution of the captured light field are
enhanced using convolutional neural networks. The proposed method is tested
with real light field data captured with a Lytro light field camera, clearly
demonstrating spatial and angular resolution improvement.
Towards Better Methods of Stereoscopic 3D Media Adjustment and Stylization
Stereoscopic 3D (S3D) media is pervasive in film, photography and art. However, working with
S3D media poses a number of interesting challenges arising from capture and editing. In this thesis
we address several of these challenges. In particular, we address disparity adjustment and present
a layer-based method that can reduce disparity without distorting the scene. Our method was
successfully used to repair several images for the 2014 documentary “Soldiers’ Stories” directed by
Jonathan Kitzen. We then explore consistent and comfortable methods for stylizing stereo images.
Our approach uses a modified version of the layer-based technique used for disparity adjustment
and can be used with a variety of stylization filters, including those in Adobe Photoshop. We
also present a disparity-aware painterly rendering algorithm. A user study concluded that our
layer-based stylization method produced S3D images that were more comfortable than previous
methods. Finally, we address S3D line drawing from S3D photographs. Line drawing is a common
art style that our layer-based method is not able to reproduce. To improve the depth perception of
our line drawings we optionally add stylized shading. An expert survey concluded that our results
were comfortable and reproduced a sense of depth.
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
the graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image / video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation.
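The idea of treating an image patch as a graph signal can be illustrated with a minimal sketch. It is not taken from the article: the 4-connected pixel graph, the Gaussian edge weight `exp(-(difference)^2 / sigma^2)`, and the parameters `sigma`, `alpha`, and `steps` are all assumptions chosen for illustration. Each iteration computes `x - alpha * L x`, where `L = D - W` is the combinatorial graph Laplacian. This is a first-order polynomial graph filter whose response at graph frequency `lambda` is `1 - alpha * lambda`, so it attenuates high graph frequencies.

```python
import math


def build_edges(img, sigma):
    """Build a 4-connected pixel graph (assumed design, not from the article).

    Each edge weight is exp(-(intensity difference)^2 / sigma^2), so pixels on
    opposite sides of a strong image edge are almost disconnected.
    """
    h, w = len(img), len(img[0])
    edges = []
    for r in range(h):
        for c in range(w):
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbors
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    diff = img[r][c] - img[rr][cc]
                    weight = math.exp(-(diff * diff) / (sigma * sigma))
                    edges.append(((r, c), (rr, cc), weight))
    return edges


def apply_laplacian(x, edges):
    """Return L x for the combinatorial graph Laplacian L = D - W."""
    out = [[0.0] * len(x[0]) for _ in x]
    for (r, c), (rr, cc), weight in edges:
        d = weight * (x[r][c] - x[rr][cc])
        out[r][c] += d
        out[rr][cc] -= d
    return out


def graph_smooth(img, sigma=0.1, alpha=0.2, steps=10):
    """Repeatedly apply the low-pass graph filter x <- x - alpha * L x.

    The graph is built once from the input image. With weights at most 1 and
    at most four neighbors per pixel, alpha = 0.2 keeps the iteration stable.
    """
    edges = build_edges(img, sigma)
    x = [row[:] for row in img]
    for _ in range(steps):
        lx = apply_laplacian(x, edges)
        x = [[v - alpha * d for v, d in zip(xr, lr)] for xr, lr in zip(x, lx)]
    return x


# A tiny patch: a dark region and a bright region, each with slight noise.
noisy = [[0.02, -0.02, 1.00, 1.02],
         [-0.01, 0.01, 0.98, 1.00]]
smoothed = graph_smooth(noisy)
```

Because the weights across the intensity step are close to zero, the noise inside each region is averaged out while the step itself is left almost untouched. A fixed, image-independent graph would blur the step as well.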
INCODE: Implicit Neural Conditioning with Prior Knowledge Embeddings
Implicit Neural Representations (INRs) have revolutionized signal
representation by leveraging neural networks to provide continuous and smooth
representations of complex data. However, existing INRs face limitations in
capturing fine-grained details, handling noise, and adapting to diverse signal
types. To address these challenges, we introduce INCODE, a novel approach that
enhances the control of the sinusoidal-based activation function in INRs using
deep prior knowledge. INCODE comprises a harmonizer network and a composer
network, where the harmonizer network dynamically adjusts key parameters of the
activation function. Through a task-specific pre-trained model, INCODE adapts
the task-specific parameters to optimize the representation process. Our
approach not only excels in representation, but also extends its prowess to
tackle complex tasks such as audio, image, and 3D shape reconstructions, as
well as intricate challenges such as neural radiance fields (NeRFs), and
inverse problems, including denoising, super-resolution, inpainting, and CT
reconstruction. Through comprehensive experiments, INCODE demonstrates its
superiority in terms of robustness, accuracy, quality, and convergence rate,
broadening the scope of signal representation. Please visit the project's
website for details on the proposed method and access to the code. Comment: Accepted at WACV 2024 conference
INTERMEDIATE VIEW RECONSTRUCTION FOR MULTISCOPIC 3D DISPLAY
This thesis focuses on Intermediate View Reconstruction (IVR), which generates additional images from the available stereo images. The main application of IVR is to generate content for multiscopic 3D displays, and it can also be applied to generate different viewpoints for Free-viewpoint TV (FTV). Although IVR is considered a good approach to generating additional images, the reconstruction process poses some problems, such as detecting and handling occlusion areas, preserving discontinuities at edges, and reducing image artifacts when forming the texture of the intermediate image. An occlusion area is an area that is visible in one image but hidden in the other. Solving IVR problems is considered a significant challenge for researchers.
In this thesis, several novel algorithms have been specifically designed to solve IVR challenges by employing them in a highly robust intermediate view reconstruction algorithm. Computer simulation and experimental results confirm the importance of occluded areas in IVR. Therefore, we propose a novel occlusion detection algorithm and another novel algorithm to inpaint those areas. Then, these proposed algorithms are employed in a novel occlusion-aware intermediate view reconstruction that finds an intermediate image with a given disparity between two input images. This novelty is addressed by adding occlusion awareness to the reconstruction algorithm and proposing three quality improvement techniques to reduce image artifacts: filling the re-sampling holes, removing ghost contours, and handling the disocclusion area.
We compared the proposed algorithms with well-known existing algorithms in each area, both qualitatively and quantitatively. The obtained results show that our algorithms are superior to the existing algorithms. The performance of the proposed reconstruction algorithm is tested on 13 real images and 13 synthetic images. Moreover, analysis of a human-trial experiment conducted with 21 participants confirmed that the reconstructed images from our proposed algorithm have very high quality compared with the reconstructed images from the other existing algorithms.
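To make the notions of an intermediate view, re-sampling holes, and disocclusion concrete, here is a minimal sketch of generic disparity-based forward warping on a single scanline. It is not the thesis's algorithm. The function names, the convention that a left-image pixel at `x` appears at `x - disparity` in the right image, and the "fill from the more distant neighbor" rule are all assumptions made for illustration.

```python
def warp_scanline(left, disparity, alpha):
    """Forward-warp one left-image scanline to an intermediate viewpoint.

    alpha = 0 reproduces the left view and alpha = 1 the right view. Pixel x
    moves to round(x - alpha * disparity[x]). When two pixels land on the same
    target, the one with the larger disparity (the nearer surface) wins.
    Targets that receive no pixel stay None: these are the holes.
    """
    n = len(left)
    out = [None] * n
    depth = [-1.0] * n  # disparity of the pixel currently stored at each target
    for x in range(n):
        target = int(round(x - alpha * disparity[x]))
        if 0 <= target < n and disparity[x] > depth[target]:
            out[target] = left[x]
            depth[target] = disparity[x]
    return out, depth


def fill_holes(out, depth):
    """Fill each hole from its nearest valid neighbor with the smaller disparity.

    Disoccluded regions belong to the background, so copying from the more
    distant side avoids smearing the foreground into the hole.
    """
    n = len(out)
    filled = out[:]
    for x in range(n):
        if out[x] is not None:
            continue
        lo = x - 1
        while lo >= 0 and out[lo] is None:
            lo -= 1
        hi = x + 1
        while hi < n and out[hi] is None:
            hi += 1
        candidates = [i for i in (lo, hi) if 0 <= i < n]
        if candidates:
            src = min(candidates, key=lambda i: depth[i])
            filled[x] = out[src]
    return filled


# Background pixels with disparity 0 and a two-pixel foreground object with disparity 4.
left = [1, 2, 3, 99, 98, 6, 7, 8]
disparity = [0, 0, 0, 4, 4, 0, 0, 0]
warped, warped_depth = warp_scanline(left, disparity, alpha=0.5)
middle_view = fill_holes(warped, warped_depth)
```

In this example the foreground object shifts two pixels to the left and covers background pixels, which is the occlusion case. The two positions it vacates receive no data and are filled from the background on the right, which is the disocclusion case.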
Fusing spatial and temporal components for real-time depth data enhancement of dynamic scenes
Depth images from consumer depth cameras (e.g., structured-light/ToF devices) exhibit a substantial number of artifacts (e.g., holes, flickering, ghosting) that need to be removed for real-world applications. Existing methods cannot entirely remove them and are slow. This thesis proposes a new real-time spatio-temporal depth image enhancement filter that completely removes flickering and ghosting, and significantly reduces holes. This thesis also presents a novel depth-data capture setup and two data reduction methods to optimize the performance of the proposed enhancement method.
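As a rough illustration of the temporal side of such filtering, the sketch below applies a per-pixel temporal median over a short window of depth frames and skips invalid samples. It is a deliberately simple stand-in, not the filter proposed in the thesis. The function name, the use of `0` to mark holes, and the window contents are assumptions, and the sketch does nothing about ghosting in moving regions.

```python
from statistics import median


def temporal_median(frames, invalid=0):
    """Per-pixel median over a window of depth frames, ignoring invalid samples.

    frames: list of 2D depth maps (lists of lists), oldest first.
    A pixel stays invalid only if it is a hole in every frame of the window.
    """
    h, w = len(frames[0]), len(frames[0][0])
    out = [[invalid] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            valid = [f[r][c] for f in frames if f[r][c] != invalid]
            if valid:
                out[r][c] = median(valid)
    return out


# Four 1x3 frames (depth in millimetres, hypothetical values): the first pixel
# flickers and drops out once, the second is mostly a hole, the third is always a hole.
window = [
    [[1000, 0, 0]],
    [[1004, 0, 0]],
    [[0, 0, 0]],
    [[1002, 500, 0]],
]
stable = temporal_median(window)
```

The median suppresses frame-to-frame flicker and fills a hole whenever at least one frame in the window has a valid measurement. A spatial stage would still be needed for pixels that are never measured.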
Synthesis of environment maps for mixed reality
When rendering virtual objects in a mixed reality application, it is helpful to have access to an environment map that captures the appearance of the scene from the perspective of the virtual object. It is straightforward to render virtual objects into such maps, but capturing and correctly rendering the real components of the scene into the map is much more challenging. This information is often recovered from physical light probes, such as reflective spheres or fisheye cameras, placed at the location of the virtual object in the scene. For many application areas, however, real light probes would be intrusive or impractical. Ideally, all of the information necessary to produce detailed environment maps could be captured using a single device. We introduce a method using an RGBD camera and a small fisheye camera, contained in a single unit, to create environment maps at any location in an indoor scene. The method combines the output from both cameras to correct for their limited field of view and the displacement from the virtual object, producing complete environment maps suitable for rendering the virtual content in real time. Our method improves on previous probeless approaches by its ability to recover high-frequency environment maps. We demonstrate how this can be used to render virtual objects which shadow, reflect and refract their environment convincingly