262 research outputs found

    Spatial and Angular Resolution Enhancement of Light Fields Using Convolutional Neural Networks

    Get PDF
    Light field imaging extends the traditional photography by capturing both spatial and angular distribution of light, which enables new capabilities, including post-capture refocusing, post-capture aperture control, and depth estimation from a single shot. Micro-lens array (MLA) based light field cameras offer a cost-effective approach to capture light field. A major drawback of MLA based light field cameras is low spatial resolution, which is due to the fact that a single image sensor is shared to capture both spatial and angular information. In this paper, we present a learning based light field enhancement approach. Both spatial and angular resolution of captured light field is enhanced using convolutional neural networks. The proposed method is tested with real light field data captured with a Lytro light field camera, clearly demonstrating spatial and angular resolution improvement

    Towards Better Methods of Stereoscopic 3D Media Adjustment and Stylization

    Get PDF
    Stereoscopic 3D (S3D) media is pervasive in film, photography and art. However, working with S3D media poses a number of interesting challenges arising from capture and editing. In this thesis we address several of these challenges. In particular, we address disparity adjustment and present a layer-based method that can reduce disparity without distorting the scene. Our method was successfully used to repair several images for the 2014 documentary “Soldiers’ Stories” directed by Jonathan Kitzen. We then explore consistent and comfortable methods for stylizing stereo images. Our approach uses a modified version of the layer-based technique used for disparity adjustment and can be used with a variety of stylization filters, including those in Adobe Photoshop. We also present a disparity-aware painterly rendering algorithm. A user study concluded that our layer-based stylization method produced S3D images that were more comfortable than previous methods. Finally, we address S3D line drawing from S3D photographs. Line drawing is a common art style that our layer-based method is not able to reproduce. To improve the depth perception of our line drawings we optionally add stylized shading. An expert survey concluded that our results were comfortable and reproduced a sense of depth

    Graph Spectral Image Processing

    Full text link
    Recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image / video processing. The topics covered include image compression, image restoration, image filtering and image segmentation

    INCODE: Implicit Neural Conditioning with Prior Knowledge Embeddings

    Full text link
    Implicit Neural Representations (INRs) have revolutionized signal representation by leveraging neural networks to provide continuous and smooth representations of complex data. However, existing INRs face limitations in capturing fine-grained details, handling noise, and adapting to diverse signal types. To address these challenges, we introduce INCODE, a novel approach that enhances the control of the sinusoidal-based activation function in INRs using deep prior knowledge. INCODE comprises a harmonizer network and a composer network, where the harmonizer network dynamically adjusts key parameters of the activation function. Through a task-specific pre-trained model, INCODE adapts the task-specific parameters to optimize the representation process. Our approach not only excels in representation, but also extends its prowess to tackle complex tasks such as audio, image, and 3D shape reconstructions, as well as intricate challenges such as neural radiance fields (NeRFs), and inverse problems, including denoising, super-resolution, inpainting, and CT reconstruction. Through comprehensive experiments, INCODE demonstrates its superiority in terms of robustness, accuracy, quality, and convergence rate, broadening the scope of signal representation. Please visit the project's website for details on the proposed method and access to the code.Comment: Accepted at WACV 2024 conferenc

    INTERMEDIATE VIEW RECONSTRUCTION FOR MULTISCOPIC 3D DISPLAY

    Get PDF
    This thesis focuses on Intermediate View Reconstruction (IVR) which generates additional images from the available stereo images. The main application of IVR is to generate the content of multiscopic 3D displays, and it can be applied to generate different viewpoints to Free-viewpoint TV (FTV). Although IVR is considered a good approach to generate additional images, there are some problems with the reconstruction process, such as detecting and handling the occlusion areas, preserving the discontinuity at edges, and reducing image artifices through formation of the texture of the intermediate image. The occlusion area is defined as the visibility of such an area in one image and its disappearance in the other one. Solving IVR problems is considered a significant challenge for researchers. In this thesis, several novel algorithms have been specifically designed to solve IVR challenges by employing them in a highly robust intermediate view reconstruction algorithm. Computer simulation and experimental results confirm the importance of occluded areas in IVR. Therefore, we propose a novel occlusion detection algorithm and another novel algorithm to Inpaint those areas. Then, these proposed algorithms are employed in a novel occlusion-aware intermediate view reconstruction that finds an intermediate image with a given disparity between two input images. This novelty is addressed by adding occlusion awareness to the reconstruction algorithm and proposing three quality improvement techniques to reduce image artifices: filling the re-sampling holes, removing ghost contours, and handling the disocclusion area. We compared the proposed algorithms to the previously well-known algorithms on each field qualitatively and quantitatively. The obtained results show that our algorithms are superior to the previous well-known algorithms. The performance of the proposed reconstruction algorithm is tested under 13 real images and 13 synthetic images. Moreover, analysis of a human-trial experiment conducted with 21 participants confirmed that the reconstructed images from our proposed algorithm have very high quality compared with the reconstructed images from the other existing algorithms

    Fusing spatial and temporal components for real-time depth data enhancement of dynamic scenes

    Get PDF
    The depth images from consumer depth cameras (e.g., structured-light/ToF devices) exhibit a substantial amount of artifacts (e.g., holes, flickering, ghosting) that needs to be removed for real-world applications. Existing methods cannot entirely remove them and perform slow. This thesis proposes a new real-time spatio-temporal depth image enhancement filter that completely removes flickering and ghosting, and significantly reduces holes. This thesis also presents a novel depth-data capture setup and two data reduction methods to optimize the performance of the proposed enhancement method

    Synthesis of environment maps for mixed reality

    Get PDF
    When rendering virtual objects in a mixed reality application, it is helpful to have access to an environment map that captures the appearance of the scene from the perspective of the virtual object. It is straightforward to render virtual objects into such maps, but capturing and correctly rendering the real components of the scene into the map is much more challenging. This information is often recovered from physical light probes, such as reflective spheres or fisheye cameras, placed at the location of the virtual object in the scene. For many application areas, however, real light probes would be intrusive or impractical. Ideally, all of the information necessary to produce detailed environment maps could be captured using a single device. We introduce a method using an RGBD camera and a small fisheye camera, contained in a single unit, to create environment maps at any location in an indoor scene. The method combines the output from both cameras to correct for their limited field of view and the displacement from the virtual object, producing complete environment maps suitable for rendering the virtual content in real time. Our method improves on previous probeless approaches by its ability to recover high-frequency environment maps. We demonstrate how this can be used to render virtual objects which shadow, reflect and refract their environment convincingly
    • …
    corecore