
    Super-Resolution Approaches for Depth Video Enhancement

    Sensing using 3D technologies has undergone a revolution in recent years, with cost-effective depth sensors now part of accessible consumer electronics. Their ability to directly capture depth videos in real time has opened tremendous possibilities for multiple applications in computer vision. These sensors, however, have major shortcomings: high noise contamination, including missing and jagged measurements, and low spatial resolutions. In order to extract detailed 3D features from this type of data, dedicated data enhancement is required. We propose a generic depth multi-frame super-resolution framework that addresses the limitations of state-of-the-art depth enhancement approaches. The proposed framework does not need any additional hardware or coupling with different modalities. It is based on a new data model that uses densely upsampled low-resolution observations. This results in a robust median initial estimate, further refined by a deblurring operation using bilateral total variation as the regularization term. The upsampling operation ensures a systematic improvement in the registration accuracy. This is explored in different scenarios based on the motions involved in the depth video. For the general and most challenging case of objects deforming non-rigidly in full 3D, we propose a recursive dynamic multi-frame super-resolution algorithm where the relative local 3D motions between consecutive frames are directly accounted for. We rely on the assumption that these 3D motions can be decoupled into lateral motions and radial displacements. This allows us to perform a simple local per-pixel tracking where both depth measurements and deformations are optimized. Compared to alternative approaches, the results show a clear improvement in reconstruction accuracy and in robustness to noise, to relatively large non-rigid deformations, and to topological changes. Moreover, the proposed approach, implemented on a CPU, is shown to be computationally efficient and to work in real time.
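    As a rough illustration of the pipeline described above, the following sketch (Python with NumPy/SciPy) combines a per-pixel median over densely upsampled low-resolution depth frames with a few gradient steps of bilateral total variation (BTV) deblurring. The function names, the L1 data term, and all parameter values are illustrative assumptions, not the authors' implementation, and the frames are assumed to be registered already.

```python
# Minimal sketch: robust median initialization from densely upsampled
# low-resolution (LR) depth frames, then BTV-regularized deblurring.
# Assumes the LR frames are already registered to a common reference.
import numpy as np
from scipy.ndimage import zoom

def initial_median_estimate(lr_frames, scale):
    """Densely upsample each registered LR depth frame (nearest-neighbor),
    then take the per-pixel median as a robust initial HR estimate."""
    up = [zoom(f, scale, order=0) for f in lr_frames]
    return np.median(np.stack(up, axis=0), axis=0)

def btv_gradient(x, p=2, alpha=0.7):
    """Gradient of the bilateral TV prior: sum over shifts (k, l) of
    alpha^(|k|+|l|) * (I - S^(-k,-l)) sign(x - S^(k,l) x)."""
    g = np.zeros_like(x)
    for k in range(-p, p + 1):
        for l in range(-p, p + 1):
            if k == 0 and l == 0:
                continue
            s = np.sign(x - np.roll(x, (k, l), axis=(0, 1)))
            g += alpha ** (abs(k) + abs(l)) * (s - np.roll(s, (-k, -l), axis=(0, 1)))
    return g

def deblur_btv(x0, y, blur, blur_T, lam=0.05, step=0.1, iters=20):
    """Steepest descent on ||H x - y||_1 + lam * BTV(x); `blur` and
    `blur_T` apply the (assumed known) blur operator H and its adjoint."""
    x = x0.copy()
    for _ in range(iters):
        x -= step * (blur_T(np.sign(blur(x) - y)) + lam * btv_gradient(x))
    return x
```

    Taking the median rather than the mean of the upsampled observations is what gives the initialization its robustness to outliers such as missing or jagged depth measurements.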

    FULL 3D RECONSTRUCTION OF DYNAMIC NON-RIGID SCENES: ACQUISITION AND ENHANCEMENT

    Recent advances in commodity depth or 3D sensing technologies have enabled us to move closer to the goal of accurately sensing and modeling the 3D representations of complex dynamic scenes. Indeed, in domains such as virtual reality, security, surveillance and e-health, there is now a greater demand for affordable and flexible vision systems which are capable of acquiring high-quality 3D reconstructions. Available commodity RGB-D cameras, though easily accessible, have a limited field-of-view, and acquire noisy and low-resolution measurements, which restricts their direct usage in building such vision systems. This thesis targets these limitations and builds approaches around commodity 3D sensing technologies to acquire noise-free and feature-preserving full 3D reconstructions of dynamic scenes containing static or moving, rigid or non-rigid objects. A mono-view system based on a single RGB-D camera is incapable of acquiring a full 360-degree 3D reconstruction of a dynamic scene instantaneously. For this purpose, a multi-view system composed of several RGB-D cameras covering the whole scene is used. The first part of this thesis explores how to correctly align the information acquired from RGB-D cameras in a multi-view system to provide full and textured 3D reconstructions of dynamic scenes instantaneously. This is achieved by solving the extrinsic calibration problem. This thesis proposes an extrinsic calibration framework which uses the 2D photometric and 3D geometric information acquired with RGB-D cameras, weighted according to their relative (in)accuracies in the presence of noise, in a single bi-objective optimization. An iterative scheme is also proposed which estimates the parameters of the noise model affecting both 2D and 3D measurements and solves the extrinsic calibration problem simultaneously. Results show improvement in calibration accuracy as compared to state-of-the-art methods. The second part of this thesis explores the enhancement of noisy and low-resolution 3D data acquired with commodity RGB-D cameras in both mono-view and multi-view systems. This thesis extends the state of the art in mono-view, template-free, recursive 3D data enhancement, which targets dynamic scenes containing rigid objects and thus requires tracking only the global motions of those objects for view-dependent surface representation and filtering. This thesis proposes to target dynamic scenes containing non-rigid objects, which introduces the complex requirements of tracking relatively large local motions and maintaining data organization for view-dependent surface representation. The proposed method is shown to be effective in handling non-rigid objects of changing topologies. Building upon the previous work, this thesis overcomes the requirement of data organization by proposing an approach based on a view-independent surface representation. View-independence decreases the complexity of the proposed algorithm and gives it the flexibility to process and enhance noisy data, acquired with multiple cameras in a multi-view system, simultaneously. Moreover, qualitative and quantitative experimental analysis shows this method to be more accurate in removing noise to produce enhanced 3D reconstructions of non-rigid objects. Although extending this method to a multi-view system would allow for obtaining instantaneous enhanced full 360-degree 3D reconstructions of non-rigid objects, it still lacks the ability to explicitly handle low-resolution data. Therefore, this thesis proposes a novel recursive dynamic multi-frame 3D super-resolution algorithm together with a novel 3D bilateral total variation regularization to filter out the noise, recover details and enhance the resolution of data acquired from commodity cameras in a multi-view system. Results show that this method is able to build accurate, smooth and feature-preserving full 360-degree 3D reconstructions of dynamic scenes containing non-rigid objects.
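    To make the weighted bi-objective calibration idea concrete, here is a minimal sketch assuming a simple inverse-variance weighting of 2D reprojection residuals against 3D point-alignment residuals. The pinhole model, the function names and the exact weighting are illustrative assumptions rather than the thesis' formulation; the iterative scheme mentioned above would alternate between re-estimating the noise variances from these residuals and re-minimizing the cost.

```python
# Hedged sketch of a weighted bi-objective extrinsic calibration cost:
# 2D photometric (reprojection) and 3D geometric residuals, weighted by
# the inverse variances of their respective noise models.
import numpy as np

def bi_objective_cost(R, t, K, pts_src, pts_dst, pix_dst, var2d, var3d):
    """Cost(R, t) = (1/var2d) * sum ||proj(K (R X + t)) - u||^2
                  + (1/var3d) * sum ||(R X + t) - Y||^2."""
    X = pts_src @ R.T + t                 # transform source 3D points (N, 3)
    uv = X @ K.T
    uv = uv[:, :2] / uv[:, 2:3]           # pinhole projection to pixels
    r2d = np.sum((uv - pix_dst) ** 2)     # photometric / reprojection term
    r3d = np.sum((X - pts_dst) ** 2)      # geometric alignment term
    return r2d / var2d + r3d / var3d
```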

    Computational Imaging Approach to Recovery of Target Coordinates Using Orbital Sensor Data

    This dissertation addresses the components necessary for simulation of an image-based recovery of the position of a target using orbital image sensors. Each component is considered in detail, focusing on the effect that design choices and system parameters have on the accuracy of the position estimate. Changes in sensor resolution, varying amounts of blur, differences in image noise level, selection of algorithms used for each component, and lag introduced by excessive processing time all contribute to the accuracy of the recovered target coordinates. Using physical targets and sensors in this scenario would be cost-prohibitive in the exploratory setting posed; therefore a simulated target path is generated using Bezier curves which approximate representative paths followed by the targets of interest. Orbital trajectories for the sensors are designed on an elliptical model representative of the motion of physical orbital sensors. Images from each sensor are simulated based on the position and orientation of the sensor, the position of the target, and the imaging parameters selected for the experiment (resolution, noise level, blur level, etc.). Post-processing of the simulated imagery seeks to reduce noise and blur and increase resolution. The only information available for calculating the target position by a fully implemented system is the sensor position and orientation vectors and the images from each sensor. From these data we develop a reliable method of recovering the target position and analyze the impact on near-real-time processing. We also discuss the influence of adjustments to system components on overall capabilities and address the potential system size, weight, and power requirements of realistic implementation approaches.
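    As a small illustration of the simulated-target idea, the sketch below samples a cubic Bezier curve as a stand-in for a representative target path; the control points are arbitrary placeholder values, not those used in the dissertation.

```python
# Illustrative sketch: a simulated target path as a cubic Bezier curve,
# B(t) = (1-t)^3 p0 + 3(1-t)^2 t p1 + 3(1-t) t^2 p2 + t^3 p3, t in [0, 1].
import numpy as np

def cubic_bezier(p0, p1, p2, p3, n=100):
    """Sample the cubic Bezier curve defined by four control points."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

# Hypothetical gently curving ground track (coordinates in km).
path = cubic_bezier(np.array([0.0, 0.0, 0.0]), np.array([10.0, 5.0, 0.1]),
                    np.array([20.0, 5.0, 0.2]), np.array([30.0, 0.0, 0.0]))
```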

    Data-Driven Image Restoration

    Every day many images are taken by digital cameras, and people demand visually accurate and pleasing results. Noise and blur degrade images captured by modern cameras, and high-level vision tasks (such as segmentation, recognition, and tracking) require high-quality images. Therefore, image restoration, specifically image deblurring and image denoising, is a critical preprocessing step. A fundamental problem in image deblurring is to reliably recover distinct spatial frequencies that have been suppressed by the blur kernel. Existing image deblurring techniques often rely on generic image priors that only help recover part of the frequency spectrum, such as the frequencies near the high end. To this end, we pose the following specific questions: (i) Does class-specific information offer an advantage over existing generic priors for image quality restoration? (ii) If a class-specific prior exists, how should it be encoded into a deblurring framework to recover attenuated image frequencies? Throughout this work, we devise a class-specific prior based on band-pass filter responses and incorporate it into a deblurring strategy. Specifically, we show that the subspace of band-pass filtered images and their intensity distributions serve as useful priors for recovering image frequencies. Next, we present a novel image denoising algorithm that uses an external, category-specific image database. In contrast to existing noisy-image restoration algorithms, our method selects clean image “support patches” similar to the noisy patch from an external database. We employ a content-adaptive distribution model for each patch, deriving the parameters of the distribution from the support patches. Our objective function is composed of a Gaussian fidelity term that imposes category-specific information, and a low-rank term that encourages similarity between the noisy and support patches in a robust manner. Finally, we propose to learn a fully convolutional network model that consists of a Chain of Identity Mapping Modules (CIMM) for image denoising. The CIMM structure possesses two distinctive features that are important for the noise-removal task. First, each residual unit employs identity mappings as the skip connections and receives pre-activated input, preserving the gradient magnitude propagated in both the forward and backward directions. Second, by utilizing dilated kernels for the convolution layers in the residual branch, each neuron in the last convolution layer of each module can observe the full receptive field of the first layer.
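    The dilated, pre-activated residual unit described in the last paragraph can be sketched in a few lines of PyTorch. The channel count, dilation rate and use of plain ReLU pre-activation below are guesses for illustration, not the paper's exact CIMM configuration.

```python
# Sketch of one pre-activated residual unit with dilated convolutions,
# in the spirit of the Chain of Identity Mapping Modules (CIMM).
import torch.nn as nn

class PreActDilatedResUnit(nn.Module):
    def __init__(self, channels=64, dilation=2):
        super().__init__()
        # Pre-activation: the nonlinearity sits inside the residual branch,
        # so the identity skip connection carries gradients unchanged in
        # both the forward and backward directions.
        self.body = nn.Sequential(
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
        )

    def forward(self, x):
        return x + self.body(x)  # identity mapping as the skip connection
```

    Stacking such units grows the receptive field quickly, which is what lets the last convolution layer of a module observe the full receptive field of its first layer.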

    Line-Field Based Adaptive Image Model for Blind Deblurring

    Ph.D. (Doctor of Philosophy)

    Artificial Intelligence in the Creative Industries: A Review

    This paper reviews the current state of the art in Artificial Intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided, including Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). We categorise creative applications into five groups related to how AI technologies are used: i) content creation, ii) information analysis, iii) content enhancement and post-production workflows, iv) information extraction and enhancement, and v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, machine learning-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of machine learning in domains with fewer constraints, where AI is the 'creator', remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human-centric: where it is designed to augment, rather than replace, human creativity.

    Evolution-Operator-Based Single-Step Method for Image Processing

    This work proposes an evolution-operator-based single-time-step method for image and signal processing. The key component of the proposed method is a local spectral evolution kernel (LSEK) that analytically integrates a class of evolution partial differential equations (PDEs). From the point of view of PDEs, the LSEK provides the analytical solution in a single time step, is of spectral accuracy, and is free of instability constraints. From the point of view of image/signal processing, the LSEK gives rise to a family of lowpass filters. These filters contain controllable time delay and amplitude scaling. The new evolution-operator-based method is constructed by pointwise adaptation of anisotropy in the coefficients of the LSEK. Perona-Malik-type anisotropic diffusion schemes are incorporated in the LSEK for image denoising. A forward-backward diffusion process is adopted in the LSEK for image deblurring or sharpening. A coupled PDE system is modified for image edge detection, and the resulting image edges are utilized for image enhancement. Extensive computer experiments are carried out to demonstrate the performance of the proposed method. Its major advantages are its single-step solution and its readiness for multidimensional data analysis.
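    The "analytical solution in a single time step" is easiest to see in the constant-coefficient case sketched below: for the linear heat equation u_t = c Δu, the evolution operator is exactly exp(-c |k|^2 t) in the Fourier domain, so diffusing to any time t is one spectral filtering step. The LSEK generalizes this to locally varying, anisotropic coefficients; the simplification here is ours, not the paper's kernel.

```python
# Single-step evolution sketch: solve u_t = c * Laplacian(u) to time t
# with one Fourier-domain multiplication by exp(-c * |k|^2 * t), i.e. a
# lowpass filter, instead of many small explicit PDE time steps.
import numpy as np

def single_step_heat(image, t, c=1.0):
    ky = np.fft.fftfreq(image.shape[0]) * 2 * np.pi   # angular frequencies
    kx = np.fft.fftfreq(image.shape[1]) * 2 * np.pi
    k2 = ky[:, None] ** 2 + kx[None, :] ** 2          # |k|^2 on the grid
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.exp(-c * k2 * t)))
```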