1,838 research outputs found

    Photometric Depth Super-Resolution

    Full text link
    This study explores the use of photometric techniques (shape-from-shading and uncalibrated photometric stereo) for upsampling the low-resolution depth map from an RGB-D sensor to the higher resolution of the companion RGB image. A single-shot variational approach is first put forward, which is effective as long as the target's reflectance is piecewise-constant. It is then shown that this dependency upon a specific reflectance model can be relaxed by focusing on a specific class of objects (e.g., faces), and delegate reflectance estimation to a deep neural network. A multi-shot strategy based on randomly varying lighting conditions is eventually discussed. It requires no training or prior on the reflectance, yet this comes at the price of a dedicated acquisition setup. Both quantitative and qualitative evaluations illustrate the effectiveness of the proposed methods on synthetic and real-world scenarios.Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2019. First three authors contribute equall

    Depth Enhancement and Surface Reconstruction with RGB/D Sequence

    Get PDF
    Surface reconstruction and 3D modeling is a challenging task, which has been explored for decades by the computer vision, computer graphics, and machine learning communities. It is fundamental to many applications such as robot navigation, animation and scene understanding, industrial control and medical diagnosis. In this dissertation, I take advantage of the consumer depth sensors for surface reconstruction. Considering its limited performance on capturing detailed surface geometry, a depth enhancement approach is proposed in the first place to recovery small and rich geometric details with captured depth and color sequence. In addition to enhancing its spatial resolution, I present a hybrid camera to improve the temporal resolution of consumer depth sensor and propose an optimization framework to capture high speed motion and generate high speed depth streams. Given the partial scans from the depth sensor, we also develop a novel fusion approach to build up complete and watertight human models with a template guided registration method. Finally, the problem of surface reconstruction for non-Lambertian objects, on which the current depth sensor fails, is addressed by exploiting multi-view images captured with a hand-held color camera and we propose a visual hull based approach to recovery the 3D model

    Recent Advances in Image Restoration with Applications to Real World Problems

    Get PDF
    In the past few decades, imaging hardware has improved tremendously in terms of resolution, making widespread usage of images in many diverse applications on Earth and planetary missions. However, practical issues associated with image acquisition are still affecting image quality. Some of these issues such as blurring, measurement noise, mosaicing artifacts, low spatial or spectral resolution, etc. can seriously affect the accuracy of the aforementioned applications. This book intends to provide the reader with a glimpse of the latest developments and recent advances in image restoration, which includes image super-resolution, image fusion to enhance spatial, spectral resolution, and temporal resolutions, and the generation of synthetic images using deep learning techniques. Some practical applications are also included

    A Deep Learning Framework in Selected Remote Sensing Applications

    Get PDF
    The main research topic is designing and implementing a deep learning framework applied to remote sensing. Remote sensing techniques and applications play a crucial role in observing the Earth evolution, especially nowadays, where the effects of climate change on our life is more and more evident. A considerable amount of data are daily acquired all over the Earth. Effective exploitation of this information requires the robustness, velocity and accuracy of deep learning. This emerging need inspired the choice of this topic. The conducted studies mainly focus on two European Space Agency (ESA) missions: Sentinel 1 and Sentinel 2. Images provided by the ESA Sentinel-2 mission are rapidly becoming the main source of information for the entire remote sensing community, thanks to their unprecedented combination of spatial, spectral and temporal resolution, as well as their open access policy. The increasing interest gained by these satellites in the research laboratory and applicative scenarios pushed us to utilize them in the considered framework. The combined use of Sentinel 1 and Sentinel 2 is crucial and very prominent in different contexts and different kinds of monitoring when the growing (or changing) dynamics are very rapid. Starting from this general framework, two specific research activities were identified and investigated, leading to the results presented in this dissertation. Both these studies can be placed in the context of data fusion. The first activity deals with a super-resolution framework to improve Sentinel 2 bands supplied at 20 meters up to 10 meters. Increasing the spatial resolution of these bands is of great interest in many remote sensing applications, particularly in monitoring vegetation, rivers, forests, and so on. The second topic of the deep learning framework has been applied to the multispectral Normalized Difference Vegetation Index (NDVI) extraction, and the semantic segmentation obtained fusing Sentinel 1 and S2 data. The S1 SAR data is of great importance for the quantity of information extracted in the context of monitoring wetlands, rivers and forests, and many other contexts. In both cases, the problem was addressed with deep learning techniques, and in both cases, very lean architectures were used, demonstrating that even without the availability of computing power, it is possible to obtain high-level results. The core of this framework is a Convolutional Neural Network (CNN). {CNNs have been successfully applied to many image processing problems, like super-resolution, pansharpening, classification, and others, because of several advantages such as (i) the capability to approximate complex non-linear functions, (ii) the ease of training that allows to avoid time-consuming handcraft filter design, (iii) the parallel computational architecture. Even if a large amount of "labelled" data is required for training, the CNN performances pushed me to this architectural choice.} In our S1 and S2 integration task, we have faced and overcome the problem of manually labelled data with an approach based on integrating these two different sensors. Therefore, apart from the investigation in Sentinel-1 and Sentinel-2 integration, the main contribution in both cases of these works is, in particular, the possibility of designing a CNN-based solution that can be distinguished by its lightness from a computational point of view and consequent substantial saving of time compared to more complex deep learning state-of-the-art solutions

    Exploring the Internal Statistics: Single Image Super-Resolution, Completion and Captioning

    Full text link
    Image enhancement has drawn increasingly attention in improving image quality or interpretability. It aims to modify images to achieve a better perception for human visual system or a more suitable representation for further analysis in a variety of applications such as medical imaging, remote sensing, and video surveillance. Based on different attributes of the given input images, enhancement tasks vary, e.g., noise removal, deblurring, resolution enhancement, prediction of missing pixels, etc. The latter two are usually referred to as image super-resolution and image inpainting (or completion). Image super-resolution and completion are numerically ill-posed problems. Multi-frame-based approaches make use of the presence of aliasing in multiple frames of the same scene. For cases where only one input image is available, it is extremely challenging to estimate the unknown pixel values. In this dissertation, we target at single image super-resolution and completion by exploring the internal statistics within the input image and across scales. An internal gradient similarity-based single image super-resolution algorithm is first presented. Then we demonstrate that the proposed framework could be naturally extended to accomplish super-resolution and completion simultaneously. Afterwards, a hybrid learning-based single image super-resolution approach is proposed to benefit from both external and internal statistics. This framework hinges on image-level hallucination from externally learned regression models as well as gradient level pyramid self-awareness for edges and textures refinement. The framework is then employed to break the resolution limitation of the passive microwave imagery and to boost the tracking accuracy of the sea ice movements. To extend our research to the quality enhancement of the depth maps, a novel system is presented to handle circumstances where only one pair of registered low-resolution intensity and depth images are available. High quality RGB and depth images are generated after the system. Extensive experimental results have demonstrated the effectiveness of all the proposed frameworks both quantitatively and qualitatively. Different from image super-resolution and completion which belong to low-level vision research, image captioning is a high-level vision task related to the semantic understanding of an input image. It is a natural task for human beings. However, image captioning remains challenging from a computer vision point of view especially due to the fact that the task itself is ambiguous. In principle, descriptions of an image can talk about any visual aspects in it varying from object attributes to scene features, or even refer to objects that are not depicted and the hidden interaction or connection that requires common sense knowledge to analyze. Therefore, learning-based image captioning is in general a data-driven task, which relies on the training dataset. Descriptions in the majority of the existing image-sentence datasets are generated by humans under specific instructions. Real-world sentence data is rarely directly utilized for training since it is sometimes noisy and unbalanced, which makes it ‘imperfect’ for the training of the image captioning task. In this dissertation, we present a novel image captioning framework to deal with the uncontrolled image-sentence dataset where descriptions could be strongly or weakly correlated to the image content and in arbitrary lengths. A self-guiding learning process is proposed to fully reveal the internal statistics of the training dataset and to look into the learning process in a global way and generate descriptions that are syntactically correct and semantically sound
    • …
    corecore