463 research outputs found

    Visibility in underwater robotics: Benchmarking and single image dehazing

    Dealing with underwater visibility is one of the most important challenges in autonomous underwater robotics. Light transmission in the water medium degrades images, making interpretation of the scene difficult and consequently compromising the whole intervention. This thesis contributes by analysing, through benchmarking, the impact of underwater image degradation on commonly used vision algorithms. An online framework for underwater research that makes it possible to analyse results under different conditions is presented. Finally, motivated by the results of experimentation with the developed framework, a deep learning solution is proposed that is capable of dehazing a degraded image in real time, restoring the original colors of the image.
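    Single-image dehazing methods such as the one described above are commonly framed as inverting the Koschmieder atmospheric scattering model, I = J·t + A·(1 − t). A minimal NumPy sketch of that forward model and its direct inversion (the function names are illustrative, not from the thesis; a learned dehazer effectively estimates t and A, or J directly, from the image alone):

```python
import numpy as np

def synthesize_haze(J, t, A):
    """Koschmieder scattering model: observed I = J*t + A*(1 - t),
    where J is the clear scene, t the transmission map, A the airlight."""
    return J * t + A * (1.0 - t)

def dehaze(I, t, A, t_min=0.1):
    """Direct inversion: J = (I - A) / max(t, t_min) + A.
    Clamping t avoids amplifying noise where transmission is tiny."""
    return (I - A) / np.maximum(t, t_min) + A
```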

    Single Image Super-Resolution Using a Deep Encoder-Decoder Symmetrical Network with Iterative Back Projection

    Image super-resolution (SR) usually refers to reconstructing a high resolution (HR) image from a low resolution (LR) image without losing high frequency details or reducing the image quality. Recently, image SR based on a convolutional neural network (SRCNN) was proposed and has received much attention due to its end-to-end mapping simplicity and superior performance. This method, however, uses only three convolution layers to learn the mapping from LR to HR, usually converges slowly, and significantly reduces the size of the output image. To address these issues, in this work we propose a novel deep encoder-decoder symmetrical neural network (DEDSN) for single image SR. This deep network is composed entirely of symmetrical layers of convolution and deconvolution, with no pooling (down-sampling or up-sampling) operations anywhere in the network, so that the image detail degradation that occurs in traditional convolutional frameworks is prevented. Additionally, in view of the success of the iterative back projection (IBP) algorithm in image SR, we further combine DEDSN with a network realization of IBP. The new DEDSN-IBP model introduces a down-sampled version of the ground truth image and calculates the simulation error as prior guidance. Experimental results on benchmark data sets demonstrate that the proposed DEDSN model achieves better performance than SRCNN and that the improved DEDSN-IBP outperforms the reported state-of-the-art methods.
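    The IBP component combined with DEDSN can be illustrated independently of the network: repeatedly simulate the LR image from the current HR estimate and back-project the simulation error. A minimal NumPy sketch, where the down/upsampling operators are simple stand-ins rather than the paper's learned ones (shapes are assumed divisible by the scale factor):

```python
import numpy as np

def downsample(x, s):
    """Average-pool downsampling by factor s (a stand-in for the
    paper's degradation operator)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(x, s):
    """Nearest-neighbour upsampling by factor s."""
    return np.repeat(np.repeat(x, s, axis=0), s, axis=1)

def iterative_back_projection(lr, scale=2, n_iter=10, step=1.0):
    """Refine an HR estimate until its simulated LR matches the input:
    err = lr - downsample(hr); hr += step * upsample(err)."""
    hr = upsample(lr, scale)                   # initial estimate
    for _ in range(n_iter):
        err = lr - downsample(hr, scale)       # simulation error in LR space
        hr = hr + step * upsample(err, scale)  # project the error back
    return hr
```

After convergence the HR estimate is consistent with the observed LR image under the assumed degradation, which is the prior guidance DEDSN-IBP exploits.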

    Volumetric performance capture from minimal camera viewpoints

    We present a convolutional autoencoder that enables high-fidelity volumetric reconstructions of human performance to be captured from multi-view video comprising only a small set of camera views. Our method yields an end-to-end reconstruction error similar to that of a probabilistic visual hull computed using significantly more (double or more) viewpoints. We use a deep prior implicitly learned by the autoencoder, trained over a dataset of view-ablated multi-view video footage of a wide range of subjects and actions. This opens up the possibility of high-end volumetric performance capture in on-set and prosumer scenarios where time or cost prohibit a high witness-camera count.
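    The view-ablation augmentation mentioned above can be sketched as a simple masking step over a stack of camera views; this is an illustrative stand-in for the paper's training pipeline, not its actual code:

```python
import numpy as np

def ablate_views(views, n_keep, rng):
    """Zero out all but n_keep randomly chosen camera views, mimicking
    the view-ablation augmentation used to learn the deep prior.
    views: array of shape (n_cameras, H, W)."""
    n = views.shape[0]
    keep = np.zeros(n, dtype=bool)
    keep[rng.choice(n, size=n_keep, replace=False)] = True
    ablated = np.where(keep[:, None, None], views, 0.0)
    return ablated, keep
```

Training the autoencoder on such ablated inputs against full-view reconstructions is what lets it later reconstruct from only a few witness cameras.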

    General Purpose Audio Effect Removal

    Although the design and application of audio effects is well understood, the inverse problem of removing these effects is significantly more challenging and far less studied. Recently, deep learning has been applied to audio effect removal; however, existing approaches have focused on narrow formulations considering only one effect or source type at a time. In realistic scenarios, multiple effects are applied with varying source content. This motivates a more general task, which we refer to as general purpose audio effect removal. We developed a dataset for this task using five audio effects across four different sources and used it to train and evaluate a set of existing architectures. We found that no single model performed optimally on all effect types and sources. To address this, we introduced RemFX, an approach designed to mirror the compositionality of applied effects. We first trained a set of the best-performing effect-specific removal models and then leveraged an audio effect classification model to dynamically construct a graph of our models at inference. We found our approach to outperform single-model baselines, although examples with many effects present remain challenging.
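    The dynamic composition at inference can be sketched as a classifier-driven chain of per-effect models. Everything below is a hypothetical stand-in: the real RemFX models are neural networks, and the real classifier is learned from audio:

```python
import numpy as np

# Hypothetical effect-specific removal models; simple array operations
# here stand in for the best-performing per-effect neural networks.
REMOVAL_MODELS = {
    "distortion": lambda x: np.clip(x, -0.5, 0.5),
    "reverb":     lambda x: 0.9 * x,
    "chorus":     lambda x: x - 0.01,
}

def classify_effects(audio):
    """Stand-in for the audio effect classification model that detects
    which effects are present in the recording."""
    return ["reverb", "distortion"]

def remfx_remove(audio):
    """Dynamically chain the removal models for the detected effects."""
    for effect in classify_effects(audio):
        audio = REMOVAL_MODELS[effect](audio)
    return audio
```

Only the models for detected effects run, which is what lets the approach scale across varying effect combinations.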

    Holistic Attention-Fusion Adversarial Network for Single Image Defogging

    Adversarial learning-based image defogging methods have been extensively studied in computer vision due to their remarkable performance. However, most existing methods have limited defogging capabilities for real cases because they are trained on paired clear and synthesized foggy images of the same scenes. In addition, they have limitations in preserving vivid color and rich texture details when defogging. To address these issues, we develop a novel generative adversarial network, called the holistic attention-fusion adversarial network (HAAN), for single image defogging. HAAN consists of a Fog2Fogfree block and a Fogfree2Fog block. In each block, there are three learning-based modules, namely fog removal, color-texture recovery, and fog synthesis, that constrain each other to generate high-quality images. HAAN is designed to exploit the self-similarity of texture and structure information by learning the holistic channel-spatial feature correlations between a foggy image and its several derived images. Moreover, in the fog synthesis module, we utilize the atmospheric scattering model to guide generation, improving quality by focusing on atmospheric light optimization with a novel sky segmentation network. Extensive experiments on both synthetic and real-world datasets show that HAAN outperforms state-of-the-art defogging methods in terms of quantitative accuracy and subjective visual quality.
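    The paired Fog2Fogfree and Fogfree2Fog blocks imply a cycle constraint: defogging and then re-fogging an image should reconstruct it. A minimal sketch of that L1 cycle-consistency loss, with plain callables standing in for the learned generators (this is an assumption about the training objective's general shape, not HAAN's exact loss):

```python
import numpy as np

def cycle_consistency_loss(foggy, fog2fogfree, fogfree2fog):
    """L1 cycle loss: defogging then re-fogging should reconstruct the
    input, mirroring the paired Fog2Fogfree / Fogfree2Fog blocks.
    The two callables stand in for the learned generators."""
    reconstructed = fogfree2fog(fog2fogfree(foggy))
    return float(np.mean(np.abs(foggy - reconstructed)))
```

Such a constraint is what lets the network train on unpaired real foggy images, addressing the synthetic-pair limitation noted above.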