463 research outputs found
Visibility in underwater robotics: Benchmarking and single image dehazing
Dealing with limited underwater visibility is one of the most important challenges in autonomous underwater robotics. Light transmission in the water medium degrades images, making interpretation of the scene difficult and consequently compromising the whole intervention. This thesis contributes by analysing, through benchmarking, the impact of underwater image degradation on commonly used vision algorithms. An online framework for underwater research is presented that makes it possible to analyse results under different conditions. Finally, motivated by the results of experimentation with the developed framework, a deep learning solution is proposed that is capable of dehazing a degraded image in real time, restoring the original colors of the image.
Single Image Super-Resolution Using a Deep Encoder-Decoder Symmetrical Network with Iterative Back Projection
Image super-resolution (SR) usually refers to reconstructing a high-resolution (HR) image from a low-resolution (LR) image without losing high-frequency details or reducing image quality. Recently, image SR based on a convolutional neural network (SRCNN) was proposed and has received much attention due to its end-to-end mapping simplicity and superior performance. However, because this method uses only three convolution layers to learn the mapping from LR to HR, it usually converges slowly and significantly reduces the size of the output image. To address these issues, we propose a novel deep encoder-decoder symmetrical neural network (DEDSN) for single image SR. This deep network is composed entirely of symmetrical layers of convolution and deconvolution, with no pooling (down-sampling or up-sampling) operations anywhere in the network, so the image-detail degradation that occurs in traditional convolutional frameworks is prevented. Additionally, in view of the success of the iterative back projection (IBP) algorithm in image SR, we further combine DEDSN with a network realization of IBP. The new DEDSN-IBP model introduces the down-sampled version of the ground-truth image and calculates the simulation error as the prior guidance. Experimental results on benchmark data sets demonstrate that the proposed DEDSN model achieves better performance than SRCNN and that the improved DEDSN-IBP outperforms the reported state-of-the-art methods.
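The classical iterative back projection idea the abstract builds on can be sketched as follows. This is a minimal illustrative NumPy version, not the paper's learned network realization; the block-average `downsample` and nearest-neighbour `upsample` helpers are simple stand-ins for the true degradation and back-projection operators:

```python
import numpy as np

def downsample(img, s):
    # Block-average downsampling by an integer factor s.
    h, w = img.shape
    return img[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(img, s):
    # Nearest-neighbour upsampling by an integer factor s.
    return np.repeat(np.repeat(img, s, axis=0), s, axis=1)

def iterative_back_projection(lr, s, n_iters=20, step=1.0):
    """Refine an HR estimate so that its simulated LR version matches the LR input."""
    hr = upsample(lr, s)  # initial HR estimate
    for _ in range(n_iters):
        error = lr - downsample(hr, s)        # simulation error in LR space
        hr = hr + step * upsample(error, s)   # project the error back to HR space
    return hr
```

The "simulation error as prior guidance" in DEDSN-IBP plays the same role as `error` here: it measures how inconsistent the current HR estimate is with the observed LR image.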
Volumetric performance capture from minimal camera viewpoints
We present a convolutional autoencoder that enables high-fidelity volumetric
reconstructions of human performance to be captured from multi-view video
comprising only a small set of camera views. Our method yields similar
end-to-end reconstruction error to that of a probabilistic visual hull computed
using significantly more (double or more) viewpoints. We use a deep prior
implicitly learned by the autoencoder trained over a dataset of view-ablated
multi-view video footage of a wide range of subjects and actions. This opens up
the possibility of high-end volumetric performance capture in on-set and
prosumer scenarios where time or cost prohibit a high witness camera count.
General Purpose Audio Effect Removal
Although the design and application of audio effects is well understood, the inverse problem of removing these effects is significantly more challenging and far less studied. Recently, deep learning has been applied to audio effect removal; however, existing approaches have focused on narrow formulations considering only one effect or source type at a time. In realistic scenarios, multiple effects are applied with varying source content. This motivates a more general task, which we refer to as general purpose audio effect removal. We developed a dataset for this task using five audio effects across four different sources and used it to train and evaluate a set of existing architectures. We found that no single model performed optimally on all effect types and sources. To address this, we introduced RemFX, an approach designed to mirror the compositionality of applied effects. We first trained a set of the best-performing effect-specific removal models and then leveraged an audio effect classification model to dynamically construct a graph of our models at inference. We found our approach to outperform single-model baselines, although examples with many effects present remain challenging.
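The compositional inference strategy described above — a classifier selecting effect-specific removal models that are chained dynamically — might be sketched as follows. Everything here is a hypothetical stand-in (the fixed classifier output and the toy "models" are placeholders, not the actual RemFX networks); the point is only the control flow:

```python
def classify_effects(audio):
    # Stand-in for a learned audio-effect classifier; here a fixed prediction.
    return ["distortion", "reverb"]

# Registry of effect-specific removal models (toy stand-ins for trained networks).
REMOVAL_MODELS = {
    "distortion": lambda x: [v * 0.5 for v in x],
    "reverb":     lambda x: [v - 0.25 for v in x],
}

def remfx_remove(audio):
    """Dynamically chain the removal models for the effects detected in the input."""
    for effect in classify_effects(audio):
        audio = REMOVAL_MODELS[effect](audio)
    return audio
```

In the real system each entry in the registry would be a trained neural network, and the classifier's per-example predictions determine which models run and in what order.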
Quantitative Magnetic Resonance Imaging and Analysis of Articular Cartilage and Osteoarthritis
MRI plays an important role in the continuing search for a sensitive osteoarthritis (OA) imaging biomarker able to detect early, pre-morphological alterations in cartilage composition. Determining the compositional recovery pattern of cartilage following acute joint loading could potentially provide a more sensitive biomarker for defining cartilage health [1]. However, only a limited number of studies have assessed both the immediate effect of joint loading on cartilage and its post-loading recovery. In addition, when assessing the compositional responses of cartilage to joint loading, previous studies usually did not incorporate the measurement error of the quantitative MRI technique used into their analysis. Therefore, it remains uncertain whether compositional MRI techniques are sensitive enough to measure changes in the water and macromolecular content of cartilage, or whether previous studies were merely measuring noise. Consequently, an objective of this thesis is to increase our understanding of, and confidence in, quantitative T2 and T1ρ relaxation time mapping for detecting compositional responses of cartilage following a joint loading activity.
Furthermore, obtaining quantitative morphological and compositional measures of cartilage requires detailed region-specific delineation of the cartilage. This delineation (or segmentation) is laborious and time-consuming, as it is usually performed manually by an expert observer. Many recent advances in image analysis, particularly in convolutional neural networks (CNNs) and deep learning, have enabled a time-efficient semi- or fully-automated alternative to this process [2, 3]. This thesis explores the utility of deep-CNN-generated segmentations for accurate surface-based analysis of cartilage morphology and composition from knee MRIs, as well as of cortical bone thickness from knee CTs.
Chapter 1 provides an introduction to the structure and biomechanics of articular cartilage, the role of MRI in imaging the degenerative joint disorder osteoarthritis, and the effects of different joint loading activities on cartilage morphology and composition.
Chapter 2 explains the principle of MRI and the pulse sequences used in the following chapter for the morphometric and compositional assessment of articular cartilage.
Chapter 3 describes the use of 3D Cartilage Surface Mapping (3D-CaSM) [3] to assess variations in cartilage T1ρ and T2 relaxation times of young, healthy participants following a mild, unilateral stepping activity. By evaluating and incorporating the intrasessional repeatability of the T1ρ and T2 mapping techniques, I aim to highlight those cartilage areas experiencing exercise-induced compositional changes greater than measurement error.
A significant amount of time is needed to manually segment the regions of interest required for the 3D-CaSM used in Chapter 3. Therefore, in Chapter 4, I assessed the use of deep convolutional neural networks to automate the segmentation process for multiple knee joint tissues simultaneously and increase the time-efficiency of evaluating knee MR datasets. I evaluated a conditional Generative Adversarial Network (cGAN) as a potentially improved method for automated segmentation compared with the widely used convolutional neural network U-Net.
In Chapter 5, I combined the 3D-CaSM and automated segmentation methods presented in Chapters 3 and 4, respectively, to assess the use of fully automatic segmentations of femoral and tibial bone-cartilage structures for accurate surface-based analysis of cartilage morphology and composition on knee MR images. This was performed on publicly available data from the Osteoarthritis Initiative, a multicentre observational study, with expert manual segmentations provided by the Zuse Institute in Berlin.
Chapter 6 describes an automated pipeline for subchondral cortical bone thickness mapping from knee CT data. I developed a method of using automated segmentations of articular cartilage and bone from knee MRI data to determine the periarticular bone surface which is covered by cartilage. This surface was then used to perform cortical bone thickness measurements on corresponding CT data. I validated this pipeline using data from the EU-funded, multi-centre observational study called Applied Private-Public partneRship enabling OsteoArthritis Clinical Headway (APPROACH).
Chapter 7 summarises the main conclusions and contributions of the work presented in this thesis and provides directions for future work.
PhD studentship funded by GlaxoSmithKline.
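The quantitative T2 relaxation time mapping central to this thesis rests on fitting a mono-exponential signal decay, S(TE) = S0 · exp(−TE / T2), to the signal measured at several echo times in each voxel. A minimal log-linear least-squares sketch of that per-voxel fit (illustrative only — not the thesis's actual mapping pipeline, which must also handle noise and measurement error):

```python
import math

def fit_t2(te_values, signals):
    """Estimate (S0, T2) from signals at multiple echo times TE by a
    log-linear least-squares fit of S(TE) = S0 * exp(-TE / T2)."""
    xs = te_values
    ys = [math.log(s) for s in signals]  # ln S = ln S0 - TE / T2, linear in TE
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    t2 = -1.0 / slope                    # slope of ln S vs TE is -1/T2
    s0 = math.exp(my - slope * mx)       # intercept gives ln S0
    return s0, t2
```

Repeating this fit voxel-by-voxel yields the T2 map; T1ρ mapping is analogous, with the spin-lock time taking the place of TE.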
Holistic Attention-Fusion Adversarial Network for Single Image Defogging
Adversarial learning-based image defogging methods have been extensively
studied in computer vision due to their remarkable performance. However, most
existing methods have limited defogging capabilities for real cases because
they are trained on paired clear and synthesized foggy images of the same
scenes. In addition, they have limitations in preserving vivid colors and rich
textural details in defogging. To address these issues, we develop a novel
generative adversarial network, called the holistic attention-fusion
adversarial network (HAAN), for single image defogging. HAAN consists of a
Fog2Fogfree block and a Fogfree2Fog block. Each block contains three
learning-based modules, namely fog removal, color-texture recovery, and fog
synthesis, which constrain each other to generate high-quality images. HAAN is
designed to exploit the self-similarity of texture and structure information
by learning the holistic channel-spatial feature correlations between a foggy
image and several of its derived images. Moreover, in the fog synthesis
module, we utilize the atmospheric scattering model to guide the improvement
of generative quality by focusing on atmospheric light optimization with a
novel sky segmentation network. Extensive experiments on both synthetic and
real-world datasets show that HAAN outperforms state-of-the-art defogging
methods in terms of quantitative accuracy and subjective visual quality.
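The atmospheric scattering model referenced in the abstract is commonly written as I(x) = J(x)·t(x) + A·(1 − t(x)), where I is the foggy image, J the clear scene, A the atmospheric light, and t(x) = exp(−β·d(x)) the transmission for scene depth d. A minimal sketch of synthesizing fog with the model and inverting it (purely illustrative — HAAN's learned modules are guided by this model rather than applying it directly, and `beta`/`airlight` here are arbitrary example values):

```python
import numpy as np

def synthesize_fog(clear, depth, beta=1.0, airlight=0.9):
    """Apply the atmospheric scattering model I = J * t + A * (1 - t),
    with transmission t = exp(-beta * depth)."""
    t = np.exp(-beta * depth)
    return clear * t + airlight * (1.0 - t)

def dehaze(foggy, depth, beta=1.0, airlight=0.9):
    """Invert the model to recover the clear scene: J = (I - A * (1 - t)) / t."""
    t = np.exp(-beta * depth)
    return (foggy - airlight * (1.0 - t)) / t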
- …