12 research outputs found

    3D object reconstruction using computer vision : reconstruction and characterization applications for external human anatomical structures

    Get PDF
    Tese de doutoramento. Engenharia Informática. Faculdade de Engenharia. Universidade do Porto. 201

    Hyperspectral benthic mapping from underwater robotic platforms

    Get PDF
    We live on a planet of vast oceans; 70% of the Earth's surface is covered in water. They are integral to supporting life, providing 99% of the inhabitable space on Earth. Our oceans and the habitats within them are under threat due to a variety of factors. To understand the impacts and possible solutions, the monitoring of marine habitats is critically important. Optical imaging as a method for monitoring can provide a vast array of information however imaging through water is complex. To compensate for the selective attenuation of light in water, this thesis presents a novel light propagation model and illustrates how it can improve optical imaging performance. An in-situ hyperspectral system is designed which comprised of two upward looking spectrometers at different positions in the water column. The downwelling light in the water column is continuously sampled by the system which allows for the generation of a dynamic water model. In addition to the two upward looking spectrometers the in-situ system contains an imaging module which can be used for imaging of the seafloor. It consists of a hyperspectral sensor and a trichromatic stereo camera. New calibration methods are presented for the spatial and spectral co-registration of the two optical sensors. The water model is used to create image data which is invariant to the changing optical properties of the water and changing environmental conditions. In this thesis the in-situ optical system is mounted onboard an Autonomous Underwater Vehicle. Data from the imaging module is also used to classify seafloor materials. The classified seafloor patches are integrated into a high resolution 3D benthic map of the surveyed site. Given the limited imaging resolution of the hyperspectral sensor used in this work, a new method is also presented that uses information from the co-registered colour images to inform a new spectral unmixing method to resolve subpixel materials

    Towards PACE-CAD Systems

    Get PDF
    Despite phenomenal advancements in the availability of medical image datasets and the development of modern classification algorithms, Computer-Aided Diagnosis (CAD) has had limited practical exposure in the real-world clinical workflow. This is primarily because of the inherently demanding and sensitive nature of medical diagnosis that can have far-reaching and serious repercussions in case of misdiagnosis. In this work, a paradigm called PACE (Pragmatic, Accurate, Confident, & Explainable) is presented as a set of some of must-have features for any CAD. Diagnosis of glaucoma using Retinal Fundus Images (RFIs) is taken as the primary use case for development of various methods that may enrich an ordinary CAD system with PACE. However, depending on specific requirements for different methods, other application areas in ophthalmology and dermatology have also been explored. Pragmatic CAD systems refer to a solution that can perform reliably in day-to-day clinical setup. In this research two, of possibly many, aspects of a pragmatic CAD are addressed. Firstly, observing that the existing medical image datasets are small and not representative of images taken in the real-world, a large RFI dataset for glaucoma detection is curated and published. Secondly, realising that a salient attribute of a reliable and pragmatic CAD is its ability to perform in a range of clinically relevant scenarios, classification of 622 unique cutaneous diseases in one of the largest publicly available datasets of skin lesions is successfully performed. Accuracy is one of the most essential metrics of any CAD system's performance. Domain knowledge relevant to three types of diseases, namely glaucoma, Diabetic Retinopathy (DR), and skin lesions, is industriously utilised in an attempt to improve the accuracy. For glaucoma, a two-stage framework for automatic Optic Disc (OD) localisation and glaucoma detection is developed, which marked new state-of-the-art for glaucoma detection and OD localisation. To identify DR, a model is proposed that combines coarse-grained classifiers with fine-grained classifiers and grades the disease in four stages with respect to severity. Lastly, different methods of modelling and incorporating metadata are also examined and their effect on a model's classification performance is studied. Confidence in diagnosing a disease is equally important as the diagnosis itself. One of the biggest reasons hampering the successful deployment of CAD in the real-world is that medical diagnosis cannot be readily decided based on an algorithm's output. Therefore, a hybrid CNN architecture is proposed with the convolutional feature extractor trained using point estimates and a dense classifier trained using Bayesian estimates. Evaluation on 13 publicly available datasets shows the superiority of this method in terms of classification accuracy and also provides an estimate of uncertainty for every prediction. Explainability of AI-driven algorithms has become a legal requirement after Europe’s General Data Protection Regulations came into effect. This research presents a framework for easy-to-understand textual explanations of skin lesion diagnosis. The framework is called ExAID (Explainable AI for Dermatology) and relies upon two fundamental modules. The first module uses any deep skin lesion classifier and performs detailed analysis on its latent space to map human-understandable disease-related concepts to the latent representation learnt by the deep model. The second module proposes Concept Localisation Maps, which extend Concept Activation Vectors by locating significant regions corresponding to a learned concept in the latent space of a trained image classifier. This thesis probes many viable solutions to equip a CAD system with PACE. However, it is noted that some of these methods require specific attributes in datasets and, therefore, not all methods may be applied on a single dataset. Regardless, this work anticipates that consolidating PACE into a CAD system can not only increase the confidence of medical practitioners in such tools but also serve as a stepping stone for the further development of AI-driven technologies in healthcare

    Advanced Image Acquisition, Processing Techniques and Applications

    Get PDF
    "Advanced Image Acquisition, Processing Techniques and Applications" is the first book of a series that provides image processing principles and practical software implementation on a broad range of applications. The book integrates material from leading researchers on Applied Digital Image Acquisition and Processing. An important feature of the book is its emphasis on software tools and scientific computing in order to enhance results and arrive at problem solution

    Towards an efficient, unsupervised and automatic face detection system for unconstrained environments

    Get PDF
    Nowadays, there is growing interest in face detection applications for unconstrained environments. The increasing need for public security and national security motivated our research on the automatic face detection system. For public security surveillance applications, the face detection system must be able to cope with unconstrained environments, which includes cluttered background and complicated illuminations. Supervised approaches give very good results on constrained environments, but when it comes to unconstrained environments, even obtaining all the training samples needed is sometimes impractical. The limitation of supervised approaches impels us to turn to unsupervised approaches. In this thesis, we present an efficient and unsupervised face detection system, which is feature and configuration based. It combines geometric feature detection and local appearance feature extraction to increase stability and performance of the detection process. It also contains a novel adaptive lighting compensation approach to normalize the complicated illumination in real life environments. We aim to develop a system that has as few assumptions as possible from the very beginning, is robust and exploits accuracy/complexity trade-offs as much as possible. Although our attempt is ambitious for such an ill posed problem-we manage to tackle it in the end with very few assumptions.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Connecting mathematical models for image processing and neural networks

    Get PDF
    This thesis deals with the connections between mathematical models for image processing and deep learning. While data-driven deep learning models such as neural networks are flexible and well performing, they are often used as a black box. This makes it hard to provide theoretical model guarantees and scientific insights. On the other hand, more traditional, model-driven approaches such as diffusion, wavelet shrinkage, and variational models offer a rich set of mathematical foundations. Our goal is to transfer these foundations to neural networks. To this end, we pursue three strategies. First, we design trainable variants of traditional models and reduce their parameter set after training to obtain transparent and adaptive models. Moreover, we investigate the architectural design of numerical solvers for partial differential equations and translate them into building blocks of popular neural network architectures. This yields criteria for stable networks and inspires novel design concepts. Lastly, we present novel hybrid models for inpainting that rely on our theoretical findings. These strategies provide three ways for combining the best of the two worlds of model- and data-driven approaches. Our work contributes to the overarching goal of closing the gap between these worlds that still exists in performance and understanding.Gegenstand dieser Arbeit sind die Zusammenhänge zwischen mathematischen Modellen zur Bildverarbeitung und Deep Learning. Während datengetriebene Modelle des Deep Learning wie z.B. neuronale Netze flexibel sind und gute Ergebnisse liefern, werden sie oft als Black Box eingesetzt. Das macht es schwierig, theoretische Modellgarantien zu liefern und wissenschaftliche Erkenntnisse zu gewinnen. Im Gegensatz dazu bieten traditionellere, modellgetriebene Ansätze wie Diffusion, Wavelet Shrinkage und Variationsansätze eine Fülle von mathematischen Grundlagen. Unser Ziel ist es, diese auf neuronale Netze zu übertragen. Zu diesem Zweck verfolgen wir drei Strategien. Zunächst entwerfen wir trainierbare Varianten von traditionellen Modellen und reduzieren ihren Parametersatz, um transparente und adaptive Modelle zu erhalten. Außerdem untersuchen wir die Architekturen von numerischen Lösern für partielle Differentialgleichungen und übersetzen sie in Bausteine von populären neuronalen Netzwerken. Daraus ergeben sich Kriterien für stabile Netzwerke und neue Designkonzepte. Schließlich präsentieren wir neuartige hybride Modelle für Inpainting, die auf unseren theoretischen Erkenntnissen beruhen. Diese Strategien bieten drei Möglichkeiten, das Beste aus den beiden Welten der modell- und datengetriebenen Ansätzen zu vereinen. Diese Arbeit liefert einen Beitrag zum übergeordneten Ziel, die Lücke zwischen den zwei Welten zu schließen, die noch in Bezug auf Leistung und Modellverständnis besteht.ERC Advanced Grant INCOVI

    Projection-based Spatial Augmented Reality for Interactive Visual Guidance in Surgery

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    An application specific low bit-rate video compression system geared towards vehicle tracking.

    Get PDF
    Thesis (M.Sc.Eng.)-University of Natal, Durban, 2003.The ability to communicate over a low bit-rate transmission channel has become the order of the day. In the past, transmitted data over a low bit-rate transmission channel, such as a wireless link, has typically been reserved for speech and data. However, there is currently a great deal of interest being shown in the ability to transmit streaming video over such a link. These transmission channels are generally bandwidth limited hence bit-rates need to be low. Video on the other hand requires large amounts of bandwidth for real-time streaming applications. Existing Video Compression standards such as MPEG-l/2 have succeeded in reducing the bandwidth required for transmission by exploiting redundant video information in both the spatial and temporal domains. However such compression systems are geared towards general applications hence they tend not to be suitable for low bit-rate applications. The objective of this work is to implement such a system. Following an investigation in the field of video compression, existing techniques have been adapted and integrated into an application specific low bit-rate video compression system. The implemented system is application specific as it has been designed to track vehicles of reasonable size within an otherwise static scene. Low bit-rate video is achieved by separating a video scene into two areas of interest, namely the background scene and objects that move with reference to this background. Once the background has been compressed and transmitted to the decoder, the only data that is subsequently transmitted is that that has resulted from the segmentation and tracking of vehicles within the scene. This data is normally small in comparison with that of the background scene and therefore by only updating the background periodically, the resulting average output bit-rate is low. The implemented system is divided into two parts, namely a still image encoder and decoder based on a Variable Block-Size Discrete Cosine Transform, and a context-specific encoder and decoder that tracks vehicles in motion within a video scene. The encoder system has been implemented on the Philips TriMedia TM-1300 digital signal processor (DSP). The encoder is able to capture streaming video, compress individual video frames as well as track objects in motion within a video scene. The decoder on the other hand has been implemented on the host PC in which the TriMedia DSP is plugged. A graphic user interface allows a system operator to control the compression system by configuring various compression variables. For demonstration purposes, the host PC displays the decoded video stream as well as calculated rate metrics such as peak signal to noise ratio and resultant bit-rate. The implementation of the compression system is described whilst incorporating application examples and results. Conclusions are drawn and suggestions for further improvement are offered

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    Get PDF
    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model copes with face-to-face dyadic interaction, assuming that the interactants are communicating through a continuous exchange of non verbal social signals, in addition to the spoken messages. Social signals have to be interpreted, thanks to a proper recognition phase that considers visual and audio information. The Brunswick model allows to quantitatively evaluate the quality of the interaction using statistical tools which measure how effective is the recognition phase. In this paper we cast this theory when one of the interactants is a robot; in this case, the recognition phase performed by the robot and the human have to be revised w.r.t. the original model. The model is applied to Berrick, a recent open-source low-cost robotic head platform, where the gazing is the social signal to be considered
    corecore