
    System Designs for Diabetic Foot Ulcer Image Assessment

    For individuals with type 2 diabetes, diabetic foot ulcers represent a significant health issue, and the cost of wound care is high. Currently, clinicians and nurses base their wound assessment mainly on visual examination of wound size and wound tissue status, which is potentially inaccurate and adds to the clinical workload. Given the prevalence of smartphones with high-resolution digital cameras, assessing wound healing by analyzing real-time images with the significant computational power of today's mobile devices is an attractive approach to managing foot ulcers. Alternatively, the smartphone may be used only for image capture, with wireless transfer to a PC or laptop for image processing.

    To achieve accurate foot ulcer image assessment, we have developed and tested a novel automatic wound image analysis system that satisfies the following requirements: 1) an easy-to-use image capture system that makes the capture process comfortable for the patient and provides well-controlled image capture conditions; 2) efficient and accurate algorithms for real-time wound boundary determination to measure the wound area; 3) a quantitative method to assess wound healing status based on a foot ulcer image sequence for a given patient; and 4) a wound image assessment and management system usable both in the patient's home and in the clinical environment, in a telemedicine fashion. In our work, the wound image is captured by the smartphone camera while the patient's foot is held in place by an image capture box, specially designed to help patients photograph ulcers occurring on the soles of their feet. The experimental results show that our image capture system guarantees consistent illumination and a fixed distance between the foot and the camera; these properties greatly reduce the complexity of the subsequent wound recognition and assessment.

    The most significant contribution of our work is the development of five different wound boundary determination approaches based on different computer vision algorithms. The first approach employs the level set algorithm to determine the wound boundary directly from a manually set initial curve. The second and third approaches are mean-shift segmentation based methods augmented by foot outline detection and analysis; they are efficient to implement (especially on smartphones), independent of prior knowledge, and able to provide reasonably accurate wound segmentation given a set of well-tuned parameters. However, because they are not based on machine learning, they lack self-adaptivity. Consequently, a two-stage Support Vector Machine (SVM) binary classifier based wound recognition approach was developed and implemented. It consists of three major steps: 1) unsupervised super-pixel segmentation, 2) feature descriptor extraction for each super-pixel, and 3) supervised classifier based wound boundary determination. The experimental results show that this approach provides promising performance (sensitivity: 73.3%, specificity: 95.6%) on foot ulcer images captured with our image capture box. In the final approach, we further relax the image capture constraints and generalize our wound recognition system by applying a conditional random field (CRF) based model to wound boundary determination. The key modules of this approach are TextonBoost based potential learning at different scales and efficient CRF model inference to find the optimal labeling. Finally, the standard K-means clustering algorithm is applied to the determined wound area for color based wound tissue classification.

    To train the models used in the last two approaches, and to evaluate the methods, we collected about 100 wound images at the wound clinic of UMass Medical School by tracking 15 patients over a 2-year period, following an IRB approved protocol. The wound recognition results were compared with ground truth generated by combining clinical labeling from three experienced clinicians. Specificity and sensitivity based measures indicate that the CRF based approach is the most reliable, despite its implementation complexity and computational demands. In addition, sample images of moulage wound simulations were used to increase the evaluation flexibility. The advantages and disadvantages of the approaches are described.

    Another important contribution of this work is a healing score mechanism for quantitative wound healing status assessment. The wound size and color composition measurements are converted to a score from 0 to 10 that indicates the healing trend, based on comparing subsequent images to an initial foot ulcer image. By comparing the output of the healing score algorithm to healing scores assigned by experienced clinicians, we assess its clinical validity; the level of agreement with the three assessing clinicians was quantified using Krippendorff's Alpha Coefficient (KAC). Finally, a collaborative wound image management system between PC and smartphone was designed and successfully applied in the wound clinic for patient wound tracking. This system has proven applicable in the clinical environment and capable of providing interactive foot ulcer care in a telemedicine fashion.
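    To make the three-step SVM approach concrete, here is a minimal sketch assuming SLIC super-pixels and mean-colour feature descriptors; these choices, and all names and parameters, are illustrative stand-ins rather than the thesis's actual features and settings.

```python
import numpy as np
from skimage.segmentation import slic
from sklearn.svm import SVC

def superpixel_features(image, labels):
    """Mean RGB colour of each super-pixel (a stand-in feature descriptor)."""
    return np.array([image[labels == sp].mean(axis=0)
                     for sp in np.unique(labels)])

def segment_wound(image, clf, n_segments=300):
    # image: (H, W, 3) RGB array; clf: a trained binary classifier.
    # Step 1: unsupervised super-pixel segmentation.
    labels = slic(image, n_segments=n_segments, compactness=10)
    # Step 2: feature descriptor extraction for each super-pixel.
    feats = superpixel_features(image, labels)
    # Step 3: supervised binary classification (wound vs. non-wound),
    # merging the wound super-pixels into a boolean mask.
    wound_ids = np.unique(labels)[clf.predict(feats) == 1]
    return np.isin(labels, wound_ids)

# Training would fit the classifier on clinician-labelled super-pixels,
# with label 1 for wound and 0 for non-wound:
# clf = SVC(kernel="rbf").fit(train_feats, train_sp_labels)
```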

    Image Quality Assessment for Population Cardiac MRI: From Detection to Synthesis

    Cardiac magnetic resonance (CMR) imaging plays a growing role in the diagnosis of cardiovascular diseases. Left ventricular (LV) anatomy and function are widely used for diagnosis and monitoring of disease progression in cardiology, and for assessing a patient's response to cardiac surgery and interventional procedures. For population imaging studies, CMR is arguably the most comprehensive modality for non-invasive, non-ionising imaging of the heart and great vessels and is, hence, well suited to population imaging cohorts. Due to insufficient radiographer experience in planning a scan, natural cardiac muscle contraction, breathing motion, and imperfect triggering, CMR can display incomplete LV coverage, which hampers quantitative LV characterization and diagnostic accuracy. To tackle this limitation and enhance the accuracy and robustness of automated cardiac volume and functional assessment, this thesis focuses on the development and application of state-of-the-art deep learning (DL) techniques in cardiac imaging. Specifically, we propose new image feature representation types, learnt with DL models, aimed at identifying CMR image quality across datasets. These representations are also intended to estimate CMR image quality for better interpretation and analysis, and we investigate how quantitative analysis can benefit when these learnt representations are used in image synthesis. Concretely, a 3D Fisher discriminative representation is introduced to identify CMR image quality in the UK Biobank cardiac data. A novel adversarial learning (AL) framework is then introduced for cross-dataset CMR image quality assessment, and we show that the common representations learnt by AL are useful and informative for cross-dataset CMR image analysis. Finally, we use the dataset-invariant (DI) representations for CMR volume interpolation via a novel generative adversarial network (GAN) based image synthesis framework, which enhances CMR image quality across datasets.
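    The cross-dataset adversarial learning idea can be sketched as follows: an encoder is trained so that a discriminator cannot tell which dataset a feature vector came from, pushing both datasets toward a common representation. The architectures, loss weights, and optimiser settings below are assumptions for illustration, not the thesis models.

```python
import torch
import torch.nn as nn

# Toy encoder and dataset discriminator for 64x64 CMR slices.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU(),
                        nn.Linear(256, 128))            # feature extractor
discrim = nn.Sequential(nn.Linear(128, 64), nn.ReLU(),
                        nn.Linear(64, 1))               # which dataset?
bce = nn.BCEWithLogitsLoss()
opt_e = torch.optim.Adam(encoder.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discrim.parameters(), lr=1e-4)

def adversarial_step(x_a, x_b):
    """x_a, x_b: batches of 64x64 slices from two CMR datasets."""
    z_a, z_b = encoder(x_a), encoder(x_b)
    # 1) The discriminator learns to separate the two datasets.
    d_loss = bce(discrim(z_a.detach()), torch.ones(len(x_a), 1)) + \
             bce(discrim(z_b.detach()), torch.zeros(len(x_b), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) The encoder is rewarded when dataset-B features fool the
    #    discriminator, driving both datasets toward a common space.
    e_loss = bce(discrim(z_b), torch.ones(len(x_b), 1))
    opt_e.zero_grad(); e_loss.backward(); opt_e.step()
```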

    Parametric active learning techniques for 3D hand pose estimation

    Active learning (AL) has recently gained popularity for deep learning (DL) models because it enables efficient and informative sampling, which matters most when models require large-scale datasets. DL models designed for 3D hand pose estimation (3D-HPE) demand accurate, diverse, large-scale datasets that are time-consuming and costly to create and require expert annotators. This thesis is the first to explore AL for the 3D-HPE task. It begins with an AL methodology customised for 3D-HPE learners. Because these learners are predominantly regression-based algorithms, a Bayesian approximation of a DL architecture is presented to model uncertainties. This approximation generates data- and model-dependent uncertainties that are combined with a representativeness-based AL acquisition function, CoreSet, for sampling. Although it is the first such work, it selects informative samples and achieves low joint errors with less training data on three well-known depth datasets. The second AL algorithm improves the selection further, following the recent trend of parametric samplers. Specifically, it proceeds task-agnostically with a Graph Convolutional Network (GCN) that offers higher-order representations between labelled and unlabelled data; newly selected unlabelled images are ranked by uncertainty or by the GCN feature distribution. Another novel sampler extends this idea and tackles common AL issues, such as cold start and distribution shift, by training in a self-supervised way with contrastive learning; it leverages visual concepts from labelled and unlabelled images while attaining state-of-the-art results. The last part of the thesis brings the prior AL insights and achievements together in a unified parametric sampler for the multi-modal 3D-HPE task. This sampler trains multiple variational auto-encoders to align the modalities and provide a better selection representation, and several query functions are studied, opening a new direction in deep AL sampling.
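    A minimal sketch of the first sampler's two ingredients follows, assuming Monte-Carlo dropout as the Bayesian approximation (a common choice; the thesis's exact architecture and the blend of the two criteria are not specified here) and a greedy k-centre pass for CoreSet.

```python
import numpy as np
import torch

def mc_dropout_uncertainty(model, x, passes=20):
    """Per-sample predictive variance of joint regressions under dropout."""
    model.train()                          # keep dropout active at inference
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(passes)])
    return preds.var(dim=0).flatten(1).mean(dim=1)       # (batch,)

def coreset_select(features, labelled_idx, budget):
    """Greedy k-centre: repeatedly pick the point farthest from the labelled set."""
    dists = np.linalg.norm(
        features[:, None] - features[labelled_idx][None], axis=-1).min(axis=1)
    chosen = []
    for _ in range(budget):
        i = int(np.argmax(dists))          # farthest unlabelled point
        chosen.append(i)
        dists = np.minimum(dists,
                           np.linalg.norm(features - features[i], axis=-1))
    return chosen
```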

    On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator

    Deployed image classification pipelines typically depend on images captured in real-world environments, so images may be affected by various sources of perturbation (e.g. sensor noise in low-light environments). The main challenge arises from the fact that image quality directly impacts the reliability and consistency of classification, which has attracted wide interest in the computer vision community. We propose a transformation step that aims to enhance the generalization ability of CNN models in the presence of noise unseen during training. Concretely, the delineation maps of given images are computed using the CORF push-pull inhibition operator. This operation transforms an input image into a space that is more robust to noise before it is processed by a CNN. We evaluated our approach on the Fashion MNIST dataset with an AlexNet model. The proposed CORF-augmented pipeline achieved results on noise-free images comparable to those of a conventional AlexNet classifier without CORF delineation maps, but consistently and significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise.
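    The preprocessing idea can be sketched as below. The actual CORF push-pull inhibition operator is more elaborate; this sketch only approximates the push-pull principle with opposite-polarity difference-of-Gaussians responses, where a blurred inhibitory (pull) response suppresses spurious excitatory (push) responses caused by noise. The inhibition factor `alpha` is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def push_pull_map(image, sigma=1.0, alpha=0.8):
    """Approximate push-pull delineation map of a grayscale float image."""
    dog = gaussian_filter(image, sigma) - gaussian_filter(image, 2.0 * sigma)
    push = np.maximum(dog, 0.0)            # response to the preferred polarity
    # The pull component responds to the opposite polarity over a wider
    # support; subtracting it suppresses interleaved responses due to noise.
    pull = gaussian_filter(np.maximum(-dog, 0.0), 2.0 * sigma)
    response = np.maximum(push - alpha * pull, 0.0)
    return response / (response.max() + 1e-8)

# The CNN (e.g. AlexNet) is then trained and evaluated on push_pull_map(x)
# instead of x, which is what confers robustness to unseen noise.
```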

    Enhance Representation Learning of Clinical Narrative with Neural Networks for Clinical Predictive Modeling

    Medicine is undergoing a technological revolution. Understanding human health from clinical data poses major technical and practical challenges, prompting methods that can handle large, complex, and noisy data. Such methods are particularly necessary for natural language data from clinical narratives/notes, which contain some of the richest information on a patient. Meanwhile, deep neural networks have achieved superior performance in a wide variety of natural language processing (NLP) tasks thanks to their capacity to encode meaningful but abstract representations and learn a task end-to-end. In this thesis, I investigate representation learning of clinical narratives with deep neural networks through tasks ranging from clinical concept extraction to clinical note modeling and patient-level language representation. I first introduce the notion of representation learning from natural language processing and patient data modeling. Then, I investigate word-level representation learning to improve clinical concept extraction from clinical notes, presenting two studies on learning word representations and evaluating them for automatically extracting important entities: the first focuses on cancer-related information, and the second evaluates shared-task data. Next, I present a series of deep neural networks that encode hierarchical, longitudinal, and contextual information for modeling a series of clinical notes, and I evaluate the models by predicting clinical outcomes of interest, including mortality, length of stay, and phenotypes. Finally, I propose a novel representation learning architecture for a generalized and transferable language representation at the patient level, and I identify pre-training tasks appropriate for constructing such a representation. The main focus is improving predictive performance for phenotypes with limited data, a task made challenging by data scarcity. Overall, this dissertation addresses issues in natural language processing for medicine, including clinical text classification and modeling, and exposes major barriers to understanding large-scale clinical notes. Developing deep representation learning methods that distill enormous amounts of heterogeneous data into patient-level language representations should improve evidence-based clinical understanding, and the resulting representations could be used across clinical applications despite noisy data. I conclude that it is important to consider the different linguistic components of natural language and the sequential information between clinical events. These results have implications beyond the immediate predictions and suggest future directions for clinical machine learning research to improve clinical outcomes; they could be a starting point for future NLP-based phenotyping methods that construct patient-level language representations to improve clinical predictions. While significant progress has been made, many open questions remain, and I highlight several works that demonstrate promising directions.
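    A minimal sketch of the hierarchical modeling idea follows: word sequences are encoded into note vectors, and the time-ordered notes into a patient-level representation used for outcome prediction. The GRU encoders and layer sizes are illustrative assumptions, not the dissertation's architectures.

```python
import torch
import torch.nn as nn

class HierarchicalNoteModel(nn.Module):
    def __init__(self, vocab_size, emb=128, hid=128, n_outcomes=1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.word_rnn = nn.GRU(emb, hid, batch_first=True)   # words -> note
        self.note_rnn = nn.GRU(hid, hid, batch_first=True)   # notes -> patient
        self.head = nn.Linear(hid, n_outcomes)

    def forward(self, notes):
        # notes: (n_notes, n_words) token ids for one patient, in time order.
        _, note_vecs = self.word_rnn(self.embed(notes))   # (1, n_notes, hid)
        _, h = self.note_rnn(note_vecs)                   # (1, 1, hid)
        return self.head(h[-1, -1])                       # outcome logits

# logits = HierarchicalNoteModel(vocab_size=30000)(patient_note_ids)
# e.g. a sigmoid over the logit gives a mortality probability.
```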

    Deep Learning in Medical Image Analysis

    The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision making in clinical environments. Applications of modern medical instruments and the digitalization of medical care have generated enormous amounts of medical images in recent years. In this big-data arena, new deep learning methods and computational models for efficient data processing, analysis, and modeling of the generated data are crucially important for clinical applications and for understanding the underlying biological processes. This book presents and highlights novel algorithms, architectures, techniques, and applications of deep learning for medical image analysis.

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model addresses face-to-face dyadic interaction, assuming that the interactants communicate through a continuous exchange of non-verbal social signals in addition to spoken messages. Social signals must be interpreted through a recognition phase that considers visual and audio information. The Brunswick model makes it possible to quantitatively evaluate the quality of the interaction using statistical tools that measure how effective the recognition phase is. In this paper we cast this theory in the setting where one of the interactants is a robot; in this case, the recognition phases performed by the robot and by the human have to be revised with respect to the original model. The model is applied to Berrick, a recent open-source, low-cost robotic head platform, where gaze is the social signal considered.
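    As a toy illustration of the statistical evaluation the model supports, the snippet below scores how well a robot's recognised gaze targets track the human's actual targets; the data and the choice of accuracy and correlation as measures are assumptions for illustration, not the paper's protocol.

```python
import numpy as np

true_gaze = np.array([0, 1, 1, 0, 2, 2, 1, 0])     # human's actual gaze target
recognised = np.array([0, 1, 2, 0, 2, 2, 1, 1])    # robot's attribution

# Effectiveness of the recognition phase, summarised statistically.
accuracy = float(np.mean(true_gaze == recognised))
corr = float(np.corrcoef(true_gaze, recognised)[0, 1])
print(f"recognition accuracy={accuracy:.2f}, correlation={corr:.2f}")
```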

    Building models from multiple point sets with kernel density estimation

    One of the fundamental problems in computer vision is point set registration. It arises in many important applications; in particular, it is one of the crucial stages in reconstructing models of physical objects and environments from depth sensor data. Globally aligning multiple point sets, representing spatial shape measurements from varying sensor viewpoints, into a common frame of reference is a complex task, and it matters because many critical functions depend on accurate and reliable model reconstructions. In this thesis we focus on improving the quality and feasibility of model and environment reconstruction by enhancing multi-view point set registration techniques.

    The thesis makes the following contributions. First, we demonstrate that employing kernel density estimation to reason about the unknown generating surfaces that range sensors measure allows us to express measurement variability and uncertainty, and to separate the problems of model design and viewpoint alignment optimisation. Our surface estimates, which can be computed from point clouds in a data-driven fashion, define novel view alignment objective functions that inform the registration process. Through experiments on a variety of datasets we demonstrate that this yields a novel and effective solution to the simultaneous multi-view registration problem.

    We then construct a distributed computation framework capable of solving generic high-throughput computational problems. We present a novel task-farming model that we call Semi-Synchronised Task Farming (SSTF), capable of modelling, and subsequently solving, computationally distributable problems that benefit from both independent and dependent distributed components and a level of communication between process elements. We demonstrate that this framework is a novel schema for parallel computer vision algorithms and evaluate its performance to establish computational gains over serial implementations. We couple the framework with an accurate computation-time prediction model, contributing a structure appropriate for addressing expensive real-world algorithms with substantial parallel performance and predictable time savings.

    Finally, we address a timely instance of the multi-view registration problem: modern range sensors provide large numbers of viewpoint samples, resulting in an abundance of depth data. The ability to use this abundance in a feasible and principled fashion is important to many emerging applications of spatial information. We develop novel methodology for registering depth measurements acquired from many viewpoints capturing physical object surfaces. By defining registration and alignment quality metrics based on our density estimation framework, we construct an optimisation methodology that implicitly considers all viewpoints simultaneously. We use a non-parametric, data-driven approach to accommodate varying object complexity and to guide large view-set spatial transform optimisations. By aligning large numbers of partial, arbitrary-pose views we evaluate this strategy quantitatively on large view-set range sensor data, finding that it improves registration accuracy over existing methods and increases robustness to the magnitude of the coarse seed alignment. This enables large-scale registration on problem instances of varying object complexity, with the added advantage of massive parallel efficiency.
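    A minimal sketch of the density-based alignment objective follows, using an isotropic Gaussian kernel and a 2D rigid transform for brevity (the thesis works with full 3D multi-view sets, and the kernel choice and bandwidth here are assumptions): a moved view is scored by the density it attains under the fixed view's kernel density estimate.

```python
import numpy as np

def kde(points, queries, h=0.05):
    """Gaussian-kernel density of each query under the `points` set (N x d)."""
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * h * h)).mean(axis=1)

def alignment_score(fixed, moving, theta, t):
    """Higher is better: moved points should land on the fixed view's surface."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    moved = moving @ R.T + t
    return kde(fixed, moved).mean()

# A coarse search over (theta, t) stands in for the large view-set transform
# optimisation; the full method scores all viewpoints simultaneously.
```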