
    Deep learning for diabetic retinopathy detection and classification based on fundus images: A review.

    Diabetic retinopathy is a retinal disease caused by diabetes mellitus and a leading cause of blindness globally. Early detection and treatment are necessary in order to delay or avoid vision deterioration and vision loss. To that end, many artificial-intelligence-powered methods have been proposed by the research community for the detection and classification of diabetic retinopathy on fundus retina images. This review article provides a thorough analysis of the use of deep learning methods at the various steps of the diabetic retinopathy detection pipeline based on fundus images. We discuss several aspects of that pipeline, ranging from the datasets that are widely used by the research community, the preprocessing techniques employed and how these accelerate and improve the models' performance, to the development of such deep learning models for the diagnosis and grading of the disease as well as the localization of the disease's lesions. We also discuss certain models that have been applied in real clinical settings. Finally, we conclude with some important insights and provide future research directions.
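    The preprocessing step mentioned above often begins with simple channel selection and intensity normalization before any model sees the image. A minimal sketch of such a step, assuming a NumPy `H × W × 3` fundus image and using the green channel (which typically carries the strongest lesion contrast); `preprocess_fundus` is a hypothetical helper for illustration, not a function from any work reviewed here:

    ```python
    import numpy as np

    def preprocess_fundus(image: np.ndarray) -> np.ndarray:
        """Illustrative fundus preprocessing: extract the green channel
        and stretch its intensity range to [0, 1]."""
        green = image[..., 1].astype(np.float64)  # H x W x 3 uint8 input
        lo, hi = green.min(), green.max()
        if hi == lo:  # flat image: avoid division by zero
            return np.zeros_like(green)
        return (green - lo) / (hi - lo)

    # Tiny synthetic "fundus image" for demonstration
    img = np.zeros((4, 4, 3), dtype=np.uint8)
    img[..., 1] = np.arange(16, dtype=np.uint8).reshape(4, 4) * 10
    out = preprocess_fundus(img)
    print(out.min(), out.max())  # 0.0 1.0
    ```

    Real pipelines typically add more elaborate steps (field-of-view masking, contrast-limited histogram equalization, resizing), but they follow the same pattern of normalizing inputs before training.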

    End-To-End Multi-Task Learning Approaches for the Joint Epiretinal Membrane Segmentation and Screening in OCT Images

    Background and objectives: The Epiretinal Membrane (ERM) is an ocular disease that can cause visual distortions and irreversible vision loss. Preserving the patient's sight relies on an early diagnosis and on determining the location of the ERM so that it can be treated and potentially removed. In this context, visual inspection of the images to screen for ERM signs is a costly and subjective process. Methods: In this work, we propose and study three end-to-end, fully automatic approaches for the simultaneous segmentation and screening of ERM signs in Optical Coherence Tomography images. These convolutional approaches exploit a multi-task learning setting, leveraging inter-task complementarity to guide the training process. The proposed architectures are combined with three different state-of-the-art encoder architectures to provide an exhaustive study of the suitability of each approach for these tasks. Furthermore, these architectures work in an end-to-end manner, significantly simplifying the development process since they can be trained directly from annotated images without a series of purpose-specific steps. Results: In terms of segmentation, the proposed models obtained a precision of 0.760 ± 0.050, a sensitivity of 0.768 ± 0.210 and a specificity of 0.945 ± 0.011. For the screening task, these models achieved a precision of 0.963 ± 0.068, a sensitivity of 0.816 ± 0.162 and a specificity of 0.983 ± 0.068. The obtained results show that these multi-task approaches can perform competitively with, or even outperform, single-task methods tailored for either the segmentation or the screening of the ERM.
Conclusions: These results highlight the advantages of using complementary knowledge related to the segmentation and screening tasks in the diagnosis of this relevant pathology, constituting the first proposal to address the diagnosis of the ERM from a multi-task perspective. This research was funded by Instituto de Salud Carlos III, Government of Spain [grant number DTS18/00136]; Ministerio de Ciencia e Innovación y Universidades, Government of Spain [grant number RTI2018-095894-B-I00]; Ministerio de Ciencia e Innovación, Government of Spain [grant number PID2019-108435RB-I00]; Consellería de Cultura, Educación e Universidade, Xunta de Galicia, Grupos de Referencia Competitiva [grant number ED431C 2020/24], predoctoral grant [ED481A 2021/161] and postdoctoral grant [ED481B 2021/059]; Axencia Galega de Innovación (GAIN), Xunta de Galicia [grant number IN845D 2020/38]; and CITIC, Centro de Investigación de Galicia [grant number ED431G 2019/01], which receives financial support from Consellería de Educación, Universidade e Formación Profesional, Xunta de Galicia, through the ERDF (80%) and Secretaría Xeral de Universidades (20%). The funding sources had no role in the development of this work. Funding for open access charge: Universidade da Coruña/CISUG.
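The core idea of the joint approach above is that one shared encoder feeds both a pixel-wise segmentation head and a pooled, image-level screening head. A minimal NumPy sketch of such a forward pass, with hypothetical shapes and random weights standing in for the actual convolutional architectures studied in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a 32x32 single-channel patch, 8 shared features
# per pixel, one segmentation map and one screening score.
H = W = 32
enc_W = rng.standard_normal((1, 8)) * 0.1   # shared "encoder" weights
seg_W = rng.standard_normal((8, 1)) * 0.1   # segmentation head
cls_W = rng.standard_normal((8, 1)) * 0.1   # screening head

def forward(x):
    """One joint forward pass: the shared per-pixel encoder feeds both a
    pixel-wise segmentation head and a pooled image-level screening head."""
    feats = np.maximum(x.reshape(-1, 1) @ enc_W, 0.0)  # ReLU features
    seg = 1 / (1 + np.exp(-(feats @ seg_W)))           # per-pixel probability
    pooled = feats.mean(axis=0, keepdims=True)         # global average pooling
    screen = 1 / (1 + np.exp(-(pooled @ cls_W)))       # image-level probability
    return seg.reshape(H, W), float(screen[0, 0])

seg_map, screen_prob = forward(rng.standard_normal((H, W)))
print(seg_map.shape, 0.0 < screen_prob < 1.0)  # (32, 32) True
```

Because both heads backpropagate into the same encoder during training, features useful for one task can improve the other, which is the inter-task complementarity the abstract describes.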

    Multi-task learning system for segmenting dark and bright retinal lesions in fundus images

    This work focuses on automatic diagnosis from fundus images, which provide a two-dimensional colour representation of the surface of the retina. These images can exhibit symptoms of disease in the form of lesions or deformations of the retina's anatomical structures. The aim of this master's thesis is to propose a methodology for the simultaneous segmentation of these lesions in the fundus image, grouped into two categories: bright and dark (red). Performing this double segmentation simultaneously is novel; the vast majority of previous works focus on a single type of lesion. However, due to time constraints and the tedious nature of this work in a clinical environment, clinicians cannot test the multitude of existing methods. Moreover, from a screening perspective, the clinician has no a priori knowledge of the nature of the pathology and thus of which algorithm to start with. For clinical use, the proposed solution must therefore be versatile, fast and easily deployable.
Encouraged by the recent progress achieved with machine learning methods (and especially deep learning), we develop a novel convolutional neural network able to segment both types of lesions in fundus images. To reach this goal, our methodology relies on a new multi-task architecture, trained with a hybrid scheme combining standard and weakly supervised training. The architecture relies on hard parameter sharing: two decoders (one per type of lesion) share a single encoder. The encoder is therefore trained to derive an abstract representation of the input image, and the extracted features allow discrimination between bright and red lesions. In other words, the encoder learns to distinguish pathological tissue from normal tissue. Training is done in two steps. In the first, the whole architecture is trained on patches with pixel-level ground truth, the typical way of training a segmentation network. The second step uses weak supervision: only the encoder is trained, on full images, to predict the status of a given image (pathological or healthy) without specifying anything about the potential lesions in it (neither location nor type). In this case, the ground truth is a simple Boolean label. This second step allows the network to see a larger number of images; indeed, this type of ground truth is considerably easier to acquire and is already available in large public databases. The step relies on the hypothesis that an annotation at the image level (global) can be used to enhance performance at the pixel level (local). This is intuitive, as the pathological status is directly correlated with the presence of lesions.
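The hypothesis underlying the weakly supervised phase is that the image-level label is implied by the pixel-level annotation: an image is pathological exactly when at least one lesion pixel exists. A one-line sketch of how such weak labels could be derived from existing segmentation masks (`weak_label` is an illustrative helper, not from the thesis):

```python
import numpy as np

def weak_label(mask: np.ndarray) -> int:
    """Image-level 'pathological?' label for the weakly supervised phase:
    1 if any lesion pixel is annotated, else 0."""
    return int(mask.any())

healthy = np.zeros((8, 8), dtype=bool)
diseased = healthy.copy()
diseased[3, 4] = True  # a single lesion pixel suffices
print(weak_label(healthy), weak_label(diseased))  # 0 1
```

This correlation is what lets cheap, abundant image-level labels refine an encoder that was first trained with expensive pixel-level ones.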

    Diabetic retinopathy grading with respect to the segmented lesions

    One of the leading causes of irreversible vision loss is Diabetic Retinopathy (DR). The International Clinical Diabetic Retinopathy scale (ICDRS) provides grading criteria for DR. Deep Convolutional Neural Networks (DCNNs) achieve high performance in DR grading in terms of classification evaluation metrics; however, these metrics alone are not sufficient for evaluation. The eXplainable Artificial Intelligence (XAI) methodology provides insight into the decisions made by networks by producing sparse, generic heat maps highlighting the most critical DR features, yet XAI still fails to satisfy clinical criteria because it does not explain the number and types of lesions. Hence, we propose a computational toolbox that provides lesion-based explanations according to grading-system criteria for determining severity levels. According to the ICDRS, DR has 10 major lesions and 4 severity levels. Experienced clinicians annotated 143 DR fundus images, and we developed a toolbox containing 9 lesion-specific segmentation networks, which detect lesions at high annotation resolution and then compute the DR severity grade according to the ICDRS. The network employed in this study is an optimized version of the Holistically-Nested Edge Detection network (HEDNet). Using this model, lesions such as hard exudates (Ex), cotton wool spots (CWS), neovascularization (NV), intraretinal haemorrhages (IHE) and vitreous/preretinal haemorrhages (VPHE) were properly detected, whereas lesions such as venous beading (VB), microaneurysms (MA), intraretinal microvascular abnormalities (IRMA) and fibrous proliferation (FP) had lower mAPs. Consequently, this affects the computed grade, which uses the segmented masks of all contributing lesions.
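Mapping segmented lesions to an ICDRS severity level is ultimately a rule-based step on top of the networks' outputs. A deliberately simplified, illustrative sketch of such a mapping is below; the thresholds are stand-ins (real ICDRS grading, e.g. the "4-2-1 rule" for severe non-proliferative DR, also weighs the quadrant-wise distribution of lesions, which plain counts cannot capture):

```python
def dr_grade(lesions: dict) -> int:
    """Illustrative ICDRS-like severity grade (0-4) from lesion counts.
    Keys follow the abbreviations above: 'MA', 'IHE', 'NV', 'VPHE',
    'VB', 'IRMA', 'Ex', 'CWS'. Thresholds are simplified stand-ins."""
    def n(key):
        return lesions.get(key, 0)

    if n("NV") or n("VPHE"):                      # proliferative DR
        return 4
    if n("IHE") > 20 or n("VB") >= 2 or n("IRMA") >= 1:
        return 3                                  # severe NPDR (crude 4-2-1 stand-in)
    if n("Ex") or n("CWS") or n("IHE"):           # moderate NPDR
        return 2
    if n("MA"):                                   # mild NPDR: microaneurysms only
        return 1
    return 0                                      # no apparent retinopathy

print(dr_grade({"MA": 3}), dr_grade({"NV": 1}))  # 1 4
```

The sketch also makes the abstract's final point concrete: a grade computed this way is only as reliable as the weakest contributing segmentation network, since an undetected NV or IRMA lesion silently lowers the predicted severity.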

    The Effectiveness of Transfer Learning Systems on Medical Images

    Deep neural networks have revolutionized the performance of many machine learning tasks such as medical image classification and segmentation. Current deep learning (DL) algorithms, specifically convolutional neural networks, are increasingly becoming the methodological choice for most medical image analysis. However, training these deep neural networks requires high computational resources and very large amounts of labeled data, which are often expensive and laborious to obtain. Meanwhile, recent studies have shown the transfer learning (TL) paradigm to be an attractive choice, providing promising solutions to the shortage of labeled medical images. TL enables us to leverage the knowledge learned from related data to solve a new problem. The objective of this dissertation is to examine the effectiveness of TL systems on medical images. First, a comprehensive systematic literature review was performed to provide an up-to-date status of TL systems on medical images; specifically, we proposed a novel conceptual framework to organize the review. Second, a novel DL network was pretrained on natural images and used to evaluate the effectiveness of TL on a very large medical image dataset, namely chest X-ray images. Lastly, domain adaptation using an autoencoder was evaluated on the medical image dataset, and the results confirmed the effectiveness of TL through fine-tuning strategies. We make several contributions to TL systems for medical image analysis. Firstly, we present a novel survey of TL on medical images and propose a new conceptual framework to organize the findings. Secondly, we propose a novel DL architecture that improves the learned representations of medical images while mitigating the problem of vanishing gradients. Additionally, we identified the optimal cut-off layer (OCL) that provided the best model performance.
We found that the higher layers in the proposed deep model give a better feature representation for our medical imaging task. Finally, we analyzed the effect of domain adaptation by fine-tuning an autoencoder on our medical images and provide theoretical contributions on the application of the transductive TL approach. The contributions herein reveal several research gaps that motivate future research and contribute to the body of literature in this active research area of TL systems for medical image analysis.
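The "optimal cut-off layer" idea above amounts to choosing a split point in a pretrained network: layers up to the OCL keep their natural-image weights frozen, while the remaining layers are fine-tuned on the medical target task. A minimal sketch of this split, using a hypothetical layer list rather than any specific architecture from the dissertation:

```python
# Hypothetical pretrained-network layer names, shallow to deep.
layers = ["conv1", "conv2", "conv3", "conv4", "fc"]

def split_at_ocl(layers, ocl_index):
    """Return (frozen, trainable) layer names for a given cut-off index:
    layers before the OCL reuse pretrained weights; the rest are fine-tuned."""
    return layers[:ocl_index], layers[ocl_index:]

frozen, trainable = split_at_ocl(layers, 3)
print(frozen, trainable)  # ['conv1', 'conv2', 'conv3'] ['conv4', 'fc']
```

In a framework such as PyTorch the same split is expressed by setting `requires_grad = False` on the frozen parameters; sweeping the cut-off index and measuring validation performance is one way to locate the OCL empirically.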

    Explainable AI for retinal OCT diagnosis

    Artificial intelligence methods such as deep learning are leading to great progress in complex tasks that are usually associated with human intelligence and experience. Deep learning models have matched, if not surpassed, human performance on medical diagnosis tasks, including retinal diagnosis. Given a sufficient amount of data and computational resources, these models can perform classification and segmentation as well as related tasks such as image quality improvement. The adoption of these systems in actual healthcare centers has been limited by the lack of reasoning behind their decisions. This black-box nature, along with upcoming regulations for transparency and privacy, exacerbates the ethico-legal challenges faced by deep learning systems. Attribution methods are a way to explain the decisions of a deep learning model by generating a heatmap of the features that contribute most to the model's decision. These are generally compared in quantitative terms on standard machine learning datasets; however, the ability of these methods to generalize to specific data distributions such as retinal OCT has not been thoroughly evaluated. In this thesis, multiple attribution methods for explaining the decisions of deep learning models for retinal diagnosis are compared, evaluating whether the methods considered best for explainability outperform methods with a relatively simpler theoretical background. A review of current deep learning models for retinal diagnosis and the state-of-the-art explainability methods for medical diagnosis is provided. A commonly used deep learning model is trained on a large public dataset of OCT images, and the attributions are generated using various methods. A quantitative and qualitative comparison of these approaches is done using several performance metrics and a large panel of experienced retina specialists.
The initial quantitative metrics include the runtime of the method, RMSE, and Spearman's rank correlation for a single instance of the model. Later, two stronger metrics, robustness and sensitivity, are presented; these evaluate, respectively, the consistency among different instances of the same model and the ability to highlight the features with the greatest effect on the model output. Similarly, the initial qualitative analysis involves comparing the heatmaps with a clinician's markings in terms of cosine similarity. Next, a panel of 14 clinicians rated the heatmaps of each method. Their subjective feedback, reasons for preference, and general feedback about using such a system are also documented. It is concluded that explainability methods can make the decision process of deep learning models more transparent, and that the choice of method should account for the preferences of the domain experts. There is a high degree of acceptance among the clinicians surveyed for using such systems. Future directions regarding system improvements and enhancements are also discussed.
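The agreement metrics named above (RMSE, Spearman's rank correlation, cosine similarity) all reduce to simple operations on flattened heatmaps. A self-contained NumPy sketch of how two attribution maps could be compared this way (the Spearman implementation below assumes no tied values; a library routine such as `scipy.stats.spearmanr` would handle ties properly):

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two heatmaps."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def cosine_similarity(a, b):
    """Cosine of the angle between the flattened heatmaps."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks
    (valid when all heatmap values are distinct)."""
    ra = np.argsort(np.argsort(a.ravel())).astype(float)
    rb = np.argsort(np.argsort(b.ravel())).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / (np.linalg.norm(ra) * np.linalg.norm(rb)))

h1 = np.array([[0.1, 0.9], [0.3, 0.5]])
print(spearman(h1, 2 * h1))  # 1.0 (rank order is preserved under scaling)
```

Note the complementary sensitivities: rescaling a heatmap leaves Spearman and cosine similarity unchanged while RMSE grows, which is why the thesis reports several metrics rather than one.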

    Automating the eye examination using optical coherence tomography

    Optical coherence tomography (OCT) devices are becoming ubiquitous in eye clinics worldwide to aid the diagnosis and monitoring of eye disease. Much of this uptake relates to the ability to non-invasively capture micron-resolution images, enabling objective and quantitative data to be obtained from ocular structures. Although safe and reasonably quick to perform, the costs involved in operating OCT devices are not trivial, and the requirement for OCT and other imaging in addition to other clinical measures is placing increasing demand on ophthalmology clinics, contributing to fragmented patient pathways and often extended waiting times. In this thesis, a novel “binocular optical coherence tomography” system that seeks to overcome some of the limitations of current commercial OCT systems is clinically evaluated. This device incorporates many aspects of the eye examination into a single patient-operated instrument, and aims to improve the efficiency and quality of eye care while reducing the overall labour and equipment costs. A progressive framework of testing is followed that includes human factors and usability testing, followed by early-stage diagnostic studies to assess the agreement, repeatability, and reproducibility of individual diagnostic features. A health economics analysis of the retinal therapy clinic is used to model the cost effectiveness of current practice and of binocular OCT implementation. The binocular OCT and the development of other low-cost OCT systems may improve accessibility; however, there remains a relative shortage of experts to interpret the images. Artificial intelligence (AI) is likely to play a role in rapid and automated image classification. This thesis explores the application of AI within retinal therapy clinics to predict the onset of exudative age-related macular degeneration in the fellow eyes of patients undergoing treatment in their first eye.
Together with automated and simultaneous imaging of both eyes with binocular OCT and the potential for low-cost patient-facing systems, AI is likely to have a role in personalising management plans, especially in a future where preventive treatments are available.