52 research outputs found

    Deep learning analysis of eye fundus images to support medical diagnosis

    Machine learning techniques have been successfully applied to support medical decision making in cancer, heart disease, and degenerative brain diseases. In particular, deep learning methods have been used for early detection of abnormalities in the eye that could improve the diagnosis of different ocular diseases, especially in developing countries, where access to specialized medical treatment is severely limited. However, the early detection of clinical signs such as blood vessel and optic disc alterations, exudates, hemorrhages, drusen, and microaneurysms presents three main challenges: ocular images can be affected by noise artifacts, the features of the clinical signs depend on the specific acquisition source, and combining local signs with a disease-grading label is not an easy task. This research approaches the problem of combining local signs and global labels from different acquisition sources of medical information as a valuable tool to support medical decision making in ocular diseases. Different models were developed for different eye diseases. Four models were developed using eye fundus images. For DME, a two-stage model was designed that uses a shallow model to predict an exudate binary mask; the binary mask is then stacked with the raw fundus image into a 4-channel array that serves as the input to a deep convolutional neural network for diabetic macular edema diagnosis. For glaucoma, three deep learning models were developed. The first is a three-stage deep learning model comprising an initial stage that automatically segments two binary masks for the optic disc and physiological cup, an automatic morphometric feature extraction stage applied to those segmentations, and a final classification stage that supports the glaucoma diagnosis with intermediate medical information.
The second and third are late-data-fusion methods that fuse morphometric features from Cartesian and polar segmentations of the optic disc and physiological cup with features extracted from the raw eye fundus images. In addition, two models were defined using optical coherence tomography. The first is a customized convolutional neural network, termed OCT-NET, that extracts features from OCT volumes to classify DME, DR-DME, and AMD conditions; it also generates images highlighting local information about the clinical signs and estimates the number of slices inside a volume with local abnormalities. The second is a 3D deep learning model that uses OCT volumes as input to estimate the retinal thickness map, which is useful for grading AMD. The methods were systematically evaluated using ten freely available public datasets. They were compared and validated against other state-of-the-art algorithms, and the results were also qualitatively evaluated by ophthalmology experts from Fundación Oftalmológica Nacional. In addition, the proposed methods were tested as a diagnosis support tool for diabetic macular edema, glaucoma, diabetic retinopathy, and age-related macular degeneration using two different ocular imaging representations. We therefore consider that this research could be a significant step toward building telemedicine tools that support medical personnel in detecting ocular diseases using eye fundus images and optical coherence tomography.
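The 4-channel input construction described for the DME model (a binary exudate mask stacked onto the raw RGB fundus image) can be sketched as follows; the function name and the toy arrays are illustrative, not from the thesis:

```python
import numpy as np

def stack_fundus_and_mask(fundus_rgb, exudate_mask):
    """Stack a binary exudate mask onto an RGB fundus image as a 4th channel.

    fundus_rgb:   (H, W, 3) float array in [0, 1]
    exudate_mask: (H, W) binary array from the shallow segmentation model
    Returns an (H, W, 4) array suitable as CNN input.
    """
    mask = exudate_mask.astype(fundus_rgb.dtype)[..., None]  # (H, W, 1)
    return np.concatenate([fundus_rgb, mask], axis=-1)

# Toy example: a 4x4 "fundus image" and a mask marking one exudate pixel.
fundus = np.random.rand(4, 4, 3)
mask = np.zeros((4, 4))
mask[1, 2] = 1
x = stack_fundus_and_mask(fundus, mask)
print(x.shape)  # (4, 4, 4)
```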

    A Polar Map Based Approach Using Retinal Fundus Images for Glaucoma Detection

    Cup-to-disc ratio is commonly used as an important parameter for glaucoma screening, and computing it involves segmenting the optic cup on fundus images. We propose a novel polar map representation of the optic disc, using a combination of supervised and unsupervised cup segmentation techniques, for the detection of glaucoma. Instead of performing hard thresholding on the segmentation output to extract the cup, we consider the cup confidence scores inside the disc to construct a polar map, and extract sector-wise features for learning a glaucoma risk probability (GRP) for the image. We compare the performance of the GRP against the cup-to-disc ratio (CDR). On an evaluation dataset of 100 images from the publicly available RIM-ONE database, our method achieves 82% sensitivity at 84% specificity, and 96% sensitivity at 60% specificity (AUC of 0.8964). Experiments indicate that the polar map based method can provide a more discriminatory glaucoma risk probability score than the CDR.
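The core computations above (sector-wise features from cup-confidence scores inside the disc, plus a hard-mask vertical CDR baseline) can be sketched roughly as follows; the sector count, helper names, and toy masks are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def sector_features(cup_conf, disc_mask, n_sectors=8):
    """Sector-wise mean cup-confidence inside the disc (polar-map features).

    cup_conf:  (H, W) cup confidence scores in [0, 1]
    disc_mask: (H, W) binary optic-disc mask
    Returns an (n_sectors,) vector: mean confidence per angular sector
    around the disc centroid.
    """
    ys, xs = np.nonzero(disc_mask)
    cy, cx = ys.mean(), xs.mean()                       # disc centroid
    theta = np.arctan2(ys - cy, xs - cx)                # angle of each disc pixel
    sector = ((theta + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    feats = np.zeros(n_sectors)
    for s in range(n_sectors):
        in_s = sector == s
        feats[s] = cup_conf[ys[in_s], xs[in_s]].mean() if in_s.any() else 0.0
    return feats

def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio from hard binary masks (the baseline)."""
    cup_h = np.ptp(np.nonzero(cup_mask)[0]) + 1
    disc_h = np.ptp(np.nonzero(disc_mask)[0]) + 1
    return cup_h / disc_h

# Toy example: full 10x10 disc, cup occupying rows 3..6, uniform confidence.
disc = np.ones((10, 10), int)
cup = np.zeros((10, 10), int)
cup[3:7, 3:7] = 1
print(vertical_cdr(cup, disc))  # 0.4
feats = sector_features(np.full((10, 10), 0.5), disc)
```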

    Machine Learning Approaches for Automated Glaucoma Detection using Clinical Data and Optical Coherence Tomography Images

    Glaucoma is a multi-factorial, progressive, blinding optic neuropathy. A variety of factors are involved, including genetics, vasculature, anatomy, and immune factors. Worldwide, more than 80 million people are affected by glaucoma, including around 300,000 in Australia, where 50% of cases remain undiagnosed. Untreated glaucoma can lead to blindness. Early detection supported by artificial intelligence (AI) is crucial to accelerating the diagnosis process and can prevent further vision loss. Many proposed AI systems have shown promising performance for automated glaucoma detection using two-dimensional (2D) data. However, only a few studies have reported optimistic outcomes for both glaucoma detection and staging. Moreover, automated AI systems still face challenges in diagnosing at the clinicians' level due to the limited interpretability of ML algorithms and limited integration of multiple clinical data sources. AI technology would be welcomed by doctors and patients if the "black box" notion were overcome by developing an explainable, transparent AI system that uses the same pathological markers clinicians rely on as signs of the early detection and progression of glaucomatous damage. This thesis therefore aimed to develop a comprehensive AI model to detect and stage glaucoma by incorporating a variety of clinical data and utilising advanced data analysis and machine learning (ML) techniques. The research first focuses on optimising glaucoma diagnostic features by combining structural, functional, demographic, risk factor, and optical coherence tomography (OCT) features. The significant features were evaluated using statistical analysis and used to train ML algorithms to observe detection performance. Three crucial structural ONH OCT features (cross-sectional 2D radial B-scans, 3D vascular angiography, and temporal-superior-nasal-inferior-temporal (TSNIT) B-scans) were analysed and used to train explainable deep learning (DL) models for automated glaucoma prediction.
The explanations behind the decision making of the DL models were successfully demonstrated using feature visualisation. The structural features, or distinguished affected regions, of TSNIT OCT scans were precisely localised for glaucoma patients. This is consistent with the concept of explainable DL, which refers to making the decision-making processes of DL models transparent and interpretable to humans. However, artifacts and speckle noise often result in misinterpretation of TSNIT OCT scans. This research therefore also developed an automated DL model to remove artifacts and noise from the OCT scans, facilitating error-free retinal layer segmentation, accurate tissue thickness estimation, and image interpretation. Moreover, to monitor and grade glaucoma severity, clinicians commonly use the visual field (VF) test for treatment and management. This research therefore uses functional features extracted from VF images to train ML algorithms for staging glaucoma from early to advanced/severe stages. Finally, the selected significant features were used to design and develop a comprehensive AI model that detects and grades glaucoma stages based on data quantity and availability. In the first stage, a DL model was trained on TSNIT OCT scans, and its output was combined with significant structural and functional features and used to train ML models. The best-performing ML model achieved an area under the curve (AUC) of 0.98, an accuracy of 97.2%, a sensitivity of 97.9%, and a specificity of 96.4% for detecting glaucoma. The model achieved an overall accuracy of 90.7% and an F1 score of 84.0% for classifying normal, early, moderate, and advanced-stage glaucoma. In conclusion, this thesis developed and proposed a comprehensive, evidence-based AI model intended to solve the screening problem for large populations and relieve experts from manually analysing large volumes of patient data with the associated risk of misinterpretation. Moreover, this thesis demonstrated three structural OCT features that could serve as excellent diagnostic markers for precise glaucoma diagnosis.
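The two-stage fusion described above (a DL model's output on TSNIT scans combined with structural and functional clinical features, then fed to an ML classifier) can be sketched with synthetic data; every value and feature name below is illustrative, not the thesis data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: a DL-model probability from the TSNIT scan plus a few
# structural/functional clinical features per eye (all values are made up).
n = 200
y = rng.integers(0, 2, n)                                    # 0 = normal, 1 = glaucoma
dl_prob = np.clip(y * 0.6 + rng.normal(0.2, 0.15, n), 0, 1)  # first-stage DL output
clinical = rng.normal(0, 1, (n, 3)) + y[:, None]             # e.g. RNFL, IOP, VF index
X = np.column_stack([dl_prob, clinical])                     # second-stage fusion input

# Minimal logistic-regression second stage trained with gradient descent,
# standing in for the thesis's ML classifiers.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))   # predicted glaucoma probability
    w -= 0.5 * (X.T @ (p - y) / n)       # gradient step on the weights
    b -= 0.5 * (p - y).mean()            # gradient step on the bias

acc = ((p > 0.5) == y).mean()
print(f"fusion-stage training accuracy: {acc:.2f}")
```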

    Retinal Fundus Image Analysis for Diagnosis of Glaucoma: A Comprehensive Survey

    © 2016 IEEE. The rapid development of digital imaging and computer vision has increased the potential for using image processing technologies in ophthalmology. Image processing systems are used in standard clinical practice alongside the development of medical diagnostic systems. Retinal images provide vital information about the health of the sensory part of the visual system. Retinal diseases such as glaucoma, diabetic retinopathy, age-related macular degeneration, Stargardt's disease, and retinopathy of prematurity can lead to blindness and manifest as abnormalities in the retinal image. An automated system can offer standardized large-scale screening at a lower cost, which may reduce human error, provide services to remote areas, and remain free from observer bias and fatigue. Treatment for retinal diseases is available; the challenge lies in finding a cost-effective approach with high sensitivity and specificity that can be applied to large populations in a timely manner to identify those at risk in the early stages of the disease. The progression of glaucoma is very often silent in its early stages. The number of people affected has been increasing, and patients are seldom aware of the disease, which can delay treatment. A review of how computer-aided approaches may be applied to the diagnosis and staging of glaucoma is presented here. The current status of computer technology is reviewed, covering localization and segmentation of the optic nerve head, pixel-level glaucomatous changes, diagnosis using 3-D data sets, and artificial neural networks for detecting the progression of glaucoma.

    Active contour method for ILM segmentation in ONH volume scans in retinal OCT

    The optic nerve head (ONH) is affected by many neurodegenerative and autoimmune inflammatory conditions. Optical coherence tomography can acquire high-resolution 3D ONH scans. However, the ONH's complex anatomy and pathology make image segmentation challenging. This paper proposes a robust approach to segmenting the inner limiting membrane (ILM) in ONH volume scans based on an active contour method of Chan-Vese type, which can work in challenging topological structures. A local intensity fitting energy is added to handle very inhomogeneous image intensities. A suitable boundary potential is introduced to prevent structures belonging to the outer retinal layers from being detected as part of the segmentation. The average intensities in the inner and outer regions are then rescaled locally to account for the different brightness values occurring around the ONH center. Appropriate values for the parameters of the computational model are found using an optimization based on the differential evolution algorithm. The evaluation showed that the proposed framework significantly improved segmentation results compared to a commercial solution.
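The Chan-Vese data term underlying the method above alternates between estimating the mean intensities inside and outside the contour and reassigning pixels to the closer mean. A minimal sketch of just that region-fitting step, omitting the contour-length term, the local intensity fitting energy, and the boundary potential the paper adds:

```python
import numpy as np

def chan_vese_fit(img, n_iter=20):
    """Simplified Chan-Vese region fitting (data term only).

    Alternates between (1) computing the mean intensity inside (c1) and
    outside (c2) the current region and (2) reassigning each pixel to the
    region whose mean is closer, i.e. minimising the piecewise-constant
    fitting energy. The full model also regularises the contour length
    and, as in the paper above, adds a local intensity fitting energy.
    """
    mask = img > img.mean()                 # initial region: global threshold
    for _ in range(n_iter):
        c1 = img[mask].mean() if mask.any() else 0.0
        c2 = img[~mask].mean() if (~mask).any() else 0.0
        new_mask = (img - c1) ** 2 < (img - c2) ** 2
        if np.array_equal(new_mask, mask):  # converged
            break
        mask = new_mask
    return mask

# Toy image: a bright square on a dark, mildly noisy background.
rng = np.random.default_rng(1)
img = rng.normal(0.1, 0.02, (32, 32))
img[8:24, 8:24] += 0.8
seg = chan_vese_fit(img)
print(seg[16, 16], seg[0, 0])  # True False
```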

    Deep Representation Learning with Limited Data for Biomedical Image Synthesis, Segmentation, and Detection

    Biomedical imaging requires accurate expert annotation and interpretation that can aid medical staff and clinicians in automating differential diagnosis and addressing underlying health conditions. With the advent of deep learning, training with large image datasets has become the standard way to reach expert-level performance in non-invasive biomedical imaging tasks. However, when large publicly available datasets are lacking, training a deep learning model to learn intrinsic representations becomes harder. Representation learning with limited data has introduced new learning techniques, such as generative adversarial networks, semi-supervised learning, and self-supervised learning, that can be applied to various biomedical applications. For example, ophthalmologists use color funduscopy (CF) and fluorescein angiography (FA) to diagnose retinal degenerative diseases. However, fluorescein angiography requires injecting a dye, which can cause adverse reactions in patients. To alleviate this, a non-invasive technique needs to be developed that can synthesize fluorescein angiography from fundus images. Similarly, color funduscopy and optical coherence tomography (OCT) are used to semantically segment the vasculature and fluid build-up in spatial and volumetric retinal imaging, which can help with disease prognosis. Although many automated techniques have been proposed for medical image segmentation, the main drawback is the model's precision in pixel-wise predictions. Another critical challenge in biomedical imaging is accurately segmenting and quantifying the dynamic behavior of calcium signals in cells. Calcium imaging is a widely used approach to studying subcellular calcium activity and cell function; however, large datasets have created a profound need for fast, accurate, and standardized analyses of calcium signals.
For example, image sequences of calcium signals in colonic pacemaker cells (interstitial cells of Cajal, ICC) suffer from motion artifacts and high periodic and sensor noise, making it difficult to accurately segment and quantify calcium signal events. Moreover, it is time-consuming and tedious to annotate such a large volume of calcium image stacks or videos and extract their associated spatiotemporal maps. To address these problems, we propose various deep representation learning architectures that use limited labels and annotations to address the critical challenges in these biomedical applications. To this end, we detail our proposed semi-supervised, generative adversarial network and transformer-based architectures for individual learning tasks such as retinal image-to-image translation, vessel and fluid segmentation from fundus and OCT images, breast micro-mass segmentation, and sub-cellular calcium event tracking from videos with spatiotemporal map quantification. We also illustrate two multi-modal multi-task learning frameworks with applications that can be extended to other domains of biomedical applications. The main idea is to incorporate each of these as individual modules in our proposed multi-modal frameworks to solve the existing challenges of 1) fluorescein angiography synthesis, 2) retinal vessel and fluid segmentation, 3) breast micro-mass segmentation, and 4) dynamic quantification of calcium imaging datasets.
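Image-to-image translation of the kind needed for fundus-to-FA synthesis is commonly trained with a pix2pix-style generator objective: an adversarial term plus a weighted L1 reconstruction term. A minimal numeric sketch, where the weighting and all values are illustrative and not from the dissertation:

```python
import numpy as np

def generator_loss(d_fake, fake_fa, real_fa, lam=100.0):
    """Pix2pix-style generator objective for fundus -> FA translation.

    d_fake:  discriminator scores on generated FA images, in (0, 1)
    fake_fa: generated angiography images
    real_fa: ground-truth angiography images
    Returns adversarial BCE (the generator wants d_fake -> 1) plus a
    weighted L1 reconstruction term.
    """
    adv = -np.mean(np.log(d_fake + 1e-12))   # BCE against the "real" label
    l1 = np.mean(np.abs(fake_fa - real_fa))  # pixel-wise reconstruction error
    return adv + lam * l1

# Illustrative numbers only: a discriminator mostly fooled by the generator
# and a small residual reconstruction error.
loss = generator_loss(np.array([0.9, 0.8]),
                      np.zeros((2, 8, 8)),
                      np.full((2, 8, 8), 0.01))
print(round(loss, 3))  # 1.164
```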

    Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation

    In this paper, we examine the recent Segment Anything Model (SAM) on medical images, and report both quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks, covering various imaging modalities, such as optical coherence tomography (OCT), magnetic resonance imaging (MRI), and computed tomography (CT), as well as different applications including dermatology, ophthalmology, and radiology. These benchmarks are representative and commonly used in model development. Our experimental results indicate that while SAM presents remarkable segmentation performance on images from the general domain, its zero-shot segmentation ability remains restricted for out-of-distribution images, e.g., medical images. In addition, SAM exhibits inconsistent zero-shot segmentation performance across different unseen medical domains. For certain structured targets, e.g., blood vessels, SAM's zero-shot segmentation failed completely. In contrast, simple fine-tuning with a small amount of data leads to remarkable improvement in segmentation quality, showing the great potential and feasibility of using fine-tuned SAM to achieve accurate medical image segmentation for precision diagnostics. Our study indicates the versatility of generalist vision foundation models on medical imaging, and their great potential to achieve the desired performance through fine-tuning and eventually address the challenges of accessing large and diverse medical datasets in support of clinical diagnostics.
    Comment: Published in Diagnostic
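Zero-shot segmentation quality of the kind reported above is commonly measured with the Dice similarity coefficient; a minimal sketch with toy masks, not the paper's benchmarks:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks, the standard
    metric for this kind of segmentation evaluation."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy masks: the prediction recovers 3 of the target's 4 pixels and adds
# 1 false-positive pixel, so Dice = 2*3 / (4 + 4) = 0.75.
target = np.zeros((4, 4), int)
target[1:3, 1:3] = 1
pred = np.zeros((4, 4), int)
pred[1:3, 1:2] = 1
pred[1, 2] = 1
pred[3, 3] = 1
print(round(dice(pred, target), 3))  # 0.75
```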

    A Foundation LAnguage-Image model of the Retina (FLAIR): Encoding expert knowledge in text supervision

    Foundation vision-language models are currently transforming computer vision, and are on the rise in medical imaging, fueled by their very promising generalization capabilities. However, initial attempts to transfer this new paradigm to medical imaging have shown less impressive performance than observed in other domains, due to the significant domain shift and the complex, expert domain knowledge inherent to medical-imaging tasks. Motivated by the need for domain-expert foundation models, we present FLAIR, a pre-trained vision-language model for universal retinal fundus image understanding. To this end, we compiled 37 open-access, mostly categorical fundus imaging datasets from various sources, with up to 97 different target conditions and 284,660 images. We integrate the experts' domain knowledge in the form of descriptive textual prompts, during both pre-training and zero-shot inference, enhancing the less-informative categorical supervision of the data. This textual expert knowledge, which we compiled from the relevant clinical literature and community standards, describes the fine-grained features of the pathologies as well as the hierarchies and dependencies between them. We report comprehensive evaluations, which illustrate the benefit of integrating expert knowledge and the strong generalization capabilities of FLAIR under difficult scenarios with domain shifts or unseen categories. When adapted with a lightweight linear probe, FLAIR outperforms fully-trained, dataset-focused models, more so in the few-shot regimes. Interestingly, FLAIR outperforms more generalist, larger-scale image-language models by a large margin, which emphasizes the potential of embedding experts' domain knowledge and the limitations of generalist models in medical imaging.
    Comment: The pre-trained model is available at: https://github.com/jusiro/FLAI
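Zero-shot inference in this style scores an image embedding against one text-prompt embedding per class by cosine similarity; a minimal sketch with toy embeddings, where the prompt wording and vectors are illustrative, not FLAIR's:

```python
import numpy as np

def zero_shot_classify(img_emb, text_embs):
    """Zero-shot prediction in the vision-language style: cosine similarity
    between an image embedding and one text-prompt embedding per class.

    img_emb:   (d,) image embedding
    text_embs: (n_classes, d) embeddings of descriptive prompts such as
               "a fundus photograph with hard exudates" (illustrative)
    Returns the index of the most similar prompt, i.e. the predicted class.
    """
    img = img_emb / np.linalg.norm(img_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return int(np.argmax(txt @ img))

# Toy embeddings: the image vector points closest to class 1's prompt.
img = np.array([0.1, 1.0, 0.0])
prompts = np.array([[1.0, 0.0, 0.0],   # class 0 prompt embedding
                    [0.0, 1.0, 0.1],   # class 1 prompt embedding
                    [0.0, 0.0, 1.0]])  # class 2 prompt embedding
print(zero_shot_classify(img, prompts))  # 1
```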