21 research outputs found

    A new technique for cataract eye disease diagnosis in deep learning

    Automated diagnosis of eye diseases from fundus images is challenging because manual analysis is time-consuming, error-prone, and complex. Computer-aided tools for automatically detecting various ocular disorders from fundus images are therefore needed. Deep learning algorithms enable improved image classification, making automated, targeted ocular disease detection feasible. This study employed state-of-the-art deep learning image classifiers, such as VGG-19 (Visual Geometry Group), to categorize the highly imbalanced ODIR-5K (Ocular Disease Intelligent Recognition) dataset of 5,000 fundus images spanning eight disease classes, including cataract, glaucoma, diabetic retinopathy, and age-related macular degeneration. To address this imbalance, the multiclass problem was converted into binary classification tasks with equal samples in each category. The dataset was preprocessed and augmented to generate balanced datasets, and the binary classifiers were trained on the balanced data using the VGG-19 model. This approach achieved 95% accuracy for distinguishing normal versus cataract cases in only 15 epochs, outperforming previous methods. Precision and recall were high for both classes (normal and cataract), with F1 scores of 0.95-0.96. Balancing the dataset and using deep VGG-19 classifiers significantly improved the accuracy of automated eye disease diagnosis from fundus images. With further research, this approach could lead to AI (artificial intelligence)-assisted tools for ophthalmologists to screen patients and support clinical decision-making.
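The balancing step described above (converting the imbalanced multiclass problem into binary tasks with equal samples per class) can be sketched as follows. The function name, label strings, and undersampling strategy are illustrative assumptions, not taken from the paper:

```python
import random
from collections import defaultdict

def make_balanced_binary(samples, positive_class, seed=0):
    """Convert an imbalanced multiclass dataset into a balanced binary task.

    `samples` is a list of (image_path, label) pairs; `positive_class` is
    the disease label (e.g. "cataract") to separate from "normal".
    Hypothetical helper: undersampling is one of several balancing options.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    pos = by_label[positive_class]
    neg = by_label["normal"]
    n = min(len(pos), len(neg))  # undersample the majority class
    pos, neg = rng.sample(pos, n), rng.sample(neg, n)
    dataset = [(p, 1) for p in pos] + [(p, 0) for p in neg]
    rng.shuffle(dataset)  # mix positives and negatives before training
    return dataset
```

Each disease class would get its own balanced normal-vs-disease dataset built this way, with augmentation used to enlarge the minority class when undersampling discards too much data.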

    Multitask learning system for the segmentation of dark and bright retinal lesions in fundus images

    This master's thesis focuses on automatic diagnosis from fundus images, which provide a two-dimensional colour representation of the surface of the retina. Its aim is to propose a method for the simultaneous automatic segmentation of the lesions that can be observed in the retina, grouped into two categories: red and bright. Obtaining this double segmentation simultaneously is a novel approach; most previous works focus on the detection of a single type of lesion. However, due to time constraints and the tedious nature of this work, clinicians usually cannot test all the existing methods. Moreover, from a screening perspective, the clinician has no a priori knowledge of the pathology at hand, and thus of which algorithm to start with. The proposed algorithm therefore needs to be versatile, fast, and easily deployable.
Encouraged by recent progress in machine learning (and especially deep learning), we developed a novel convolutional neural network able to segment both types of lesions in fundus images. Our methodology relies on a new multitask architecture trained with a hybrid scheme combining fully and weakly supervised learning. The architecture uses hard parameter sharing: two decoders (one per type of lesion) share a single encoder. The encoder is thus trained to derive an abstract representation of the input image whose features permit discrimination between bright and red lesions; in other words, it learns to distinguish pathological tissue from normal tissue. Training proceeds in two steps. In the first, the whole architecture is trained on image patches with pixel-level ground truth, the typical way of training a segmentation network. The second step uses weak supervision: only the encoder is trained, on full images, to predict the status of each image (pathological or healthy) without specifying anything about the potential lesions (neither location nor type); here the ground truth is a simple boolean label. This second step lets the network see a much larger number of images, since image-level labels are considerably easier to acquire and already available in large public databases. It relies on the hypothesis that an annotation at the image level (globally) can enhance performance at the pixel level (locally), an intuitive idea since pathological status is directly correlated with the presence of lesions.
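The hard-parameter-sharing scheme described above can be sketched in PyTorch. The layer sizes and module names below are illustrative stand-ins for the thesis's actual architecture, not a reproduction of it:

```python
import torch
import torch.nn as nn

class MultitaskLesionNet(nn.Module):
    """Sketch: one shared encoder, two lesion-specific decoders, plus an
    image-level head for the weakly supervised second training step."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )

        def decoder():
            return nn.Sequential(
                nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
                nn.Conv2d(16, 1, 1), nn.Sigmoid(),  # per-pixel lesion probability
            )

        self.bright_decoder = decoder()
        self.red_decoder = decoder()
        # scalar pathological/healthy prediction used during weak supervision
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1), nn.Sigmoid()
        )

    def forward(self, x, weak=False):
        feats = self.encoder(x)  # features shared by both decoders
        if weak:  # full image in, single boolean-like score out
            return self.classifier(feats)
        return self.bright_decoder(feats), self.red_decoder(feats)
```

In the first training step, both decoder outputs would be compared against pixel-level ground truth on patches; in the second, only the encoder (through the classifier head) would receive gradients from image-level labels.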

    Deep learning for diabetic retinopathy detection and classification based on fundus images: A review.

    Diabetic retinopathy is a retinal disease caused by diabetes mellitus and a leading cause of blindness globally. Early detection and treatment are necessary to delay or avoid vision deterioration and loss. To that end, the research community has proposed many artificial-intelligence-powered methods for detecting and classifying diabetic retinopathy in fundus images. This review article provides a thorough analysis of the use of deep learning methods at the various steps of the fundus-image-based diabetic retinopathy detection pipeline. We discuss several aspects of that pipeline, ranging from the datasets widely used by the research community and the preprocessing techniques employed (and how these accelerate and improve model performance) to the development of deep learning models for diagnosing and grading the disease and for localizing its lesions. We also discuss models that have been applied in real clinical settings. Finally, we conclude with some important insights and provide future research directions.

    Deep Learning Techniques for Automated Analysis and Processing of High Resolution Medical Imaging

    Programa Oficial de Doutoramento en Computación. 5009V01
[Abstract] Medical imaging plays a prominent role in modern clinical practice across numerous medical specialties. In ophthalmology, for instance, different imaging techniques are commonly used to visualize and study the eye fundus. In this context, automated image analysis methods are key to facilitating the early diagnosis and adequate treatment of several diseases. Deep learning algorithms have already demonstrated remarkable performance on many image analysis tasks. However, these approaches typically require large amounts of annotated data to train deep neural networks, which complicates their adoption in areas where large-scale annotated datasets are harder to obtain, such as medical imaging. This thesis aims to explore novel approaches for the automated analysis of medical images, particularly in ophthalmology. The main focus is the development of novel deep learning-based approaches that do not require large amounts of annotated training data and can be applied to high-resolution images. To that end, we present a novel paradigm that takes advantage of unlabeled complementary image modalities for training deep neural networks. Additionally, we develop novel approaches for the detailed analysis of eye fundus images, exploring the analysis of relevant retinal structures as well as the diagnosis of different retinal diseases. In general, the developed algorithms provide satisfactory results for the analysis of the eye fundus, even when limited annotated training data are available.
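One plausible reading of training on "unlabeled complementary image modalities" is a reconstruction pretext task: predict a paired complementary modality from the primary one, then reuse the pretrained encoder downstream. The sketch below is a hypothetical illustration of that idea under those assumptions, not the thesis's exact formulation:

```python
import torch
import torch.nn as nn

# Hypothetical pretext task: predict a 1-channel complementary modality
# (e.g. an angiography-like image) from a 3-channel fundus image. No manual
# labels are needed: the paired image itself is the training target.
encoder = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
head = nn.Conv2d(8, 1, 1)  # reconstructs the complementary modality

fundus = torch.randn(4, 3, 32, 32)         # unlabeled primary modality
complementary = torch.randn(4, 1, 32, 32)  # paired, also unlabeled

opt = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-3
)
pred = head(encoder(fundus))
loss = nn.functional.l1_loss(pred, complementary)  # reconstruction loss
loss.backward()
opt.step()
# after pretraining, `encoder` can be fine-tuned on a small annotated set
```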

    Automated Analysis of Retinal and Choroidal OCT and OCTA Images in AMD

    Age-related macular degeneration (AMD) is a progressive eye disease which manifests primarily in the outer retina and choroid. The research project aimed to determine whether measures obtained from optical coherence tomography (OCT) and OCT angiography (OCTA) images could provide novel AMD biomarker insight and an early disease detection method. To that end, an OCT- and OCTA-enabled device was used to image AMD subjects and controls. At the selected scan size, each acquisition of one eye provides a data volume composed of 300 cross-sectional images termed B-scans. In total, scans of 10 eyes from subjects with early and intermediate AMD (3,000 B-scans) plus one case of neovascular AMD, 12 eyes from subjects over 50 years old (3,600 B-scans), and 11 eyes from subjects under 50 years old (3,300 B-scans) were obtained. Five feature extraction methods were reproduced or developed to determine whether significant differences could be observed at the eye level between the early and intermediate AMD subjects and age-matched controls. Through non-parametric testing, it was established that two AMD biomarker extraction methods (choriocapillaris flow-void analysis and a drusen segmentation method) produced measures that showed significant differences between groups and were uniformly represented across the frontal plane of the eye. It was then desired to leverage these measures to generate an interpretable, B-scan-level, machine-learning-based AMD classification model.
Frequency spectra resulting from fast Fourier transforms of spatial series derived from measures believed to be representative of the two biomarkers were obtained and used as features to train a random forest and a deep forest classifier. Principal component analysis (PCA) was used to reduce the dimensionality of the feature space, and model performance and predictor importance were assessed. A new method was devised that allows automated 3D reconstruction and quantitative evaluation of retinal flow-signal patterns and, incidentally, of the retinal microvasculature. Measures representative of drusen and the choriocapillaris were leveraged to create interpretable models for the classification of early and intermediate AMD. As the worldwide prevalence of AMD increases and OCT devices become more widely available, more highly trained personnel are needed to interpret the medical information and provide appropriate clinical care. Expert analysis and grading of AMD from OCT images are expensive and time-consuming. The proposed models could serve to automate AMD detection, even when it is asymptomatic, and signal to an ophthalmologist the need to monitor and treat the condition before severe vision loss occurs. The models are transparent and can provide a classification from a single cross-sectional image. The automated diagnostic tool could therefore also be used in situations where only partial medical data are available, or where access to healthcare resources is limited.
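The feature pipeline described above (spatial series, FFT magnitude spectra, PCA, random forest) can be sketched with NumPy and scikit-learn. The synthetic data and all dimensions below are placeholder assumptions, not the thesis's measurements:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real data: one spatial series per B-scan
# (e.g. a drusen or flow-void measure sampled along the scan).
n_scans, series_len = 120, 64
series = rng.normal(size=(n_scans, series_len))
labels = rng.integers(0, 2, size=n_scans)  # 0 = control, 1 = AMD

# Features: magnitude of the one-sided FFT of each spatial series.
features = np.abs(np.fft.rfft(series, axis=1))

# Reduce dimensionality with PCA, then fit a random forest classifier.
pca = PCA(n_components=10).fit(features)
reduced = pca.transform(features)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(reduced, labels)
importances = clf.feature_importances_  # predictor importance, as assessed above
```

A deep forest classifier would slot into the same pipeline in place of `RandomForestClassifier`; only the final estimator changes.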

    Advanced Representation Learning for Dense Prediction Tasks in Medical Image Analysis

    Machine learning is a rapidly growing field of artificial intelligence that allows computers to learn and make predictions from human-labeled data. However, traditional machine learning methods have many drawbacks: they are time-consuming, inefficient, biased toward specific tasks, and require a large amount of domain knowledge. Representation learning, a subfield of machine learning, focuses on learning meaningful and useful features or representations from input data. It aims to automatically learn relevant features from raw data, saving time, increasing efficiency and generalization, and reducing reliance on expert knowledge. Recently, deep learning has further accelerated the development of representation learning: it leverages deep architectures to extract complex and abstract representations, yielding significant improvements in many areas. In computer vision, deep learning has made remarkable progress, particularly on high-level and real-world tasks. Because deep learning methods do not require handcrafted features and can understand complex visual information, they enable researchers to design automated systems that make accurate diagnoses and interpretations, especially in medical image analysis. Deep learning has achieved state-of-the-art performance in many medical image analysis tasks, such as regression/classification, generation, and segmentation. Compared to regression/classification, medical image generation and segmentation are more complex dense prediction tasks that must understand semantic representations and produce pixel-level predictions. This thesis focuses on designing representation learning methods to improve the performance of dense prediction tasks in medical image analysis. With advances in imaging technology, more complex medical images are becoming available in this field.
In contrast to traditional machine learning algorithms, current deep learning-based representation learning methods provide an end-to-end approach that automatically extracts representations from complex data without manual feature engineering. In medical image analysis, three challenges in particular call for advanced representation learning architectures: limited labeled medical images, overfitting on limited data, and lack of interpretability. To address these challenges, we design robust representation learning architectures for the two main directions of dense prediction tasks, namely medical image generation and segmentation. For medical image generation, we focus on chromosome straightening: generating a straightened chromosome image from a curved chromosome input. The challenges of this task include insufficient training images and corresponding ground truth, as well as the non-rigid nature of chromosomes, which leads to distorted details and shapes after straightening. We first propose a study of the chromosome straightening task, introducing a novel framework based on image-to-image translation and demonstrating its efficacy and robustness in generating straightened chromosomes. The framework addresses the challenge of limited training data and outperforms existing studies. We then present a follow-up study that addresses the limitations of our previous framework, achieving new state-of-the-art performance along with better interpretability and generalization. This robust chromosome straightening framework, named Vit-Patch GAN, instead learns the motion representation of chromosomes for straightening while retaining more details of shape and banding patterns. For medical image segmentation, we focus on fovea localization, recast from localization to small-region segmentation.
Accurate segmentation of the fovea region is crucial for monitoring and analyzing retinal diseases to prevent irreversible vision loss. The task also requires incorporating global features to identify the fovea region reliably and to handle hard cases caused by retinal diseases and non-standard fovea locations. We first propose Bilateral-ViT, a novel two-branch, vision-transformer-based architecture for fovea localization that incorporates global image context and blood vessel structure; it surpasses existing methods and achieves state-of-the-art results on two public datasets. We then propose a subsequent method to further improve fovea localization: a novel dual-stream deep learning architecture called Bilateral-Fuser. In contrast to Bilateral-ViT, Bilateral-Fuser globally incorporates long-range connections from multiple cues, including the fundus image and the vessel distribution. Moreover, with the newly designed Bilateral Token Incorporation module, Bilateral-Fuser learns anatomy-aware tokens, significantly reducing computational cost while achieving new state-of-the-art performance. Our comprehensive experiments also show that Bilateral-Fuser achieves better accuracy and robustness on both normal and diseased retina images, with excellent generalization capability.
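The two-branch idea described above (one stream for the fundus image, one for vessel structure, fused before prediction) can be sketched as follows. Plain convolutions stand in for the transformer components, and every layer size and name is an assumption for illustration only:

```python
import torch
import torch.nn as nn

class TwoBranchFoveaNet(nn.Module):
    """Sketch of a two-branch fusion network: the fundus image and a
    vessel-structure map are encoded separately, then fused channel-wise
    before a small segmentation head."""

    def __init__(self):
        super().__init__()
        self.image_branch = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.vessel_branch = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU())
        self.head = nn.Sequential(nn.Conv2d(16, 1, 1), nn.Sigmoid())

    def forward(self, image, vessels):
        # concatenate the two feature streams along the channel dimension
        fused = torch.cat(
            [self.image_branch(image), self.vessel_branch(vessels)], dim=1
        )
        return self.head(fused)  # per-pixel fovea-region probability
```

The actual architectures additionally use long-range (transformer-style) connections between the streams rather than a single late concatenation.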