838 research outputs found

    An In-Depth Statistical Review of Retinal Image Processing Models from a Clinical Perspective

    The burgeoning field of retinal image processing is critical in facilitating early diagnosis and treatment of retinal diseases, which are among the leading causes of vision impairment globally. Despite rapid advancements, existing machine learning models for retinal image processing exhibit significant limitations, including disparities in pre-processing, segmentation, and classification methodologies, as well as inconsistencies in post-processing operations. These limitations hinder the realization of accurate, reliable, and clinically relevant outcomes. This paper provides an in-depth statistical review of extant machine learning models used in retinal image processing, meticulously comparing them based on their internal operating characteristics and performance levels. By adopting a robust analytical approach, our review delineates the strengths and weaknesses of current models, offering comprehensive insights that are instrumental in guiding future research and development in this domain. Furthermore, this review underscores the potential clinical impacts of these models, highlighting their pivotal role in enhancing diagnostic accuracy, prognostic assessments, and therapeutic interventions for retinal disorders. In conclusion, our work not only bridges the existing knowledge gap in the literature but also paves the way for the evolution of more sophisticated and clinically aligned retinal image processing models, ultimately contributing to improved patient outcomes and advancements in ophthalmic care.

    Blended Multi-Modal Deep ConvNet Features for Diabetic Retinopathy Severity Prediction

    Diabetic Retinopathy (DR) is one of the major causes of visual impairment and blindness across the world. It is usually found in patients who have suffered from diabetes for a long period. The major focus of this work is to derive an optimal representation of retinal images that further helps to improve the performance of DR recognition models. To extract this optimal representation, features extracted from multiple pre-trained ConvNet models are blended using the proposed multi-modal fusion module. These final representations are used to train a Deep Neural Network (DNN) for DR identification and severity level prediction. As each ConvNet extracts different features, fusing them using 1D pooling and cross pooling leads to a better representation than using features extracted from a single ConvNet. Experimental studies on the benchmark Kaggle APTOS 2019 contest dataset reveal that the model trained on the proposed blended feature representations is superior to existing methods. In addition, we notice that cross average pooling based fusion of features from Xception and VGG16 is the most appropriate for DR recognition. With the proposed model, we achieve an accuracy of 97.41% and a kappa statistic of 94.82 for DR identification, and an accuracy of 81.7% and a kappa statistic of 71.1% for severity level prediction. Another interesting observation is that a DNN with dropout at the input layer converges more quickly when trained using blended features, compared to the same model trained using uni-modal deep features. Comment: 18 pages, 8 figures, published in the MDPI journal Electronics.
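The fusion idea described above can be pictured with a rough sketch: cross average pooling acts as an element-wise average of two aligned ConvNet feature vectors, while 1D pooling reduces a single feature vector. The feature dimensions and the truncation-based alignment below are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def cross_average_pool(f1, f2):
    """Element-wise average of two feature vectors (cross average pooling sketch).

    Truncating to the shorter length is an assumption made here for
    illustration; the paper's actual alignment strategy may differ.
    """
    n = min(len(f1), len(f2))
    return (f1[:n] + f2[:n]) / 2.0

def pool_1d(f, window=2):
    """Non-overlapping 1D average pooling over a single feature vector."""
    n = len(f) - len(f) % window  # drop any remainder that does not fill a window
    return f[:n].reshape(-1, window).mean(axis=1)

rng = np.random.default_rng(0)
xception_feat = rng.standard_normal(2048)  # stand-in for Xception pooled features
vgg16_feat = rng.standard_normal(512)      # stand-in for VGG16 pooled features

blended = cross_average_pool(xception_feat, vgg16_feat)  # 512-d blended vector
reduced = pool_1d(xception_feat, window=4)               # 2048-d -> 512-d
```

In this framing, cross pooling mixes information across models at matching positions, whereas 1D pooling compresses a single model's representation.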

    A Survey on Deep Learning in Medical Image Analysis

    Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201

    Deep Learning Techniques for Automated Analysis and Processing of High Resolution Medical Imaging

    Programa Oficial de Doutoramento en Computación. 5009V01 [Abstract] Medical imaging plays a prominent role in modern clinical practice for numerous medical specialties. For instance, in ophthalmology, different imaging techniques are commonly used to visualize and study the eye fundus. In this context, automated image analysis methods are key towards facilitating the early diagnosis and adequate treatment of several diseases. Nowadays, deep learning algorithms have already demonstrated a remarkable performance for different image analysis tasks. However, these approaches typically require large amounts of annotated data for the training of deep neural networks. This complicates the adoption of deep learning approaches, especially in areas where large scale annotated datasets are harder to obtain, such as in medical imaging. This thesis aims to explore novel approaches for the automated analysis of medical images, particularly in ophthalmology. In this regard, the main focus is on the development of novel deep learning-based approaches that do not require large amounts of annotated training data and can be applied to high resolution images. For that purpose, we have presented a novel paradigm that allows to take advantage of unlabeled complementary image modalities for the training of deep neural networks. Additionally, we have also developed novel approaches for the detailed analysis of eye fundus images. In that regard, this thesis explores the analysis of relevant retinal structures as well as the diagnosis of different retinal diseases. In general, the developed algorithms provide satisfactory results for the analysis of the eye fundus, even when limited annotated training data is available.

    Learnable Ophthalmology SAM

    Segmentation is vital for ophthalmology image analysis, but the variety of imaging modalities hinders the application of most existing segmentation algorithms, which rely on training with large numbers of labels or generalize weakly. Based on Segment Anything (SAM), we propose a simple but effective learnable prompt layer suitable for multi-target segmentation in multi-modal ophthalmology images, named Learnable Ophthalmology Segment Anything (SAM). The learnable prompt layer learns medical prior knowledge from each transformer layer. During training, we train only the prompt layer and the task head, based on a one-shot mechanism. We demonstrate the effectiveness of our approach on four medical segmentation tasks across nine publicly available datasets. Moreover, we provide a new perspective on applying existing fundamental computer vision models in the medical field. Our code is available at \href{https://github.com/Qsingle/LearnablePromptSAM}{website}.
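The training recipe above, keeping the pretrained backbone frozen while updating only a prompt and a task head, can be illustrated with a toy numerical sketch. Everything here (shapes, the single tanh layer standing in for a SAM transformer block, and the finite-difference optimizer) is a hypothetical stand-in, not the actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen stand-in for one pretrained transformer layer: never updated.
W_frozen = rng.standard_normal((8, 8)) * 0.1
prompt = np.zeros(8)                  # learnable prompt, added at the layer input
head = rng.standard_normal(8) * 0.1   # trainable task head

def forward(x, prompt, head):
    h = np.tanh(W_frozen @ (x + prompt))  # the prompt modulates the frozen layer
    return head @ h

def loss(prompt, head, x, y):
    return (forward(x, prompt, head) - y) ** 2

x, y = rng.standard_normal(8), 1.0
eps, lr = 1e-5, 0.05
eye = np.eye(8)
base = loss(prompt, head, x, y)
# Finite-difference gradients of the loss w.r.t. prompt and head ONLY.
grad_p = np.array([(loss(prompt + eps * eye[i], head, x, y) - base) / eps
                   for i in range(8)])
grad_h = np.array([(loss(prompt, head + eps * eye[i], x, y) - base) / eps
                   for i in range(8)])
prompt = prompt - lr * grad_p  # only prompt and head take gradient steps;
head = head - lr * grad_h      # W_frozen is left untouched throughout
after = loss(prompt, head, x, y)
```

The point of the sketch is the parameter split: the trainable set is tiny relative to the frozen backbone, which is what makes this style of adaptation cheap.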

    Multi-branch Convolutional Neural Network for Multiple Sclerosis Lesion Segmentation

    In this paper, we present an automated approach for segmenting multiple sclerosis (MS) lesions from multi-modal brain magnetic resonance images. Our method is based on a deep end-to-end 2D convolutional neural network (CNN) for slice-based segmentation of 3D volumetric data. The proposed CNN includes a multi-branch downsampling path, which enables the network to encode information from multiple modalities separately. Multi-scale feature fusion blocks are proposed to combine feature maps from different modalities at different stages of the network. Then, multi-scale feature upsampling blocks are introduced to upsize the combined feature maps and leverage information about lesion shape and location. We trained and tested the proposed model using orthogonal plane orientations of each 3D modality to exploit the contextual information in all directions. The proposed pipeline is evaluated on two different datasets: a private dataset including 37 MS patients and a publicly available dataset known as the ISBI 2015 longitudinal MS lesion segmentation challenge dataset, consisting of 14 MS patients. On the ISBI challenge, at the time of submission, our method was among the top-performing solutions. On the private dataset, using the same array of performance metrics as in the ISBI challenge, the proposed approach shows substantial improvements in MS lesion segmentation compared with other publicly available tools. Comment: This paper has been accepted for publication in NeuroImage.
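The per-modality encoding idea above can be sketched minimally: each modality passes through its own downsampling branch, and the resulting feature maps are fused before any upsampling. The modality names, sizes, and the bare average-pooling "branch" are hypothetical placeholders for the actual CNN.

```python
import numpy as np

def downsample(x):
    """2x2 average pooling: one downsampling step for a single-channel slice."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2  # trim to even dimensions
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def fuse(*feature_maps):
    """Fusion sketch: stack per-modality feature maps along a channel axis."""
    return np.stack(feature_maps, axis=0)

rng = np.random.default_rng(0)
flair = rng.standard_normal((128, 128))  # hypothetical FLAIR slice
t1 = rng.standard_normal((128, 128))     # hypothetical T1 slice

f_flair = downsample(flair)  # each modality is encoded in its own branch...
f_t1 = downsample(t1)
fused = fuse(f_flair, f_t1)  # ...and combined before the upsampling path
```

Keeping the branches separate until fusion lets each modality develop its own low-level filters, which is the stated motivation for the multi-branch path.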

    Is attention all you need in medical image analysis? A review

    Medical imaging is a key component in clinical diagnosis, treatment planning, and clinical trial design, accounting for almost 90% of all healthcare data. CNNs have achieved performance gains in medical image analysis (MIA) over recent years. CNNs can efficiently model local pixel interactions and can be trained on small-scale MI data. The main disadvantage of typical CNN models is that they ignore global pixel relationships within images, which limits their ability to generalise to out-of-distribution data with different 'global' information. Recent progress in Artificial Intelligence gave rise to Transformers, which can learn global relationships from data; however, full Transformer models need to be trained on large-scale data and involve tremendous computational complexity. Attention and Transformer compartments (Transf/Attention), which retain the capacity to model global relationships, have been proposed as lighter alternatives to full Transformers. Recently, there has been an increasing trend to cross-pollinate complementary local-global properties from CNN and Transf/Attention architectures, which has led to a new era of hybrid models. The past years have witnessed substantial growth in hybrid CNN-Transf/Attention models across diverse MIA problems. In this systematic review, we survey existing hybrid CNN-Transf/Attention models, review and unravel key architectural designs, analyse breakthroughs, and evaluate current and future opportunities as well as challenges. We also introduce a comprehensive analysis framework on generalisation opportunities of scientific and clinical impact, based on which new data-driven domain generalisation and adaptation methods can be stimulated.
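The local-global complementarity discussed above can be made concrete with a minimal sketch: a single self-attention head applied to a flattened CNN feature map lets every spatial position aggregate information from all others, which is exactly what a purely local convolution cannot do. All dimensions and weight initializations below are arbitrary illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilised softmax
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    """Single-head self-attention: every token attends to every other token."""
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[1]))  # (64, 64) attention map
    return scores @ v

rng = np.random.default_rng(0)
feat_map = rng.standard_normal((8, 8, 16))  # hypothetical CNN output: 8x8 grid, 16 channels
tokens = feat_map.reshape(-1, 16)           # flatten the spatial grid into 64 tokens
d = 16
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
out = self_attention(tokens, Wq, Wk, Wv)    # each output mixes all 64 positions
hybrid = out.reshape(8, 8, 16)              # reshape back for subsequent CNN stages
```

Hybrid models interleave blocks like this with convolutions, so local texture and global context are both represented without the cost of a full Transformer.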

    Self-Supervised Multimodal Reconstruction of Retinal Images Over Paired Datasets

    [Abstract] Data scarcity represents an important constraint for the training of deep neural networks in medical imaging. Medical image labeling, especially if pixel-level annotations are required, is an expensive task that needs expert intervention and usually results in a reduced number of annotated samples. In contrast, extensive amounts of unlabeled data are produced in the daily clinical practice, including paired multimodal images from patients that were subjected to multiple imaging tests. This work proposes a novel self-supervised multimodal reconstruction task that takes advantage of this unlabeled multimodal data for learning about the domain without human supervision. Paired multimodal data is a rich source of clinical information that can be naturally exploited by trying to estimate one image modality from others. This multimodal reconstruction requires the recognition of domain-specific patterns that can be used to complement the training of image analysis tasks in the same domain for which annotated data is scarce. In this work, a set of experiments is performed using a multimodal setting of retinography and fluorescein angiography pairs that offer complementary information about the eye fundus. The evaluations performed on different public datasets, which include pathological and healthy data samples, demonstrate that a network trained for self-supervised multimodal reconstruction of angiography from retinography achieves unsupervised recognition of important retinal structures. These results indicate that the proposed self-supervised task provides relevant cues for image analysis tasks in the same domain. This work is supported by Instituto de Salud Carlos III, Government of Spain, and the European Regional Development Fund (ERDF) of the European Union (EU) through the DTS18/00136 research project, and by Ministerio de Economía, Industria y Competitividad, Government of Spain, through the DPI2015-69948-R research project. 
The authors of this work also receive financial support from the ERDF and Xunta de Galicia through Grupo de Referencia Competitiva, Ref. ED431C 2016-047, and from the European Social Fund (ESF) of the EU and Xunta de Galicia through the predoctoral grant contract Ref. ED481A-2017/328. CITIC, Centro de Investigación de Galicia Ref. ED431G 2019/01, receives financial support from Consellería de Educación, Universidade e Formación Profesional, Xunta de Galicia, through the ERDF (80%) and Secretaría Xeral de Universidades (20%).
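The self-supervised objective described in the abstract above, estimating one modality from its paired counterpart, can be sketched with a toy linear model standing in for the reconstruction network. The synthetic paired data and the least-squares solver are illustrative assumptions only; the actual work uses real retinography/angiography pairs and a deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for paired modalities: a hypothetical linear relation plus
# noise substitutes for real retinography (source) and angiography (target).
n_pairs, dim = 200, 16
retino = rng.standard_normal((n_pairs, dim))
true_map = rng.standard_normal((dim, dim)) * 0.3
angio = retino @ true_map + 0.01 * rng.standard_normal((n_pairs, dim))

# Self-supervised objective: predict the target modality from the source.
# No human labels are involved -- the paired image itself is the supervision.
W, *_ = np.linalg.lstsq(retino, angio, rcond=None)
reconstruction = retino @ W
mse = np.mean((reconstruction - angio) ** 2)
baseline = np.mean((angio - angio.mean(axis=0)) ** 2)  # predicting the mean
```

The reconstruction error falling well below the trivial baseline is what indicates the model has captured cross-modal structure, which is the property the paper then reuses for downstream retinal analysis tasks.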