An In-Depth Statistical Review of Retinal Image Processing Models from a Clinical Perspective
The burgeoning field of retinal image processing is critical in facilitating early diagnosis and treatment of retinal diseases, which are amongst the leading causes of vision impairment globally. Despite rapid advancements, existing machine learning models for retinal image processing are characterized by significant limitations, including disparities in pre-processing, segmentation, and classification methodologies, as well as inconsistencies in post-processing operations. These limitations hinder the realization of accurate, reliable, and clinically relevant outcomes. This paper provides an in-depth statistical review of extant machine learning models used in retinal image processing, meticulously comparing them based on their internal operating characteristics and performance levels. By adopting a robust analytical approach, our review delineates the strengths and weaknesses of current models, offering comprehensive insights that are instrumental in guiding future research and development in this domain. Furthermore, this review underscores the potential clinical impacts of these models, highlighting their pivotal role in enhancing diagnostic accuracy, prognostic assessments, and therapeutic interventions for retinal disorders. In conclusion, our work not only bridges the existing knowledge gap in the literature but also paves the way for the evolution of more sophisticated and clinically-aligned retinal image processing models, ultimately contributing to improved patient outcomes and advancements in ophthalmic care.
Blended Multi-Modal Deep ConvNet Features for Diabetic Retinopathy Severity Prediction
Diabetic Retinopathy (DR) is one of the major causes of visual impairment and
blindness across the world. It is usually found in patients who have suffered
from diabetes for a long period. The major focus of this work is to derive an
optimal representation of retinal images that helps to improve the performance
of DR recognition models. To extract this representation, features extracted
from multiple pre-trained ConvNet models are blended using the proposed
multi-modal fusion module. These final representations are used to train a Deep
Neural Network (DNN) for DR identification and severity level prediction. As each
ConvNet extracts different features, fusing them using 1D pooling and cross
pooling leads to a better representation than using features extracted from a
single ConvNet. Experimental studies on the benchmark Kaggle APTOS 2019 contest
dataset reveal that the model trained on the proposed blended feature
representations is superior to the existing methods. In addition, we notice
that cross average pooling based fusion of features from Xception and VGG16 is
the most appropriate for DR recognition. With the proposed model, we achieve an
accuracy of 97.41%, and a kappa statistic of 94.82 for DR identification and an
accuracy of 81.7% and a kappa statistic of 71.1% for severity level prediction.
Another interesting observation is that a DNN with dropout at the input layer
converges more quickly when trained using blended features, compared to the
same model trained using uni-modal deep features.
Comment: 18 pages, 8 figures, published in Electronics MDPI journal
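The fusion step described in this abstract can be illustrated with a small sketch. The abstract does not define 1D pooling or cross pooling precisely, so the code below is only one plausible reading: 1D average pooling shrinks the longer feature vector to the length of the shorter one, and cross average pooling takes the element-wise mean of the two aligned vectors. The function names and the toy feature values are hypothetical and are not taken from the paper.

```python
def average_pool_1d(features, window):
    """Non-overlapping 1D average pooling over a flat feature vector."""
    if len(features) % window != 0:
        raise ValueError("feature length must be divisible by the window size")
    return [
        sum(features[i:i + window]) / window
        for i in range(0, len(features), window)
    ]


def cross_average_pool(features_a, features_b):
    """Element-wise mean of two equally long feature vectors (assumed meaning
    of 'cross average pooling'; the abstract does not spell it out)."""
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have the same length")
    return [(a + b) / 2 for a, b in zip(features_a, features_b)]


# Toy stand-ins for deep features; real ConvNet features have thousands of entries.
vgg_like = [0.0, 2.0, 4.0, 6.0, 1.0, 3.0, 5.0, 7.0]  # hypothetical 8-dim vector
xception_like = [1.0, 1.0, 2.0, 2.0]                  # hypothetical 4-dim vector

pooled = average_pool_1d(vgg_like, window=2)          # align lengths: 8 -> 4
blended = cross_average_pool(pooled, xception_like)   # fused representation
print(blended)  # [1.0, 3.0, 2.0, 4.0]
```

Under this reading, the blended vector would then serve as the input to the DNN classifier mentioned in the abstract.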
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Comment: Revised survey includes expanded discussion section and reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 201
Deep Learning Techniques for Automated Analysis and Processing of High Resolution Medical Imaging
Programa Oficial de Doutoramento en Computación. 5009V01
[Abstract]
Medical imaging plays a prominent role in modern clinical practice for numerous
medical specialties. For instance, in ophthalmology, different imaging techniques are
commonly used to visualize and study the eye fundus. In this context, automated
image analysis methods are key towards facilitating the early diagnosis and adequate
treatment of several diseases. Nowadays, deep learning algorithms have already
demonstrated a remarkable performance for different image analysis tasks. However,
these approaches typically require large amounts of annotated data for the training
of deep neural networks. This complicates the adoption of deep learning approaches,
especially in areas where large scale annotated datasets are harder to obtain, such
as in medical imaging.
This thesis aims to explore novel approaches for the automated analysis of medical
images, particularly in ophthalmology. In this regard, the main focus is on
the development of novel deep learning-based approaches that do not require large
amounts of annotated training data and can be applied to high resolution images.
For that purpose, we have presented a novel paradigm that makes it possible to take advantage
of unlabeled complementary image modalities for the training of deep neural
networks. Additionally, we have also developed novel approaches for the detailed
analysis of eye fundus images. In that regard, this thesis explores the analysis of
relevant retinal structures as well as the diagnosis of different retinal diseases. In
general, the developed algorithms provide satisfactory results for the analysis of the
eye fundus, even when limited annotated training data is available.
Learnable Ophthalmology SAM
Segmentation is vital for ophthalmology image analysis. However, the variety of
imaging modalities hinders the application of most existing segmentation
algorithms, as they either rely on training with a large number of labels or
generalize poorly. Based on Segment Anything (SAM), we propose a simple
but effective learnable prompt layer suitable for multiple target segmentation
in ophthalmology multi-modal images, named Learnable Ophthalmology Segment
Anything (SAM). The learnable prompt layer learns medical prior knowledge from
each transformer layer. During training, we only train the prompt layer and
task head based on a one-shot mechanism. We demonstrate the effectiveness of
our approach on four medical segmentation tasks using nine publicly
available datasets. Moreover, we offer a new direction for improving the
application of existing foundation CV models in the medical field. Our code is
available at https://github.com/Qsingle/LearnablePromptSAM
Multi-branch Convolutional Neural Network for Multiple Sclerosis Lesion Segmentation
In this paper, we present an automated approach for segmenting multiple
sclerosis (MS) lesions from multi-modal brain magnetic resonance images. Our
method is based on a deep end-to-end 2D convolutional neural network (CNN) for
slice-based segmentation of 3D volumetric data. The proposed CNN includes a
multi-branch downsampling path, which enables the network to encode information
from multiple modalities separately. Multi-scale feature fusion blocks are
proposed to combine feature maps from different modalities at different stages
of the network. Then, multi-scale feature upsampling blocks are introduced to
upsize combined feature maps to leverage information from lesion shape and
location. We trained and tested the proposed model using orthogonal plane
orientations of each 3D modality to exploit the contextual information in all
directions. The proposed pipeline is evaluated on two different datasets: a
private dataset including 37 MS patients and a publicly available dataset known
as the ISBI 2015 longitudinal MS lesion segmentation challenge dataset,
consisting of 14 MS patients. Considering the ISBI challenge, at the time of
submission, our method was amongst the top performing solutions. On the private
dataset, using the same array of performance metrics as in the ISBI challenge,
the proposed approach shows substantial improvements in MS lesion segmentation
compared with other publicly available tools.
Comment: This paper has been accepted for publication in NeuroImage
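The idea of feeding a 2D network with slices from orthogonal plane orientations of a 3D volume can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the volume is a nested list indexed as volume[z][y][x], and the names axial, coronal and sagittal are assumed labels for the three axes, since the actual anatomical orientation depends on how the scan is stored.

```python
def orthogonal_slices(volume):
    """Split a 3D volume (nested lists, indexed volume[z][y][x]) into the
    three stacks of 2D slices taken along each axis."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])

    # Slices perpendicular to z: each is a height x width image.
    axial = [[row[:] for row in volume[z]] for z in range(depth)]
    # Slices perpendicular to y: each is a depth x width image.
    coronal = [[volume[z][y][:] for z in range(depth)] for y in range(height)]
    # Slices perpendicular to x: each is a depth x height image.
    sagittal = [
        [[volume[z][y][x] for y in range(height)] for z in range(depth)]
        for x in range(width)
    ]
    return {"axial": axial, "coronal": coronal, "sagittal": sagittal}


# Toy 2 x 3 x 4 volume whose voxel value encodes its own coordinates.
toy_volume = [
    [[100 * z + 10 * y + x for x in range(4)] for y in range(3)]
    for z in range(2)
]
stacks = orthogonal_slices(toy_volume)
print({name: len(slices) for name, slices in stacks.items()})
# {'axial': 2, 'coronal': 3, 'sagittal': 4}
```

Each stack could then be passed slice by slice to a 2D segmentation network. How the per-orientation predictions are combined is not stated in the abstract and is left out here.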
Is attention all you need in medical image analysis? A review
Medical imaging is a key component in clinical diagnosis, treatment planning
and clinical trial design, accounting for almost 90% of all healthcare data.
CNNs have achieved performance gains in medical image analysis (MIA) in recent
years. CNNs can efficiently model local pixel interactions and be trained on
small-scale MI data. The main disadvantage of typical CNN models is that they
ignore global pixel relationships within images, which limits their
generalisation ability to understand out-of-distribution data with different
'global' information. The recent progress of Artificial Intelligence gave rise
to Transformers, which can learn global relationships from data. However, full
Transformer models need to be trained on large-scale data and involve
tremendous computational complexity. Attention and Transformer compartments
(Transf/Attention), which retain the ability to model global
relationships, have been proposed as lighter alternatives to full Transformers.
Recently, there has been an increasing trend to co-pollinate complementary
local-global properties from CNN and Transf/Attention architectures, which led
to a new era of hybrid models. The past years have witnessed substantial growth
in hybrid CNN-Transf/Attention models across diverse MIA problems. In this
systematic review, we survey existing hybrid CNN-Transf/Attention models,
review and unravel key architectural designs, analyse breakthroughs, and
evaluate current and future opportunities as well as challenges. We also
introduce a comprehensive analysis framework on generalisation opportunities
of scientific and clinical impact, based on which new data-driven domain
generalisation and adaptation methods can be stimulated.
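The contrast drawn above between local convolution and global attention can be made concrete with a small sketch of scaled dot-product self-attention. This is a simplified illustration under stated assumptions: the learned query, key and value projections of a real Transformer are omitted, so each token serves as its own query, key and value, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention without learned projections.

    Every output token is a weighted average of ALL input tokens, which is
    how attention captures global relationships in a single step.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        )
    return outputs, weights


# Three toy "patch embeddings"; in an image model there would be one per patch.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed, attention = self_attention(patches)
for row in attention:
    print([round(w, 3) for w in row])
```

Every attention weight is strictly positive, so each output position draws on every input position, whereas a convolution only sees a fixed local neighbourhood. The weight matrix grows quadratically with the number of tokens, which is one source of the computational cost the abstract mentions.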
Self-Supervised Multimodal Reconstruction of Retinal Images Over Paired Datasets
[Abstract]
Data scarcity represents an important constraint for the training of deep neural networks in medical imaging. Medical image labeling, especially if pixel-level annotations are required, is an expensive task that needs expert intervention and usually results in a reduced number of annotated samples. In contrast, extensive amounts of unlabeled data are produced in the daily clinical practice, including paired multimodal images from patients that were subjected to multiple imaging tests. This work proposes a novel self-supervised multimodal reconstruction task that takes advantage of this unlabeled multimodal data for learning about the domain without human supervision. Paired multimodal data is a rich source of clinical information that can be naturally exploited by trying to estimate one image modality from others. This multimodal reconstruction requires the recognition of domain-specific patterns that can be used to complement the training of image analysis tasks in the same domain for which annotated data is scarce.
In this work, a set of experiments is performed using a multimodal setting of retinography and fluorescein angiography pairs that offer complementary information about the eye fundus. The evaluations performed on different public datasets, which include pathological and healthy data samples, demonstrate that a network trained for self-supervised multimodal reconstruction of angiography from retinography achieves unsupervised recognition of important retinal structures. These results indicate that the proposed self-supervised task provides relevant cues for image analysis tasks in the same domain.
This work is supported by Instituto de Salud Carlos III, Government of Spain, and the European Regional Development Fund (ERDF) of the European Union (EU) through the DTS18/00136 research project, and by Ministerio de Economía, Industria y Competitividad, Government of Spain, through the DPI2015-69948-R research project. The authors of this work also receive financial support from the ERDF and Xunta de Galicia through Grupo de Referencia Competitiva, Ref. ED431C 2016-047, and from the European Social Fund (ESF) of the EU and Xunta de Galicia through the predoctoral grant contract Ref. ED481A-2017/328. CITIC, Centro de Investigación de Galicia Ref. ED431G 2019/01, receives financial support from Consellería de Educación, Universidade e Formación Profesional, Xunta de Galicia, through the ERDF (80%) and Secretaría Xeral de Universidades (20%).