A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed. Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 2017.
Deep Learning in Cardiology
The medical field is creating large amounts of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient at solving complicated medical tasks or at creating insights from big data. Deep learning has emerged as a more accurate and effective technology for a wide range of medical problems such as diagnosis, prediction and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology, which also apply to medicine in general, while proposing certain directions as the most viable for clinical use. Comment: 27 pages, 2 figures, 10 tables
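To make the layered, non-linear transformation idea concrete, here is a minimal sketch in PyTorch (our illustration, not code from the review); the layer sizes and the two-class head are arbitrary assumptions.

```python
# Minimal sketch: a stack of non-linear layers, where each layer
# re-represents its input at a higher level of abstraction.
# Assumes PyTorch; all dimensions are illustrative only.
import torch
import torch.nn as nn

representation_net = nn.Sequential(
    nn.Linear(128, 64),  # first transformation of the raw input features
    nn.ReLU(),           # non-linearity that enables hierarchical features
    nn.Linear(64, 32),   # deeper layer: a more abstract representation
    nn.ReLU(),
    nn.Linear(32, 2),    # task head, e.g. diagnosis vs. no diagnosis
)

x = torch.randn(8, 128)          # a batch of 8 hypothetical feature vectors
logits = representation_net(x)   # each layer transforms the data non-linearly
print(logits.shape)              # torch.Size([8, 2])
```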
Multi-objective optimization determines when, which and how to fuse deep networks: an application to predict COVID-19 outcomes
The COVID-19 pandemic has caused millions of cases and deaths, and the AI-related scientific community, after working on detecting COVID-19 signs in medical images, is now directing its efforts towards methods that can predict the progression of the disease. This task is multimodal by its very nature and, recently, baseline results on the publicly available AIforCOVID dataset have shown that chest X-ray scans and clinical information are useful for identifying patients at risk of severe outcomes. While deep learning has shown superior performance in several medical fields, in most cases it considers unimodal data only. In this respect, when, which and how to fuse the different modalities is an open challenge in multimodal deep learning. To address these three questions, here we present a novel approach that optimizes the setup of a multimodal end-to-end model. It exploits Pareto multi-objective optimization, working with a performance metric and the diversity score of multiple candidate unimodal neural networks to be fused. We test our method on the AIforCOVID dataset, attaining state-of-the-art results: we not only outperform the baseline but are also robust under external validation. Moreover, exploiting XAI algorithms, we identify a hierarchy among the modalities and extract the features' intra-modality importance, strengthening trust in the predictions made by the model.
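As a hedged illustration of the selection step (not the authors' implementation), the sketch below computes the Pareto front over the two objectives named in the abstract, a performance metric and a diversity score, for hypothetical candidate pairs of unimodal networks; all names and scores are made up.

```python
# Pareto-front selection over two objectives to be maximized:
# a candidate survives if no other candidate is at least as good
# on both objectives and strictly better on one.
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (name, performance, diversity)

def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates not dominated on (performance, diversity)."""
    front = []
    for name, perf, div in candidates:
        dominated = any(
            (p >= perf and d >= div) and (p > perf or d > div)
            for _, p, d in candidates
        )
        if not dominated:
            front.append((name, perf, div))
    return front

# Hypothetical fusion setups of unimodal networks (X-ray + clinical data).
candidates = [
    ("cnn_xray + mlp_clinical",    0.81, 0.30),
    ("resnet_xray + mlp_clinical", 0.84, 0.22),
    ("cnn_xray + gbm_clinical",    0.79, 0.41),
    ("vgg_xray + mlp_clinical",    0.80, 0.20),  # dominated, gets dropped
]
print(pareto_front(candidates))  # the non-dominated fusion setups
```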
Medical image retrieval for augmenting diagnostic radiology
Even though the use of medical imaging to diagnose patients is ubiquitous in clinical settings, interpreting those images remains challenging for radiologists. Many factors make this interpretation task difficult, one of which is that medical images sometimes present subtle clues that are nonetheless crucial for diagnosis. Worse, similar clues can indicate multiple diseases, making it hard to arrive at a definitive diagnosis. To help radiologists interpret medical images quickly and accurately, there is a need for a tool that can augment their diagnostic procedures and increase efficiency in their daily workflow. A general-purpose medical image retrieval system can be such a tool, as it allows them to search for and retrieve similar, already diagnosed cases and make comparative analyses that complement their diagnostic decisions. In this thesis, we contribute to developing such a system by proposing approaches to be integrated as modules of a single system, enabling it to handle the various information needs of radiologists and thus augment their diagnostic processes during the interpretation of medical images.
We have mainly studied the following retrieval approaches to handle radiologists' different information needs: i) Retrieval Based on Contents, ii) Retrieval Based on Contents, Patients' Demographics, and Disease Predictions, and iii) Retrieval Based on Contents and Radiologists' Text Descriptions. For the first study, we aimed to find an effective feature representation method to distinguish medical images considering their semantics and modalities. To do that, we experimented with different representation techniques based on handcrafted methods (mainly texture features) and deep learning (deep features). Based on the experimental results, we propose an effective feature representation approach and deep learning architectures for learning and extracting medical image contents. For the second study, we present a multi-faceted method that complements image contents with patients' demographics and deep learning-based disease predictions, making it able to identify similar cases accurately within the clinical context the radiologists seek.
For the last study, we propose a guided search method that integrates an image with a radiologist's text description to guide the retrieval process. This method ensures that the retrieved images are suitable for the comparative analysis needed to confirm or rule out initial diagnoses (the differential diagnosis procedure). Furthermore, our method is based on a deep metric learning technique and outperforms traditional content-based approaches that rely only on image features and thus sometimes retrieve irrelevant images.
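As an illustrative sketch of the content-based formulation (our assumption of the standard setup, not the thesis code), retrieval can be cast as nearest-neighbour search over deep feature embeddings of the indexed cases; the embeddings below are random stand-ins for CNN features.

```python
# Content-based retrieval as nearest-neighbour search:
# rank indexed cases by cosine similarity to the query embedding.
import numpy as np

rng = np.random.default_rng(0)
index_embeddings = rng.normal(size=(1000, 512))  # 1000 diagnosed cases
query_embedding = rng.normal(size=(512,))        # features of the new image

# Cosine similarity between the query and every indexed case.
norms = np.linalg.norm(index_embeddings, axis=1) * np.linalg.norm(query_embedding)
scores = index_embeddings @ query_embedding / norms

top_k = np.argsort(scores)[::-1][:5]  # the 5 most similar prior cases
print(top_k, scores[top_k])
```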
EGOFALLS: A visual-audio dataset and benchmark for fall detection using egocentric cameras
Falls are a significant and often fatal hazard for vulnerable populations such as the elderly. Previous works have addressed fall detection by relying on data captured by a single sensor, such as images or accelerometers. In this work, we rely on multimodal descriptors extracted from videos captured by egocentric cameras. Our proposed method includes a late decision fusion layer that builds on top of the extracted descriptors. Furthermore, we collect a new dataset on which we assess our proposed approach; we believe it is the first public dataset of its kind. The dataset comprises 10,948 video samples recorded by 14 subjects. We conducted ablation experiments to assess the performance of individual feature extractors, the fusion of visual information, and the fusion of both visual and audio information. Moreover, we experimented with internal and external cross-validation. Our results demonstrate that fusing audio and visual information through late decision fusion improves detection performance, making our approach a promising tool for fall prevention and mitigation.
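A minimal sketch of the late decision fusion idea (our illustration, not the EGOFALLS code; the per-modality probabilities and the threshold are hypothetical):

```python
# Late decision fusion: each modality's classifier produces its own
# fall probability, and the decisions are fused at the score level.
def late_fusion(probabilities: list, threshold: float = 0.5) -> bool:
    """Fuse per-modality fall probabilities by simple averaging."""
    fused = sum(probabilities) / len(probabilities)
    return fused >= threshold

p_visual = 0.72  # hypothetical output of the video-based classifier
p_audio = 0.41   # hypothetical output of the audio-based classifier
print(late_fusion([p_visual, p_audio]))  # True: fused score 0.565
```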
Multi-modal classifier fusion with feature cooperation for glaucoma diagnosis
Background: Glaucoma is a major public health problem that can lead to optic nerve lesions, requiring systematic screening of the population over 45 years of age. The diagnosis and classification of this disease have developed markedly in recent years, particularly in the machine learning domain. Multimodal data have been shown to be a significant aid to machine learning, especially through their contribution to improving data-driven decision-making.
Method: Solving classification problems with combinations of classifiers makes it possible to increase robustness as well as classification reliability by exploiting the complementarity that may exist between the classifiers. Complementarity is considered a key property of multimodality. Convolutional Neural Networks (CNNs) work very well in pattern recognition and have been shown to exhibit superior performance, especially in image classification, as they can learn useful features from raw data by themselves. This article proposes a multimodal classification approach based on deep Convolutional Neural Network and Support Vector Machine (SVM) classifiers, using multimodal data and multimodal features for glaucoma diagnosis from retinal fundus images of the RIM-ONE dataset. We make use of handcrafted feature descriptors such as the Gray Level Co-Occurrence Matrix, Central Moments and Hu Moments to cooperate with features automatically generated by the CNN in order to properly detect the optic nerve and consequently obtain a better classification rate, allowing a more reliable diagnosis of glaucoma.
Results: The experimental results confirm that combining classifiers with the BWWV technique is better than learning the classifiers separately. The proposed method provides a computerized diagnosis system for glaucoma with impressive results compared to the main related studies, encouraging further work along this research path.
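A hedged sketch of the feature-cooperation idea, assuming scikit-image for the GLCM and Hu-moment descriptors and scikit-learn for the SVM; the CNN features, images and labels below are synthetic placeholders, not the RIM-ONE pipeline.

```python
# Handcrafted texture/shape descriptors concatenated with CNN features,
# classified by an SVM. All data here is synthetic and illustrative.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from skimage.measure import moments_central, moments_normalized, moments_hu
from sklearn.svm import SVC

def handcrafted_features(image: np.ndarray) -> np.ndarray:
    """GLCM texture statistics plus Hu moments for one grayscale image."""
    glcm = graycomatrix(image, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    texture = [graycoprops(glcm, prop)[0, 0]
               for prop in ("contrast", "homogeneity", "energy")]
    hu = moments_hu(moments_normalized(moments_central(image.astype(float))))
    return np.concatenate([texture, hu])

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(20, 64, 64), dtype=np.uint8)  # fake fundus crops
cnn_features = rng.normal(size=(20, 128))  # stand-in for CNN activations
labels = rng.integers(0, 2, size=20)       # glaucoma vs. healthy (synthetic)

# Fuse handcrafted and deep features into one vector per image.
X = np.hstack([np.stack([handcrafted_features(im) for im in images]),
               cnn_features])
clf = SVC(kernel="rbf").fit(X, labels)     # SVM over the fused features
print(clf.score(X, labels))
```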