7 research outputs found
A Study of Age and Sex Bias in Multiple Instance Learning based Classification of Acute Myeloid Leukemia Subtypes
Accurate classification of Acute Myeloid Leukemia (AML) subtypes is crucial
for clinical decision-making and patient care. In this study, we investigate
the potential presence of age and sex bias in AML subtype classification using
Multiple Instance Learning (MIL) architectures. To that end, we train multiple
MIL models using different levels of sex imbalance in the training set and
excluding certain age groups. To assess the sex bias, we evaluate the
performance of the models on male and female test sets. For age bias, models
are tested against underrepresented age groups in the training data. We find a
significant effect of sex and age bias on the performance of the model for AML
subtype classification. Specifically, we observe that females are more likely
to be affected by a sex-imbalanced dataset, and that certain age groups, such as
patients aged 72 to 86 years with the RUNX1::RUNX1T1 genetic subtype,
are significantly affected by an age bias present in the training data.
Ensuring inclusivity in the training data is thus essential for generating
reliable and equitable outcomes in AML genetic subtype classification,
ultimately benefiting diverse patient populations.
Comment: Accepted for publication at the workshop on Fairness of AI in Medical
Imaging at the International Conference on Medical Image Computing and Computer
Assisted Intervention (MICCAI 2023).
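The abstract above describes training on sets with different levels of sex imbalance. A minimal sketch of how such a training set could be drawn is shown here; the record layout (a `"sex"` key with values `"F"` and `"M"`) and the function name `subsample_by_sex` are assumptions made for illustration and do not come from the study.

```python
import random


def subsample_by_sex(patients, female_fraction, n_total, seed=0):
    """Draw n_total patients with a fixed share of female patients.

    `patients` is a list of dicts with a "sex" key ("F" or "M"); this
    record layout is hypothetical and only serves the illustration.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    females = [p for p in patients if p["sex"] == "F"]
    males = [p for p in patients if p["sex"] == "M"]

    n_female = round(n_total * female_fraction)
    n_male = n_total - n_female
    if n_female > len(females) or n_male > len(males):
        raise ValueError("not enough patients for the requested ratio")

    # Sample without replacement inside each group, then combine.
    return rng.sample(females, n_female) + rng.sample(males, n_male)


if __name__ == "__main__":
    cohort = [{"id": i, "sex": "F" if i % 2 == 0 else "M"} for i in range(20)]
    subset = subsample_by_sex(cohort, female_fraction=0.25, n_total=8)
    print(sum(p["sex"] == "F" for p in subset), "female of", len(subset))
```

Evaluating the resulting models separately on male-only and female-only test sets, as the abstract describes, then makes any performance gap between the groups visible.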
RedTell: an AI tool for interpretable analysis of red blood cell morphology
Introduction: Hematologists analyze microscopic images of red blood cells to study their morphology and functionality, detect disorders and search for drugs. However, accurate analysis of a large number of red blood cells needs automated computational approaches that rely on annotated datasets, expensive computational resources, and computer science expertise. We introduce RedTell, an AI tool for the interpretable analysis of red blood cell morphology comprising four single-cell modules: segmentation, feature extraction, assistance in data annotation, and classification.
Methods: Cell segmentation is performed by a trained Mask R-CNN working robustly on a wide range of datasets requiring no or minimal fine-tuning. Over 130 features that are regularly used in research are extracted for every detected red blood cell. If required, users can train task-specific, highly accurate decision tree-based classifiers to categorize cells, requiring a minimal number of annotations and providing interpretable feature importance.
Results: We demonstrate RedTell’s applicability and power in three case studies. In the first, we analyze the differences in the extracted features between cells from patients suffering from different diseases. In the second, we use RedTell to analyze control samples and use the extracted features to classify cells into echinocytes, discocytes and stomatocytes. In the last, we distinguish sickle cells in sickle cell disease patients.
Discussion: We believe that RedTell can accelerate and standardize red blood cell research and help gain new insights into mechanisms, diagnosis, and treatment of red blood cell associated disorders.
A Continual Learning Approach for Cross-Domain White Blood Cell Classification
Accurate classification of white blood cells in peripheral blood is essential
for diagnosing hematological diseases. Due to constantly evolving clinical
settings, data sources, and disease classifications, it is necessary to update
machine learning classification models regularly for practical real-world use.
Such models significantly benefit from sequentially learning from incoming data
streams without forgetting previously acquired knowledge. However, models can
suffer from catastrophic forgetting, causing a drop in performance on previous
tasks when fine-tuned on new data. Here, we propose a rehearsal-based continual
learning approach for class incremental and domain incremental scenarios in
white blood cell classification. To choose representative samples from previous
tasks, we employ exemplar set selection based on the model's predictions. This
involves selecting the most confident samples and the most challenging samples
identified through uncertainty estimation of the model. We thoroughly evaluated
our proposed approach on three white blood cell classification datasets that
differ in color, resolution, and class composition, including scenarios where
new domains or new classes are introduced to the model with every task. We also
test a long class incremental experiment with both new domains and new classes.
Our results demonstrate that our approach outperforms established baselines in
continual learning, including existing iCaRL and EWC methods for classifying
white blood cells in cross-domain environments.
Comment: Accepted for publication at the workshop on Domain Adaptation and
Representation Transfer (DART) at the International Conference on Medical Image
Computing and Computer Assisted Intervention (MICCAI 2023).
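The exemplar selection described in the abstract above (keeping the most confident and the most challenging samples of a finished task) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: confidence is taken as the maximum class probability, uncertainty as predictive entropy, and the name `select_exemplars` is hypothetical. The abstract does not say which uncertainty estimate is used.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy of one probability vector (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_exemplars(prob_rows, n_confident, n_uncertain):
    """Pick sample indices to keep in the rehearsal memory.

    prob_rows: one list of class probabilities per sample.
    Returns (confident_indices, uncertain_indices), which never overlap.
    """
    indices = list(range(len(prob_rows)))

    # Most confident samples: largest maximum class probability.
    by_confidence = sorted(indices, key=lambda i: max(prob_rows[i]), reverse=True)
    confident = by_confidence[:n_confident]

    # Most challenging samples: largest entropy among the remaining ones.
    confident_set = set(confident)
    remaining = [i for i in indices if i not in confident_set]
    by_entropy = sorted(
        remaining, key=lambda i: predictive_entropy(prob_rows[i]), reverse=True
    )
    uncertain = by_entropy[:n_uncertain]
    return confident, uncertain


if __name__ == "__main__":
    probs = [
        [0.98, 0.01, 0.01],
        [0.34, 0.33, 0.33],
        [0.70, 0.20, 0.10],
        [0.90, 0.05, 0.05],
    ]
    print(select_exemplars(probs, n_confident=1, n_uncertain=1))
```

In a rehearsal setup, the selected samples would be stored and mixed into the training data of later tasks to counter catastrophic forgetting.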
Anomaly-aware multiple instance learning for rare anemia disorder classification
Deep learning-based classification of rare anemia disorders is challenged by
the lack of training data and instance-level annotations. Multiple Instance
Learning (MIL) has been shown to be an effective solution, yet it suffers from low
accuracy and limited explainability. Although the inclusion of attention
mechanisms has addressed these issues, their effectiveness highly depends on
the amount and diversity of cells in the training samples. Consequently, the
poor machine learning performance on rare anemia disorder classification from
blood samples remains unresolved. In this paper, we propose an interpretable
pooling method for MIL to address these limitations. By benefiting from
instance-level information of negative bags (i.e., homogeneous benign cells
from healthy individuals), our approach increases the contribution of anomalous
instances. We show that our strategy outperforms standard MIL classification
algorithms and provides a meaningful explanation behind its decisions.
Moreover, it can denote anomalous instances of rare blood diseases that are not
seen during the training phase.
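To make the idea of a pooling step that increases the contribution of anomalous instances more concrete, here is a small sketch. It is not the pooling method proposed in the paper, whose details are not given in the abstract. As a stand-in, each instance is scored by its Euclidean distance to the mean feature vector of instances from negative bags, and the scores are turned into pooling weights with a softmax. The function names and the `temperature` parameter are assumptions.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def anomaly_weighted_pool(bag, negative_instances, temperature=1.0):
    """Pool a bag so that instances far from the negative reference dominate.

    bag: feature vectors of the cells in one sample.
    negative_instances: feature vectors of cells from healthy individuals.
    Returns (pooled_vector, weights); the weights sum to 1.
    """
    center = mean_vector(negative_instances)

    # Anomaly score: distance to the centre of the benign cells.
    scores = [math.dist(x, center) for x in bag]

    # Numerically stable softmax over the anomaly scores.
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    pooled = [
        sum(w * x[d] for w, x in zip(weights, bag)) for d in range(len(bag[0]))
    ]
    return pooled, weights


if __name__ == "__main__":
    healthy = [[0.0, 0.0], [0.2, -0.2], [-0.2, 0.2]]
    sample = [[0.1, 0.0], [0.0, 0.1], [3.0, 4.0]]
    pooled, weights = anomaly_weighted_pool(sample, healthy)
    print(pooled, weights)
```

Because the weights are attached to individual instances, they can be inspected directly to see which cells drove the bag-level representation.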
Multiple Mice Tracking: Occlusions Disentanglement using a Gaussian Mixture Model
Mouse models play an important role in preclinical research and drug discovery for human diseases. Because mice are a social species that engages in a high degree of social interaction, they facilitate the study of diseases characterized by social alterations. Hence, robust animal tracking is of great importance for building tools capable of automatically analyzing the social behavioral interactions of multiple mice. However, the presence of occlusions is a major problem in multiple mice tracking. To deal with this problem, we present a tracking algorithm based on the Kalman filter and Gaussian Mixture Modeling. Specifically, Kalman tracking is used to track the mice, and when occlusions happen, we fit 2D Gaussian distributions to separate the mouse blobs. This helps prevent identity swaps between mice, which is essential for accurate behavior analysis. As the results of our experiments show, the proposed algorithm produces far fewer identity swaps than other state-of-the-art algorithms.
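The occlusion step in the abstract above, fitting 2D Gaussians to a merged blob, can be illustrated with a short expectation-maximization sketch. Several simplifications are assumed here and are not taken from the paper: exactly two mice, isotropic (circular) Gaussians instead of full covariance matrices, and initial means supplied by the caller (for example, positions predicted by a tracker). The function name `fit_two_gaussians` is hypothetical.

```python
import math


def fit_two_gaussians(pixels, init_means, n_iter=20):
    """Split foreground pixels of a merged blob into two components via EM.

    pixels: list of (x, y) coordinates of the merged blob.
    init_means: two (x, y) starting positions (assumed to be available).
    Returns (means, labels), where labels[i] is 0 or 1 for pixels[i].
    """
    means = [list(m) for m in init_means]
    variances = [4.0, 4.0]  # assumed starting variance per component
    weights = [0.5, 0.5]
    resp = []

    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel.
        resp = []
        for x, y in pixels:
            dens = []
            for k in range(2):
                d2 = (x - means[k][0]) ** 2 + (y - means[k][1]) ** 2
                norm = 2.0 * math.pi * variances[k]
                dens.append(weights[k] * math.exp(-0.5 * d2 / variances[k]) / norm)
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: update mean, variance and mixing weight of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component lost all support; keep old parameters
            means[k][0] = sum(r[k] * p[0] for r, p in zip(resp, pixels)) / nk
            means[k][1] = sum(r[k] * p[1] for r, p in zip(resp, pixels)) / nk
            d2_sum = sum(
                r[k] * ((p[0] - means[k][0]) ** 2 + (p[1] - means[k][1]) ** 2)
                for r, p in zip(resp, pixels)
            )
            variances[k] = max(d2_sum / (2.0 * nk), 1e-6)
            weights[k] = nk / len(pixels)

    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, labels


if __name__ == "__main__":
    blob_a = [(x, y) for x in range(0, 5) for y in range(0, 5)]
    blob_b = [(x, y) for x in range(8, 13) for y in range(0, 5)]
    means, labels = fit_two_gaussians(blob_a + blob_b, [(1, 2), (11, 2)])
    print(means)
```

Seeding each component with the position expected for one specific animal keeps component 0 and component 1 tied to the same identities before and after the occlusion, which is what prevents a swap.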
Explainable AI identifies diagnostic cells of genetic AML subtypes
Explainable AI is deemed essential for clinical applications as it allows rationalizing model predictions, helping to build trust between clinicians and automated decision support tools. We developed an inherently explainable AI model for the classification of acute myeloid leukemia subtypes from blood smears and found that high-attention cells identified by the model coincide with those labeled as diagnostically relevant by human experts. Based on over 80,000 single white blood cell images from digitized blood smears of 129 patients diagnosed with one of four WHO-defined genetic AML subtypes and 60 healthy controls, we trained SCEMILA, a single-cell based explainable multiple instance learning algorithm. SCEMILA could perfectly discriminate between AML patients and healthy controls and detected the APL subtype with an F1 score of 0.86±0.05 (mean±s.d., 5-fold cross-validation). Analyzing a novel multi-attention module, we confirmed that our algorithm focused with high concordance on the same AML-specific cells as human experts do. Applied to classify single cells, it is able to highlight subtype-specific cells and deconvolve the composition of a patient’s blood smear without the need for single-cell annotation of the training data. Our large AML genetic subtype dataset is publicly available, and an interactive online tool facilitates the exploration of data and predictions. SCEMILA enables a comparison of algorithmic and expert decision criteria and can present a detailed analysis of individual patient data, paving the way to deploy AI in routine diagnostics for identifying hematopoietic neoplasms.
Author summary
The analysis of blood and bone marrow smear microscopy by trained human experts remains an essential cornerstone of the diagnostic workup for severe blood diseases, like acute myeloid leukemia. While this step yields insight into a patient’s blood system composition, it is also tedious, time-consuming and not standardized.
Here, we present SCEMILA, an algorithm trained to distinguish blood smears from healthy stem cell donors and four different types of acute myeloid leukemia. Our algorithm is able to classify a patient’s blood sample based on roughly 400 single cell images, and can highlight cells most relevant to the algorithm. This allows us to cross-check the algorithm’s decision making with human expertise. We show that SCEMILA is able to identify relevant cells for acute myeloid leukemia, and therefore believe that it will contribute towards a future where machine learning algorithms and human experts collaborate to form a synergy for high-performance blood cancer diagnosis.
Donor age and red cell age contribute to the variance in Lorrca indices in healthy donors for next generation ektacytometry: a pilot study
The ability of red blood cells (RBCs) to transport gases, their lifespan as well as their rheological properties invariably depend on the deformability, hydration, and membrane stability of these cells, which can be measured by Laser optical rotational red cell analyser (Lorrca® Maxsis, RR Mechatronics). The osmoscan mode of Lorrca is currently used in diagnosis of rare anemias in clinical laboratories. However, a broad range of normal values for healthy subjects reduces the sensitivity of this method for diagnosis of mild disease phenotypes. In this pilot study, we explored the impact of age and gender of 45 healthy donors, as well as RBC age, on the Lorrca indices. Whereas gender did not affect the Lorrca indices in our study, the age of the donors had a profound effect on the O_hyper parameter. To study the impact of RBC age on the osmoscan parameters, we isolated low (L)-, medium (M)-, and high (H)-density fractions enriched with young, mature, and senescent RBCs, respectively, and evaluated the influence of RBC age-related properties, such as density, morphology, and redox state, on the osmoscan indices. As before, O_hyper was the most sensitive parameter, dropping markedly with an increase in RBC density and age. Senescence was associated with a decrease in deformability (EI_max) and tolerability to low and high osmolarities (Area). The L-fraction was enriched with reticulocytes and cells with high projected area and EMA staining, but also contained a small number of cells that were small in projected area and, most likely, terminally senescent. The L-fraction was on average slightly less deformable than mature cells. The cells from the L-fraction produced more oxidants and NO than all other fractions. However, RBCs from the L-fraction contained maximal levels of reduced thiols compared to other fractions. Our study suggests that reference values for O_hyper should be age-stratified and, most probably, corrected for the average RBC age.
A further multi-center study is required to validate these suggestions before implementing them in clinical practice.