9 research outputs found

    A Weakly Supervised Deep Learning Approach for Detecting Malaria and Sickle Cells in Blood Films

    Machine vision analysis of blood films imaged under a brightfield microscope could provide scalable malaria diagnosis in resource-constrained endemic urban settings. The major bottleneck in analyzing blood films with deep learning vision techniques is the lack of object-level annotations of disease markers such as parasites or abnormal red blood cells. To overcome this challenge, this work proposes a novel weakly supervised deep learning approach that leverages weak labels readily available from routine clinical microscopy to diagnose malaria in thick blood film microscopy. The approach aggregates the convolutional features of the multiple objects present in one hundred high-resolution image fields. We show that this method not only achieves expert-level malaria diagnostic accuracy without any hard object-level labels but can also identify individual malaria parasites in digitized thick blood films, which is useful in assessing disease severity and response to treatment. We demonstrate another application scenario in which our approach detects sickle cells in thin blood films, and we discuss the wider applicability of the approach to the automated analysis of thick blood films for the diagnosis of other blood disorders.
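The feature-aggregation idea can be illustrated with a minimal multiple-instance-style sketch: per-object features are pooled within each field, field descriptors are combined into one film-level descriptor, and a single weak (film-level) label supervises the classifier. The pooling choice, the linear classifier, and all names here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def aggregate_field_features(object_features, pool="mean"):
    """Pool per-object CNN feature vectors within one image field
    into a single field-level descriptor."""
    stacked = np.stack(object_features)          # (n_objects, d)
    return stacked.mean(axis=0) if pool == "mean" else stacked.max(axis=0)

def film_level_score(field_descriptors, weights, bias=0.0):
    """Average pooled descriptors across many fields and apply a linear
    classifier to produce one film-level (weakly supervised) score."""
    film_descriptor = np.stack(field_descriptors).mean(axis=0)
    logit = film_descriptor @ weights + bias
    return 1.0 / (1.0 + np.exp(-logit))          # sigmoid probability

# Toy run: 3 fields, each containing a few 4-dimensional "object" features.
rng = np.random.default_rng(0)
fields = [[rng.normal(size=4) for _ in range(5)] for _ in range(3)]
pooled = [aggregate_field_features(f) for f in fields]
p = film_level_score(pooled, weights=np.ones(4))
```

Because supervision only touches the film-level score, no object-level labels are needed; object-level signal (e.g. individual parasites) can later be read off the per-object contributions.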

    RedTell: an AI tool for interpretable analysis of red blood cell morphology

    Introduction: Hematologists analyze microscopic images of red blood cells to study their morphology and functionality, detect disorders and search for drugs. However, accurate analysis of large numbers of red blood cells needs automated computational approaches that rely on annotated datasets, expensive computational resources, and computer science expertise. We introduce RedTell, an AI tool for the interpretable analysis of red blood cell morphology comprising four single-cell modules: segmentation, feature extraction, assistance in data annotation, and classification. Methods: Cell segmentation is performed by a trained Mask R-CNN that works robustly on a wide range of datasets and requires no or minimal fine-tuning. Over 130 features that are regularly used in research are extracted for every detected red blood cell. If required, users can train task-specific, highly accurate decision tree-based classifiers to categorize cells, requiring a minimal number of annotations and providing interpretable feature importance. Results: We demonstrate RedTell’s applicability and power in three case studies. In the first case study, we analyze how the extracted features differ between cells from patients suffering from different diseases; in the second, we use RedTell to analyze control samples and use the extracted features to classify cells into echinocytes, discocytes and stomatocytes; and in the last use case, we distinguish sickle cells in samples from sickle cell disease patients. Discussion: We believe that RedTell can accelerate and standardize red blood cell research and help gain new insights into the mechanisms, diagnosis, and treatment of red blood cell-associated disorders.
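The interpretable classification step can be sketched as follows: a decision tree is trained on a few hand-crafted per-cell features and its feature importances show which features drove the decision. The feature names, the synthetic labels and the tree settings are illustrative assumptions; the real tool extracts over 130 features per cell.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy stand-ins for extracted per-cell features (hypothetical names).
rng = np.random.default_rng(42)
n = 200
features = rng.normal(size=(n, 3))
# Synthetic rule: "sickle-like" cells are highly elongated (feature 2).
labels = (features[:, 2] > 0.5).astype(int)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(features, labels)

# Interpretable output: which features drove the classification.
importance = dict(zip(["area", "circularity", "elongation"],
                      clf.feature_importances_))
```

On this synthetic rule, the tree recovers the elongation threshold and assigns nearly all importance to that single feature, which is exactly the kind of interpretable signal a hematologist can sanity-check.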

    Uncertainty-Aware AI for ECG arrhythmia multi-label classification

    Machine Learning (ML) models can predict a variety of diseases, sometimes with performance superior to that of healthcare professionals. However, when implemented in clinical settings as decision support systems, their generalisation capabilities are often compromised, making healthcare professionals more likely to deliver erroneous diagnoses. This research focuses on uncertainty measures both as a means to abstain from classifying samples with high uncertainty and as a selection criterion for active learning strategies. For this purpose, four large public multi-label Electrocardiogram (ECG) databases were employed for the classification of cardiac arrhythmias. As uncertainty measures, single-distribution uncertainty and classical information-theoretic measures of entropy were tested and compared. Three Deep Learning models were developed: a single convolutional neural network and two multiple-model approaches using Monte-Carlo Dropout and Deep Ensemble techniques. When tested with samples from the same database used for training, all models achieved F1-scores above 95%. However, when tested on an external dataset, performance dropped to approximately 70%, indicating a probable dataset shift. The Deep Ensemble model obtained the highest F1-score on both test sets, with a maximum difference of 3% from the others. With the classification-with-rejection option, the rejection rate increased from 10% to between 30% and 50%, depending on the model and uncertainty measure, with the highest rejection rates obtained on external data. This shows that classifications on the external dataset carry higher uncertainty, a further indication of dataset shift. For the active learning approach, the 10% of samples with the highest uncertainty were used to retrain the models, and performance increased by almost 5%, suggesting uncertainty is a good selection criterion. Although challenges to the implementation of ML models remain, these preliminary studies show that uncertainty quantification is a valuable method for classification with a rejection option and for active learning under dataset shift conditions.
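The classification-with-rejection mechanism can be sketched with the classical entropy measure: a sample is classified only when the entropy of its predicted class distribution is below a threshold. The threshold value and function names are illustrative; for Monte-Carlo Dropout or Deep Ensemble, the input probabilities would be the mean softmax over stochastic passes or ensemble members.

```python
import numpy as np

def predictive_entropy(probs):
    """Shannon entropy of per-sample predicted class distributions.
    probs: (n_samples, n_classes), rows summing to 1."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def classify_with_rejection(probs, threshold):
    """Return the predicted class per sample, or -1 (abstain) when the
    predictive entropy exceeds the threshold."""
    preds = probs.argmax(axis=1)
    reject = predictive_entropy(probs) > threshold
    return np.where(reject, -1, preds)

# A confident prediction vs. a near-uniform (uncertain) one.
probs = np.array([[0.97, 0.02, 0.01],
                  [0.40, 0.35, 0.25]])
decisions = classify_with_rejection(probs, threshold=0.5)
```

Under dataset shift, external samples tend to produce flatter distributions and therefore higher entropy, which is why the rejection rate rises on external data.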

    Computer aided classification of histopathological damage in images of haematoxylin and eosin stained human skin

    EngD Thesis. Excised human skin can be used as a model to assess the potency, immunogenicity and contact sensitivity of potential therapeutics or cosmetics via the assessment of histological damage. The current method of assessing the damage uses traditional manual histological assessment, which is inherently subjective, time-consuming and prone to intra-observer variability. Computer-aided analysis has the potential to address the issues surrounding traditional histological techniques through the application of quantitative analysis. This thesis describes the development of a computer-aided process to assess the immune-mediated structural breakdown of human skin tissue. The research presented includes assessment and optimisation of image acquisition methodologies, development of an image processing and segmentation algorithm, identification and extraction of a novel set of descriptive image features, and evaluation of a selected subset of these features in a classification model. A new segmentation method is presented to identify epidermis tissue in skin with varying degrees of histopathological damage. Combining enhanced colour information with general image intensity information, the fully automated method segments the epidermis with a mean specificity of 97.7%, a mean sensitivity of 89.4% and a mean accuracy of 96.5%, and segments effectively across different severities of tissue damage. A set of 140 feature measurements capturing the tissue changes associated with different grades of histopathological skin damage was identified, and a wrapper algorithm was employed to select a subset of the extracted features, evaluating feature subsets based on their prediction error for an independent test set in a Naïve Bayes classifier. The final classification algorithm classified a set of 169 images with an accuracy of 94.1%; of these, 20 images formed an unseen validation set, for which the accuracy was 85.0%. The final classification method has accuracy comparable to the existing manual method, improved repeatability and reproducibility, and does not require an experienced histopathologist.
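The wrapper selection step can be sketched with scikit-learn's sequential feature selector wrapped around a Gaussian Naïve Bayes classifier: candidate subsets are scored by the classifier's own cross-validated performance rather than by any filter statistic. The synthetic data and the choice of forward selection are assumptions for illustration, not the thesis's exact protocol.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.naive_bayes import GaussianNB

# Toy feature matrix: only the first two of five features carry signal,
# mimicking a wrapper search over a larger descriptive feature set.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Wrapper selection: subsets are evaluated by the Naïve Bayes
# classifier's cross-validated prediction performance.
selector = SequentialFeatureSelector(GaussianNB(),
                                     n_features_to_select=2, cv=3)
selector.fit(X, y)
selected = np.flatnonzero(selector.get_support())
```

Because the classifier itself scores each candidate subset, wrapper methods pick features that work well together for that specific model, at the cost of many more model fits than filter methods.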

    Automated analysis of colorectal cancer

    Colorectal cancer (CRC) is the second largest cause of cancer deaths in the UK, with approximately 16,000 deaths per year. Over 41,000 people are diagnosed annually, and 43% of those will die within ten years of diagnosis. The treatment of CRC patients relies on pathological examination of the disease to identify visual features that predict growth, spread, and response to chemoradiotherapy. These prognostic features are identified manually and are subject to inter- and intra-scorer variability. This variability stems from the subjectivity of interpreting large images with highly varied appearances, as well as from the time-consuming and laborious methodology of visually inspecting cancer cells. The work in this thesis presents a systematic approach to developing a solution to this problem for one such prognostic indicator, the Tumour:Stroma Ratio (TSR). The steps taken are presented sequentially through the chapters, in the order the work was carried out. They specifically involve the acquisition and assessment of a dataset of 2.4 million expert-classified images of CRC, and multiple iterations of algorithm development to automate the process of generating TSRs for patient cases. The algorithm improvements draw on conclusions from observer studies conducted on a psychophysics experiment platform developed as part of this work, and further work identifies issues of image quality that affect automated solutions. The developed algorithm is then applied to a clinical trial dataset with survival data, meaning that the algorithm is validated against two separate pathologist-scored clinical trial datasets and can be tested for its suitability in generating independent prognostic markers.
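Once image patches have been classified, computing a TSR reduces to simple counting. The sketch below uses one common convention (the tumour fraction of the combined tumour and stroma area); the thesis's exact definition, class names and patch-weighting may differ.

```python
from collections import Counter

def tumour_stroma_ratio(patch_labels):
    """Tumour:stroma ratio from per-patch class labels, expressed as
    the tumour fraction of the combined tumour + stroma area.
    Non-tumour, non-stroma patches (e.g. background) are ignored."""
    counts = Counter(patch_labels)
    tumour, stroma = counts["tumour"], counts["stroma"]
    return tumour / (tumour + stroma)

# Toy case: 60 tumour patches, 40 stroma patches, 25 background patches.
labels = ["tumour"] * 60 + ["stroma"] * 40 + ["background"] * 25
tsr = tumour_stroma_ratio(labels)
```

The hard part automated in the thesis is, of course, the patch classification feeding this computation, not the ratio itself.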

    Methods and instrumentation for raman characterization of bladder cancer tumor

    High incidence and recurrence rates make bladder cancer the most common malignant tumor of the urinary system. Cystoscopy is the gold-standard test for diagnosis; nevertheless, small flat tumors may be missed, the procedure causes discomfort to patients, and high recurrence can result from urethral injuries. During cystoscopy, suspicious tumors are detected through white-light endoscopy, and resected tissue is further examined by histopathology. After resection, the pathologist provides information on the differentiation of the cells and the penetration depth of the tumor into the tissue, known as the grading and staging of the tumor, respectively. During cystoscopy, information on tumor grading and morphological depth characterization can assist on-site diagnosis and significantly reduce the amount of unnecessarily resected tissue. Recently, new developments in optical imaging and spectroscopic approaches have been demonstrated to improve on standard techniques by providing real-time detection of macroscopic and microscopic biomedical information. Different applications for detecting anomalies in tissues and cells based on chemical composition and structure at the microscopic level have been successfully tested; nevertheless, the demands of clinical translation still need to be met. This doctoral thesis presents the investigations, clinical studies and approaches applied to addressing the main open research questions in applying Raman spectroscopy as a diagnostic tool for bladder cancer tumor grading, and in Raman spectroscopy-based oncological clinical studies more generally.

    Multiclass deep active learning for detecting red blood cell subtypes in brightfield microscopy.

    The recent success of deep learning approaches relies partly on large amounts of well-annotated training data. For natural images, object annotation is easy and cheap; for biomedical images, however, annotation crucially depends on the availability of a trained expert, whose time is typically expensive and scarce. To ensure efficient annotation, only the most relevant objects should be presented to the expert. Currently, no approach exists for selecting those objects in a multiclass detection problem. Here, we present an active learning framework that identifies the most relevant samples from a large set of unannotated data for further expert annotation. Applied to brightfield images of red blood cells with seven subtypes, we train a Faster R-CNN for single-cell identification and classification, calculate a novel confidence score using dropout variational inference, and select relevant images for annotation based on (i) the confidence of the single-cell detection and (ii) the rareness of the classes contained in the image. We show that our approach leads to a drastic increase in prediction accuracy with only a few annotated images. Our approach improves the classification of red blood cell subtypes and speeds up annotation. This important step in diagnosing blood diseases will profit from our framework, as will many other clinical challenges that suffer from a lack of annotated training data.
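The two selection criteria can be sketched as a simple priority score combining low detection confidence with class rareness. The confidence here is derived from repeated stochastic (dropout) forward passes; the additive weighting and all names are illustrative assumptions, not the paper's actual scoring function.

```python
import numpy as np

def mc_dropout_confidence(mc_probs):
    """Confidence from T stochastic forward passes with dropout kept on:
    the max of the mean softmax over passes. mc_probs: (T, n_classes)."""
    return np.max(mc_probs.mean(axis=0))

def image_priority(confidences, class_counts, cells_per_image):
    """Rank images for annotation: low detection confidence and rare
    classes both raise priority (illustrative additive weighting)."""
    rarity = np.array([np.mean([1.0 / class_counts[c] for c in cells])
                       for cells in cells_per_image])
    uncertainty = 1.0 - np.asarray(confidences)
    return np.argsort(-(uncertainty + rarity))  # highest priority first

# Toy pool: image 1 is both less confident and contains rarer subtypes.
class_counts = {"discocyte": 900, "echinocyte": 80, "sickle": 20}
confidences = [0.95, 0.60]                      # per-image mean confidence
cells = [["discocyte", "discocyte"], ["sickle", "echinocyte"]]
order = image_priority(confidences, class_counts, cells)
```

Images at the front of `order` would be shown to the expert first, so annotation effort concentrates on uncertain detections and underrepresented cell subtypes.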