111 research outputs found

    Quantifying Health Inequalities Induced by Data and AI Models

    AI technologies are increasingly being tested and applied in critical environments, including healthcare. Without an effective way to detect and mitigate AI-induced inequalities, AI might do more harm than good, potentially widening underlying inequalities. This paper proposes a generic allocation-deterioration framework for detecting and quantifying AI-induced inequality. Specifically, AI-induced inequalities are quantified as the area between two allocation-deterioration curves. To assess the framework's performance, experiments were conducted on ten synthetic datasets (N>33,000) generated from HiRID, a real-world Intensive Care Unit (ICU) dataset, showing its ability to accurately detect inequality and quantify it proportionally to controlled inequalities. Extensive analyses were carried out to quantify health inequalities (a) embedded in two real-world ICU datasets; (b) induced by AI models trained for two resource allocation scenarios. Results showed that, compared to men, women had up to 33% poorer deterioration in markers of prognosis when admitted to HiRID ICUs. All four AI models assessed were shown to induce significant inequalities (2.45% to 43.2%) for non-White compared to White patients. The models exacerbated data-embedded inequalities significantly in 3 out of 8 assessments, one of which was >9 times worse.
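The idea of measuring inequality as the area between two curves can be illustrated with a minimal sketch. It assumes both curves are sampled on the same grid and uses the trapezoidal rule; the data values, the function names, and the choice to normalise by the area under the reference curve are all illustrative assumptions, not the paper's definitions.

```python
def trapezoid_area(xs, ys):
    """Area under a piecewise-linear curve, computed with the trapezoidal rule."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
        for i in range(len(xs) - 1)
    )


def area_between_curves(xs, ys_a, ys_b):
    """Signed area between curve A and curve B sampled on the same x grid."""
    diffs = [a - b for a, b in zip(ys_a, ys_b)]
    return trapezoid_area(xs, diffs)


# Hypothetical data: x is an allocation level, y is a deterioration measure.
allocation = [0.0, 0.25, 0.5, 0.75, 1.0]
group_a = [0.10, 0.20, 0.35, 0.55, 0.80]  # made-up curve for one subgroup
group_b = [0.10, 0.15, 0.25, 0.40, 0.60]  # made-up curve for the reference subgroup

gap = area_between_curves(allocation, group_a, group_b)
# Assumed normalisation: express the gap relative to the reference curve's area.
relative_gap = gap / trapezoid_area(allocation, group_b)
print(f"absolute gap: {gap:.4f}, relative gap: {relative_gap:.2%}")
```

A signed area keeps the direction of the disparity; taking absolute differences instead would measure its magnitude regardless of which group is worse off.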

    Rare Disease Identification from Clinical Notes with Ontologies and Weak Supervision

    The identification of rare diseases from clinical notes with Natural Language Processing (NLP) is challenging because few cases are available for machine learning and data annotation by clinical experts is required. We propose a method using ontologies and weak supervision. The approach includes two steps: (i) Text-to-UMLS, linking text mentions to concepts in the Unified Medical Language System (UMLS), with a named entity linking tool (e.g. SemEHR) and weak supervision based on customised rules and contextual representations from Bidirectional Encoder Representations from Transformers (BERT), and (ii) UMLS-to-ORDO, matching UMLS concepts to rare diseases in the Orphanet Rare Disease Ontology (ORDO). Using MIMIC-III US intensive care discharge summaries as a case study, we show that the Text-to-UMLS process can be greatly improved with weak supervision, without any annotated data from domain experts. Our analysis shows that the overall pipeline processing discharge summaries can surface rare disease cases, which are mostly uncaptured in manual ICD codes of the hospital admissions. Comment: 5 pages, 3 figures, accepted for IEEE EMBC 202

    Artificial intelligence models for predicting cardiovascular diseases in people with type 2 diabetes: A systematic review

    BACKGROUND: People with type 2 diabetes have a higher risk of cardiovascular disease morbidity and mortality. We aim to distil the evidence, summarize the developments, and identify the gaps in relevant research from the last ten years on predicting cardiovascular disease in people with type 2 diabetes using AI techniques. METHODS: A systematic search was carried out for literature published between 1st January 2010 and 30th May 2021 in five medical and scientific databases: Medline, EMBASE, Global Health (CABI), IEEE Xplore and Web of Science Core Collection. All English language studies describing AI models for predicting cardiovascular diseases in adults with type 2 diabetes were included. The retrieved studies were screened and the data from included studies were extracted by two reviewers. The survey and synthesis of extracted data were conducted based on predefined research questions. The IJMEDI checklist was used for quality assessment. RESULTS: From 176 articles identified by the search, 5 studies with sample sizes ranging from 560 to 203,517 met our inclusion criteria. The models predicted the risk of multiple cardiovascular diseases over 5 or 10 years. Ensemble learning, particularly random forest, was the most used algorithm in these models and consistently provided competitive performance. Commonly used features included age, body mass index, blood pressure measurements, and cholesterol measurements. Only one study carried out external validation. The area under the receiver operating characteristic curve (AUROC) for derivation cohorts varied from 0.69 to 0.77. AI models achieved better performance than conventional models in some specific scenarios. CONCLUSIONS: AI technologies seem to show promising performance (AUROC in external validation: 0.75 compared to 0.69 from conventional risk scores) for cardiovascular disease prediction in people with type 2 diabetes. However, only one of the reviewed models was externally validated. Quality of reporting was low in general, and all models lack reproducibility and reusability.
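For readers unfamiliar with the AUROC figures quoted in this abstract, the metric can be computed directly as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below is a plain pairwise implementation on made-up labels and scores; it is not taken from any of the reviewed studies.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical risk scores for four patients (1 = event occurred, 0 = no event).
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.4, 0.3]
print(auroc(example_labels, example_scores))
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; rank-based implementations are preferable for large cohorts.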

    The psycho-ENV corpus: Research articles annotated for knowledge discovery on correlating mental diseases and environmental factors

    While the published scientific literature is used in biomedical contexts such as building gene networks for disease gene discovery, it seems to be an undervalued resource with respect to mental illnesses. It has rarely been explored for the purpose of gaining psychopathology insights. This limits our ability to better understand the underlying mechanisms of mental disorders. In this paper we describe the psycho-env corpus, which aims at annotating published studies to facilitate knowledge discovery on pathologies of mental diseases. Specifically, this corpus focuses on the correlations between mental diseases and environmental factors. We report preliminary work on psycho-env: the annotation of 20 articles about two mental illnesses (bipolar disorder and depression) and two particular environmental factors (light and sunlight). The corpus is available at https://github.com/KHP-Informatics/psycho-env

    Ontology-driven and weakly supervised rare disease identification from clinical notes

    BACKGROUND: Computational text phenotyping is the practice of identifying patients with certain disorders and traits from clinical notes. Rare diseases are challenging to identify because few cases are available for machine learning and data annotation by domain experts is required. METHODS: We propose a method using ontologies and weak supervision, with recent pre-trained contextual representations from Bi-directional Transformers (e.g. BERT). The ontology-driven framework includes two steps: (i) Text-to-UMLS, extracting phenotypes by contextually linking mentions to concepts in the Unified Medical Language System (UMLS), with a Named Entity Recognition and Linking (NER+L) tool, SemEHR, and weak supervision with customised rules and contextual mention representation; (ii) UMLS-to-ORDO, matching UMLS concepts to rare diseases in the Orphanet Rare Disease Ontology (ORDO). The weakly supervised approach is proposed to learn a phenotype confirmation model that improves Text-to-UMLS linking, without annotated data from domain experts. We evaluated the approach on three annotated clinical datasets from two institutions in the US and the UK: MIMIC-III discharge summaries, MIMIC-III radiology reports, and NHS Tayside brain imaging reports. RESULTS: The improvements in precision were pronounced (by over 30% to 50% absolute score for Text-to-UMLS linking), with almost no loss of recall compared to the existing NER+L tool, SemEHR. Results on radiology reports from MIMIC-III and NHS Tayside were consistent with the discharge summaries. The overall pipeline processing clinical notes can extract rare disease cases, mostly uncaptured in structured data (manually assigned ICD codes). CONCLUSION: The study provides empirical evidence for the task by applying a weakly supervised NLP pipeline to clinical notes. The proposed weakly supervised deep learning approach requires no human annotation except for validation and testing, by leveraging ontologies, NER+L tools, and contextual representations. The study also demonstrates that Natural Language Processing (NLP) can complement traditional ICD-based approaches to better estimate rare diseases in clinical notes. We discuss the usefulness and limitations of the weak supervision approach and propose directions for future studies.
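The two-step structure described in this abstract (Text-to-UMLS, then UMLS-to-ORDO) can be sketched as two chained lookups. Everything below is a toy illustration: the lookup tables, the identifiers, and the length-based rule are hypothetical placeholders, and the rule is applied directly as a filter, whereas the abstract describes using weak supervision to train a phenotype confirmation model.

```python
# Toy lookup tables. Every identifier is a made-up placeholder,
# not a real UMLS concept identifier or ORDO identifier.
MENTION_TO_UMLS = {
    "example syndrome": "CUI_0001",
    "es": "CUI_0001",          # ambiguous abbreviation of the same concept
    "common cold": "CUI_0002",
}
UMLS_TO_ORDO = {"CUI_0001": "ORDO_0001"}  # only the rare disease has a mapping


def passes_weak_rule(mention):
    """Hypothetical rule: very short mentions are treated as unconfirmed."""
    return len(mention) > 3


def text_to_umls(mentions):
    """Step (i): link mentions to concepts, keeping only rule-confirmed links."""
    linked = []
    for mention in mentions:
        concept = MENTION_TO_UMLS.get(mention.lower())
        if concept is not None and passes_weak_rule(mention):
            linked.append(concept)
    return linked


def umls_to_ordo(concepts):
    """Step (ii): keep concepts that map to a rare disease entry."""
    return sorted({UMLS_TO_ORDO[c] for c in concepts if c in UMLS_TO_ORDO})


def identify_rare_diseases(mentions):
    """Chain both steps over the mentions found in one note."""
    return umls_to_ordo(text_to_umls(mentions))


print(identify_rare_diseases(["Example syndrome", "ES", "common cold"]))
```

Keeping the two steps as separate functions mirrors the abstract's framing: linking quality can be improved in step (i) without touching the ontology matching in step (ii).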

    Knowledge Driven Phenotyping

    Extracting patient phenotypes from routinely collected health data (such as Electronic Health Records) requires clinical researchers to translate clinically sound phenotype definitions into queries or computations executable on the underlying data sources. This requires significant knowledge and skills to deal with heterogeneous and often imperfect data. Translations are time-consuming, error-prone and, most importantly, hard to share and reproduce across different settings. This paper proposes a knowledge-driven framework that (1) decouples the specification of phenotype semantics from underlying data sources; (2) can automatically populate and conduct phenotype computations on heterogeneous data spaces. We report preliminary results of deploying this framework on five Scottish health datasets.

    Towards automated dermatology triage: deep learning and knowledge-driven approaches

    Background: The current triage process in the National Health Service (NHS) requires secondary care clinicians to manually read every General Practitioner's (GP) referral letter, which makes the process time-consuming with associated high costs. Artificial Intelligence (AI) algorithms can be adopted to accelerate this process and reduce the required resources. Objectives: To design AI models that can automatically stratify GP referrals to routine and non-routine categories, and to evaluate different AI algorithms against the current manual triage process. Methods: We developed and evaluated multiple AI models to triage dermatology referrals into binary outcomes, i.e., routine or non-routine. The models ranged from a totally data-driven (deep learning) approach to different levels of knowledge-enriched approaches: 1) a transfer learning approach using a pre-trained large language model; 2) a deep learning model using Long Short-Term Memory architecture, enriched with key concepts from referral guidelines; and 3) a knowledge-driven model utilising the semantics of key concepts from clinical guidelines and customised clinicians' dictionaries. Random oversampling and data augmentation were used for dealing with highly imbalanced triage classes. All referrals were individually triaged by two dermatologists and then compared against the results generated from AI-assisted triage models. Performances were evaluated using Precision-Recall Area Under Curve (PR-AUC) and Receiver Operating Characteristic Area Under Curve (ROC-AUC). Results: 268 GP referrals to adult dermatology services were included. The knowledge-driven approach achieved the best performance (micro average PR-AUC of 0.907 ± 0.006, ROC-AUC of 0.720 ± 0.010) compared to the baseline end-to-end deep learning model (micro average PR-AUC of 0.823 ± 0.038, ROC-AUC of 0.616 ± 0.096) and the Long Short-Term Memory model (0.867 ± 0.013, 0.600 ± 0.071). Imbalance preprocessing methods improved the model performance in some cases but not to a significant level. Combining all types of domain knowledge in AI models outperformed any subsets of these knowledge inputs. Conclusions: The knowledge-enhanced AI approach showed promising results in achieving triage outcomes comparable to manual outcomes despite the limited data input from the referrals. AI-assisted triage has the potential to make the triaging process less time-consuming and more cost-effective, whilst retaining accuracy.
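Random oversampling, one of the imbalance-handling methods named in this abstract, duplicates randomly chosen minority-class examples until the classes are the same size. The sketch below is a generic standard-library version with made-up referral texts; the function name and data are hypothetical, and it does not reproduce the study's preprocessing.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate random minority-class samples until all classes match the largest."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw with replacement from the original group to fill the shortfall.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical, highly imbalanced toy data: five routine referrals, one non-routine.
texts = ["r1", "r2", "r3", "r4", "r5", "n1"]
classes = ["routine"] * 5 + ["non-routine"]
balanced_texts, balanced_classes = random_oversample(texts, classes)
print(balanced_classes.count("routine"), balanced_classes.count("non-routine"))
```

Oversampling should be applied to the training split only; duplicating examples before splitting would leak copies of the same referral into the evaluation data.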

    Benchmarking and analyzing in-context learning, fine-tuning and supervised learning for biomedical knowledge curation: a focused study on chemical entities of biological interest

    Automated knowledge curation for biomedical ontologies is key to ensuring that they remain comprehensive, high-quality and up-to-date. In the era of foundational language models, this study compares and analyzes three NLP paradigms for curation tasks: in-context learning (ICL), fine-tuning (FT), and supervised learning (ML). Using the Chemical Entities of Biological Interest (ChEBI) database as a model ontology, three curation tasks were devised. For ICL, three prompting strategies were employed with GPT-4, GPT-3.5 and BioGPT. PubmedBERT was chosen for the FT paradigm. For ML, six embedding models were utilized for training Random Forest and Long Short-Term Memory models. Five setups were designed to assess ML and FT model performance across different data availability scenarios. Datasets for curation tasks included: task 1 (620,386), task 2 (611,430), and task 3 (617,381), maintaining a 50:50 positive versus negative ratio. For ICL models, GPT-4 achieved best accuracy scores of 0.916, 0.766 and 0.874 for tasks 1-3 respectively. In a direct comparison, ML (trained on ~260,000 triples) outperformed ICL in accuracy across all tasks (accuracy differences: +.11, +.22 and +.17). Fine-tuned PubmedBERT performed similarly to leading ML models in tasks 1 and 2 (F1 differences: -.014 and +.002), but worse in task 3 (-.048). Simulations revealed performance declines in both ML and FT models with smaller and more imbalanced training data, whereas ICL (particularly GPT-4) excelled in tasks 1 and 3: with fewer than 6,000 triples, GPT-4 surpassed ML/FT in these tasks. ICL underperformed ML/FT in task 2. ICL-augmented foundation models can be good assistants for knowledge curation with correct prompting; however, they do not make the ML and FT paradigms obsolete. The latter two require task-specific data to beat ICL. In such cases, ML relies on small pretrained embeddings, minimizing computational demands.