Quantifying Health Inequalities Induced by Data and AI Models
AI technologies are increasingly being tested and applied in critical environments, including healthcare. Without an effective way to detect and mitigate AI-induced inequalities, AI might do more harm than good, potentially widening underlying inequalities. This paper proposes a generic allocation-deterioration framework for detecting and quantifying AI-induced inequality. Specifically, AI-induced inequalities are quantified as the area between two allocation-deterioration curves. To assess the framework's performance, experiments were conducted on ten synthetic datasets (N>33,000) generated from HiRID, a real-world Intensive Care Unit (ICU) dataset, showing its ability to detect and quantify inequality accurately and in proportion to the controlled inequalities. Extensive analyses were carried out to quantify health inequalities (a) embedded in two real-world ICU datasets; (b) induced by AI models trained for two resource allocation scenarios. Results showed that, compared to men, women had up to 33% poorer deterioration in markers of prognosis when admitted to HiRID ICUs. All four AI models assessed were shown to induce significant inequalities (2.45% to 43.2%) for non-White compared to White patients.
The models significantly exacerbated data-embedded inequalities in 3 out of 8 assessments, one of which was >9 times worse.
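The framework's core quantity, the area between two allocation-deterioration curves, can be sketched with a simple trapezoidal-rule computation. The curve values and group labels below are illustrative assumptions, not taken from the paper's datasets:

```python
# Sketch: inequality quantified as the area between two
# allocation-deterioration curves (illustrative values only).

def trapezoid_area(xs, ys):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
        for i in range(len(xs) - 1)
    )

def inequality_score(xs, curve_a, curve_b):
    """Signed area between two allocation-deterioration curves.

    A positive score means group A deteriorates more than group B
    at the same allocation levels.
    """
    return trapezoid_area(xs, curve_a) - trapezoid_area(xs, curve_b)

# Allocation fractions (x-axis) and mean deterioration (y-axis)
allocation = [0.0, 0.25, 0.5, 0.75, 1.0]
deterioration_group_a = [0.9, 0.7, 0.5, 0.3, 0.2]  # hypothetical
deterioration_group_b = [0.9, 0.6, 0.4, 0.2, 0.1]  # hypothetical

score = inequality_score(allocation, deterioration_group_a,
                         deterioration_group_b)
print(score)
```

A score of zero would indicate the two groups deteriorate identically across allocation levels; the sign identifies which group is disadvantaged.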
Rare Disease Identification from Clinical Notes with Ontologies and Weak Supervision
The identification of rare diseases from clinical notes with Natural Language
Processing (NLP) is challenging due to the few cases available for machine
learning and the need for data annotation from clinical experts. We propose a
method using ontologies and weak supervision. The approach includes two steps:
(i) Text-to-UMLS, linking text mentions to concepts in Unified Medical Language
System (UMLS), with a named entity linking tool (e.g. SemEHR) and weak
supervision based on customised rules and Bidirectional Encoder Representations
from Transformers (BERT) based contextual representations, and (ii)
UMLS-to-ORDO, matching UMLS concepts to rare diseases in Orphanet Rare Disease
Ontology (ORDO). Using MIMIC-III US intensive care discharge summaries as a
case study, we show that the Text-to-UMLS process can be greatly improved with
weak supervision, without any annotated data from domain experts. Our analysis
shows that the overall pipeline processing discharge summaries can surface rare
disease cases, which are mostly uncaptured in manual ICD codes of the hospital
admissions.
Comment: 5 pages, 3 figures, accepted for IEEE EMBC 202
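A minimal sketch of the two-step pipeline described above, with tiny hard-coded lookup tables standing in for SemEHR, the BERT-based weak supervision, and the real UMLS/ORDO mappings (the filtering rule and the tables are illustrative assumptions):

```python
# Step (i) Text-to-UMLS: link text mentions to UMLS concepts,
# then keep candidates that pass a simple rule standing in for
# the learned weak-supervision filter.
MENTION_TO_CUI = {
    "cystic fibrosis": "C0010674",    # hypothetical lookup table
    "rare lung disease": "C0024115",
}

def text_to_umls(note):
    """Return (mention, CUI) pairs found in the note that pass
    a toy mention-length rule."""
    links = []
    for mention, cui in MENTION_TO_CUI.items():
        if mention in note.lower() and len(mention) > 5:
            links.append((mention, cui))
    return links

# Step (ii) UMLS-to-ORDO: map confirmed UMLS concepts to
# Orphanet rare-disease identifiers.
CUI_TO_ORDO = {"C0010674": "ORPHA:586"}  # hypothetical mapping

def identify_rare_diseases(note):
    return [
        (mention, CUI_TO_ORDO[cui])
        for mention, cui in text_to_umls(note)
        if cui in CUI_TO_ORDO
    ]

print(identify_rare_diseases("Patient has cystic fibrosis."))
```

The real pipeline replaces the lookup tables with an NER+L tool and curated ontology mappings, but the two-stage shape is the same: mentions to concepts, then concepts to rare diseases.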
Artificial intelligence models for predicting cardiovascular diseases in people with type 2 diabetes: A systematic review
BACKGROUND: People with type 2 diabetes have a higher risk of cardiovascular disease morbidity and mortality. We aim to distil the evidence, summarize the developments, and identify the gaps in research from the last ten years on predicting cardiovascular disease in people with type 2 diabetes using AI techniques. METHODS: A systematic search was carried out for literature published between 1st January 2010 and 30th May 2021 in five medical and scientific databases: Medline, EMBASE, Global Health (CABI), IEEE Xplore and Web of Science Core Collection. All English-language studies describing AI models for predicting cardiovascular diseases in adults with type 2 diabetes were included. The retrieved studies were screened and the data from included studies were extracted by two reviewers. The survey and synthesis of extracted data were conducted based on predefined research questions. The IJMEDI checklist was used for quality assessment. RESULTS: From 176 articles identified by the search, 5 studies with sample sizes ranging from 560 to 203,517 met our inclusion criteria. The models predicted the risk of multiple cardiovascular diseases over 5 or 10 years. Ensemble learning, particularly random forest, was the most used algorithm in these models and consistently provided competitive performance. Commonly used features included age, body mass index, blood pressure measurements, and cholesterol measurements. Only one study carried out external validation. The area under the receiver operating characteristic curve (AUROC) for derivation cohorts varied from 0.69 to 0.77. AI models achieved better performance than conventional models in some specific scenarios. CONCLUSIONS: AI technologies seem to show promising performance (AUROC in external validation: 0.75, compared to 0.69 from conventional risk scores) for cardiovascular disease prediction in people with type 2 diabetes. However, only one of the reviewed models was externally validated.
Quality of reporting was generally low, and all models lacked reproducibility and reusability
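The review reports model performance as the area under the receiver operating characteristic curve (AUROC). For reference, AUROC equals the probability that a random positive case receives a higher risk score than a random negative one, which a short pure-Python sketch can compute directly; the labels and scores below are made up:

```python
# AUROC via the Mann-Whitney statistic: the fraction of
# (positive, negative) pairs where the positive scores higher
# (ties count as half a win).

def auroc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]  # hypothetical risk scores
print(auroc(labels, scores))
```

An AUROC of 0.5 corresponds to random ranking, so the reported range of 0.69 to 0.77 means the models rank a true case above a non-case roughly seven times out of ten.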
The psycho-ENV corpus: Research articles annotated for knowledge discovery on correlating mental diseases and environmental factors
While the published scientific literature is used in biomedical contexts such as building gene networks for disease gene discovery, it seems to be an undervalued resource with respect to mental illnesses and has rarely been explored for the purpose of gaining psychopathology insights. This limits our capability to better understand the underlying mechanisms of mental disorders. In this paper we describe the psycho-ENV corpus, which aims to annotate published studies to facilitate knowledge discovery on the pathologies of mental diseases. Specifically, this corpus focuses on the correlations between mental diseases and environmental factors. We report the first preliminary work of psycho-ENV: annotating 20 articles about two mental illnesses (bipolar disorder and depression) and two particular environmental factors, light and sunlight. The corpus is available at https://github.com/KHP-Informatics/psycho-env
Ontology-driven and weakly supervised rare disease identification from clinical notes
BACKGROUND: Computational text phenotyping is the practice of identifying patients with certain disorders and traits from clinical notes. Rare diseases are challenging to identify due to the few cases available for machine learning and the need for data annotation from domain experts. METHODS: We propose a method using ontologies and weak supervision, with recent pre-trained contextual representations from bidirectional transformers (e.g. BERT). The ontology-driven framework includes two steps: (i) Text-to-UMLS, extracting phenotypes by contextually linking mentions to concepts in the Unified Medical Language System (UMLS), with a Named Entity Recognition and Linking (NER+L) tool, SemEHR, and weak supervision based on customised rules and contextual mention representations; (ii) UMLS-to-ORDO, matching UMLS concepts to rare diseases in the Orphanet Rare Disease Ontology (ORDO). The weakly supervised approach is proposed to learn a phenotype confirmation model that improves Text-to-UMLS linking without annotated data from domain experts. We evaluated the approach on three annotated clinical datasets, MIMIC-III discharge summaries, MIMIC-III radiology reports, and NHS Tayside brain imaging reports, from two institutions in the US and the UK. RESULTS: The improvements in precision were pronounced (over 30 to 50 absolute percentage points for Text-to-UMLS linking), with almost no loss of recall compared to the existing NER+L tool, SemEHR. Results on radiology reports from MIMIC-III and NHS Tayside were consistent with those on the discharge summaries. The overall pipeline processing clinical notes can extract rare disease cases that are mostly uncaptured in structured data (manually assigned ICD codes). CONCLUSION: The study provides empirical evidence for the task by applying a weakly supervised NLP pipeline to clinical notes.
The proposed weakly supervised deep learning approach requires no human annotation except for validation and testing, leveraging ontologies, NER+L tools, and contextual representations. The study also demonstrates that Natural Language Processing (NLP) can complement traditional ICD-based approaches to better estimate rare diseases in clinical notes. We discuss the usefulness and limitations of the weak supervision approach and propose directions for future studies
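The weak-supervision idea, in which simple rules rather than expert annotators assign noisy labels to candidate links that then train a confirmation model, can be sketched as rule voting. The two rules and the candidates below are illustrative assumptions, not the customised rules used in the study:

```python
# Toy labelling rules vote on candidate (mention, context) links;
# their combined vote becomes a noisy training label, removing
# the need for expert annotation of the training set.

def rule_mention_length(candidate):
    """Longer mentions are more likely genuine links: +1 or -1."""
    return 1 if len(candidate["mention"]) >= 8 else -1

def rule_context_negation(candidate):
    """A negated context suggests a spurious link: -1, else abstain."""
    return -1 if "no evidence of" in candidate["context"] else 0

def weak_label(candidate):
    """Sum the rule votes; positive total means 'confirmed link'."""
    total = (rule_mention_length(candidate)
             + rule_context_negation(candidate))
    return 1 if total > 0 else 0

candidates = [
    {"mention": "cystic fibrosis",
     "context": "history of cystic fibrosis"},
    {"mention": "cf",
     "context": "no evidence of cf"},
]
print([weak_label(c) for c in candidates])
```

In the paper's pipeline the noisy labels train a contextual-representation classifier, which generalises beyond what the rules themselves cover; the sketch stops at label generation.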
Knowledge Driven Phenotyping
Extracting patient phenotypes from routinely collected health data (such as Electronic Health Records) requires clinical researchers to translate clinically sound phenotype definitions into queries/computations executable on the underlying data sources. This requires significant knowledge and skills to deal with heterogeneous and often imperfect data. Translations are time-consuming, error-prone and, most importantly, hard to share and reproduce across different settings. This paper proposes a knowledge-driven framework that (1) decouples the specification of phenotype semantics from the underlying data sources; (2) can automatically populate and conduct phenotype computations on heterogeneous data spaces. We report preliminary results of deploying this framework on five Scottish health datasets
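The decoupling the framework proposes, a data-source-independent phenotype specification plus per-dataset translation, might be sketched as follows; the code systems, mappings, and dataset adapters are hypothetical:

```python
# A phenotype is declared once against a shared vocabulary;
# per-source adapters translate local records into that
# vocabulary, so the definition never changes per dataset.

# Declarative, source-independent phenotype definition
PHENOTYPE_T2D = {"required_codes": {"E11"}}  # ICD-10 type 2 diabetes

def adapter_dataset_a(record):
    """Source A stores full ICD-10 codes; truncate to the category."""
    return {record["icd10"][:3]}

def adapter_dataset_b(record):
    """Source B stores Read codes; map them to ICD-10 categories."""
    read_to_icd10 = {"C10F.": "E11"}  # hypothetical code map
    return {read_to_icd10.get(record["read_code"], "")}

def has_phenotype(record, adapter, phenotype):
    """The shared computation: does the translated record satisfy
    the phenotype's required codes?"""
    return bool(adapter(record) & phenotype["required_codes"])

print(has_phenotype({"icd10": "E11.9"}, adapter_dataset_a, PHENOTYPE_T2D))
print(has_phenotype({"read_code": "C10F."}, adapter_dataset_b, PHENOTYPE_T2D))
```

Sharing the phenotype then means sharing only `PHENOTYPE_T2D`; each site supplies its own adapter, which is the reproducibility gain the paper argues for.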
Towards automated dermatology triage: deep learning and knowledge-driven approaches
Background
The current triage process in the National Health Service (NHS) requires secondary care clinicians to manually read every General Practitioner's (GP) referral letter, which makes the process time-consuming with associated high costs. Artificial Intelligence (AI) algorithms can be adopted to accelerate this process and reduce the required resources.
Objectives
To design AI models that can automatically stratify GP referrals to routine and non-routine categories, and to evaluate different AI algorithms against the current manual triage process.
Methods
We developed and evaluated multiple AI models to triage dermatology referrals into binary outcomes, i.e., routine or non-routine. The models ranged from a totally data-driven (deep learning) approach to different levels of knowledge-enriched approaches: 1) a transfer learning approach using a pre-trained large language model; 2) a deep learning model using Long Short-Term Memory architecture, enriched with key concepts from referral guidelines; and 3) a knowledge-driven model utilising the semantics of key concepts from clinical guidelines and customised clinicians' dictionaries. Random oversampling and data augmentation were used for dealing with highly imbalanced triage classes. All referrals were individually triaged by two dermatologists and then compared against the results generated from AI-assisted triage models. Performances were evaluated using Precision-Recall Area Under Curve (PR-AUC) and Receiver Operating Characteristic Area Under Curve (ROC-AUC).
Results
268 GP referrals to adult dermatology services were included. The knowledge-driven approach achieved the best performance (micro average PR-AUC of 0.907 ± 0.006, ROC-AUC of 0.720 ± 0.010) compared to the baseline end-to-end deep learning model (micro average PR-AUC of 0.823 ± 0.038, ROC-AUC of 0.616 ± 0.096) and the Long Short-Term Memory model (PR-AUC of 0.867 ± 0.013, ROC-AUC of 0.600 ± 0.071). Imbalance preprocessing methods improved the model performance in some cases but not to a significant level. Combining all types of domain knowledge in AI models outperformed any subsets of these knowledge inputs.
Conclusions
The knowledge-enhanced AI approach showed promising results in achieving triage outcomes comparable to manual outcomes despite the limited data input from the referrals. AI-assisted triage has the potential to make the triaging process less time-consuming and more cost-effective, whilst retaining accuracy
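Random oversampling, one of the imbalance-handling methods the study used, simply resamples the minority class with replacement until the classes are balanced. A minimal sketch on made-up referral labels:

```python
import random

def random_oversample(examples, labels, seed=0):
    """Resample each minority class (with replacement) up to the
    majority class size, returning a balanced dataset."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Keep all originals, then draw extras at random.
        out_x += xs + [rng.choice(xs) for _ in range(target - len(xs))]
        out_y += [y] * target
    return out_x, out_y

# Hypothetical referral texts and triage labels (4:1 imbalance)
X = ["ref1", "ref2", "ref3", "ref4", "ref5"]
y = ["routine", "routine", "routine", "routine", "non-routine"]
Xb, yb = random_oversample(X, y)
print(yb.count("routine"), yb.count("non-routine"))
```

Oversampling is applied to training data only; evaluating on a resampled set would inflate the reported PR-AUC and ROC-AUC.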
Benchmarking and analyzing in-context learning, fine-tuning and supervised learning for biomedical knowledge curation: a focused study on chemical entities of biological interest
Automated knowledge curation for biomedical ontologies is key to ensuring that they remain comprehensive, high-quality and up-to-date. In the era of foundational language models, this study compares and analyzes three NLP paradigms for curation tasks: in-context learning (ICL), fine-tuning (FT), and supervised learning (ML). Using the Chemical Entities of Biological Interest (ChEBI) database as a model ontology, three curation tasks were devised. For ICL, three prompting strategies were employed with GPT-4, GPT-3.5 and BioGPT. PubmedBERT was chosen for the FT paradigm. For ML, six embedding models were utilized for training Random Forest and Long Short-Term Memory models. Five setups were designed to assess ML and FT model performance across different data availability scenarios. Datasets for the curation tasks comprised 620,386 triples for task 1, 611,430 for task 2, and 617,381 for task 3, each maintaining a 50:50 positive-to-negative ratio. Among the ICL models, GPT-4 achieved the best accuracy scores of 0.916, 0.766 and 0.874 for tasks 1-3 respectively. In a direct comparison, ML (trained on ~260,000 triples) outperformed ICL in accuracy across all tasks (accuracy differences: +0.11, +0.22 and +0.17). Fine-tuned PubmedBERT performed similarly to the leading ML models in tasks 1 and 2 (F1 differences: -0.014 and +0.002), but worse in task 3 (-0.048). Simulations revealed performance declines in both ML and FT models with smaller and more highly imbalanced training data, where ICL (particularly GPT-4) excelled: with fewer than 6,000 triples, GPT-4 surpassed ML/FT in tasks 1 and 3, while ICL underperformed ML/FT in task 2. ICL-augmented foundation models can be good assistants for knowledge curation with correct prompting; however, they do not make the ML and FT paradigms obsolete. The latter two require task-specific data to beat ICL. In such cases, ML relies on small pretrained embeddings, minimizing computational demands
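The ICL setup, placing a few labelled examples in the prompt and asking the model to judge a new case, can be sketched as prompt construction alone (no model call). The triples and wording below are illustrative, not the study's actual prompting strategies:

```python
# Few-shot prompt construction for a triple-verification task:
# labelled exemplars are serialised into the prompt, followed by
# the unlabelled triple for the model to complete.

FEW_SHOT = [  # hypothetical labelled ChEBI-style triples
    ("caffeine", "has_role", "stimulant", "yes"),
    ("water", "has_role", "antibiotic", "no"),
]

def build_prompt(triple):
    lines = ["Decide whether each ontology triple is correct."]
    for s, p, o, label in FEW_SHOT:
        lines.append(f"Triple: ({s}, {p}, {o}) -> {label}")
    s, p, o = triple
    lines.append(f"Triple: ({s}, {p}, {o}) ->")
    return "\n".join(lines)

prompt = build_prompt(("ethanol", "has_role", "solvent"))
print(prompt)
```

The resulting string would be sent to the chosen model (GPT-4, GPT-3.5 or BioGPT in the study); no task-specific training data or embeddings are needed beyond the in-prompt exemplars, which is the trade-off the study benchmarks against ML and FT.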