Knowledge graph prediction of unknown adverse drug reactions and validation in electronic health records

Author: Bean Daniel M
Broadbent Matthew
Dobson Richard J B
Dzahini Olubanke
Ibrahim Zina M
Iqbal Ehtesham
Stewart Robert
Wu Honghan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2017

Abstract: Unknown adverse reactions to drugs available on the market present a significant health risk and limit accurate judgement of the cost/benefit trade-off for medications. Machine learning has the potential to predict unknown adverse reactions from current knowledge. We constructed a knowledge graph containing four types of node: drugs, protein targets, indications and adverse reactions. Using this graph, we developed a machine learning algorithm based on a simple enrichment test and first demonstrated that this method performs extremely well at classifying known causes of adverse reactions (AUC 0.92). A cross-validation scheme in which 10% of drug-adverse reaction edges were systematically deleted per fold showed that the method correctly predicts 68% of the deleted edges on average. Next, a subset of adverse reactions that could be reliably detected in anonymised electronic health records from South London and Maudsley NHS Foundation Trust was used to validate predictions from the model that are not currently known in public databases. High-confidence predictions were validated in electronic records significantly more frequently than random models, and outperformed standard methods (logistic regression, decision trees and support vector machines). This approach has the potential to improve patient safety by predicting adverse reactions that were not observed during randomised trials.
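
The edge-deletion cross-validation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `edge_holdout_folds` and `recall_of_deleted`, the random shuffling, and the toy edge list are all assumptions made for illustration. It only shows how drug-adverse reaction edges might be split so that about 10% are held out per fold, and how the fraction of deleted edges recovered by a model could be scored.

```python
import random


def edge_holdout_folds(edges, n_folds=10, seed=0):
    """Yield (kept_edges, held_out_edges) pairs, one per fold.

    Every edge is held out in exactly one fold, so about 1/n_folds
    of the edges are deleted from the graph in each fold.
    """
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    for i in range(n_folds):
        held_out = shuffled[i::n_folds]  # every n_folds-th edge, offset i
        held_out_set = set(held_out)
        kept = [e for e in shuffled if e not in held_out_set]
        yield kept, held_out


def recall_of_deleted(predicted_edges, held_out_edges):
    """Fraction of the held-out (deleted) edges that were predicted."""
    if not held_out_edges:
        return 0.0
    hits = set(predicted_edges) & set(held_out_edges)
    return len(hits) / len(held_out_edges)


# Hypothetical toy graph: 10 drugs x 10 adverse reactions = 100 edges.
toy_edges = [(f"drug{i}", f"adr{j}") for i in range(10) for j in range(10)]
for kept, held_out in edge_holdout_folds(toy_edges):
    # A real model would be trained on `kept` and asked to predict edges;
    # here we only confirm the split sizes.
    assert len(kept) == 90 and len(held_out) == 10
```

Averaging `recall_of_deleted` over the folds would give a figure comparable in kind to the 68% reported above, although the prediction model itself is not sketched here.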

Directory of Open Access Journals

ADEPt, a semantically-enriched pipeline for extracting adverse drug events from free-text electronic health records

Author: Broadbent Matthew
Chang Nynn
Dobson Richard J.B.
Dzahini Olubanke
Ibrahim Zina M.
Iqbal Ehtesham
Mallah Robbie
Pandey Chandra
Rhodes Daniel
Romero Alvin
Stewart Robert
Wu Honghan
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 09/11/2017

Adverse drug events (ADEs) are unintended responses to medical treatment. They can greatly affect a patient's quality of life and present a substantial burden on healthcare. Although electronic health records (EHRs) document a wealth of information relating to ADEs, this information is frequently stored in unstructured or semi-structured free-text narrative, requiring Natural Language Processing (NLP) techniques to mine it. Here we present a rule-based ADE detection and classification pipeline built and tested on a large psychiatric corpus comprising 264k patients, using the de-identified EHRs of four UK-based psychiatric hospitals. The pipeline uses characteristics specific to psychiatric EHRs to guide the annotation process, and distinguishes: a) the temporal value associated with the ADE mention (whether it is historical or present), b) the categorical value of the ADE (whether it is assertive, hypothetical, retrospective or a general discussion) and c) the implicit contextual value, where the status of the ADE is deduced from surrounding indicators rather than explicitly stated. We manually created the rulebase in collaboration with clinicians and pharmacists by studying ADE mentions in various types of clinical notes. We evaluated the open-source Adverse Drug Event annotation Pipeline (ADEPt) using 19 ADEs specific to antipsychotic and antidepressant medication. The ADEs chosen vary in severity, regularity and persistence. The average F-measure and accuracy achieved by our tool across all tested ADEs were 0.83 and 0.83 respectively. In addition to annotation power, the ADEPt pipeline presents an improvement to the state-of-the-art context-discerning algorithm, ConText.
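
For readers unfamiliar with the two evaluation metrics quoted in this abstract, the following minimal sketch shows how an F-measure (taken here to be the balanced F1 score, which is an assumption) and accuracy are computed from annotation counts. The function name and the counts in the usage line are hypothetical and are not taken from the ADEPt evaluation.

```python
def f_measure_and_accuracy(tp, fp, fn, tn):
    """Return (F1, accuracy) from true/false positive/negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return f1, accuracy


# Hypothetical counts for one ADE, invented purely for illustration.
f1, accuracy = f_measure_and_accuracy(tp=8, fp=2, fn=2, tn=8)
```

Averaging these per-ADE values across all tested ADEs would give summary figures of the kind reported above.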

Directory of Open Access Journals

arXiv.org e-Print Archive

The side effect profile of Clozapine in real world data of three large mental hospitals

Author: Broadbent Matthew
Dobson Richard J.B.
Dzahini Olubanke
Govind Risha
Ibrahim Zina M.
Iqbal Ehtesham
Kim Chi Hun
MacCabe James H.
Romero Alvin
Smith Tanya
Stewart Robert
Werbeloff Nomi
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2020

Objective: Mining the data contained within Electronic Health Records (EHRs) can potentially generate a greater understanding of medication effects in the real world, complementing what we know from randomised controlled trials (RCTs). We propose a text mining approach to detect adverse events and medication episodes from clinical text to enhance our understanding of adverse effects related to Clozapine, the most effective antipsychotic drug for the management of treatment-resistant schizophrenia, but underutilised due to concerns over its side effects. Material and Methods: We used data from de-identified EHRs of three mental health trusts in the UK (>50 million documents, over 500,000 patients, 2835 of whom were prescribed Clozapine). We explored the prevalence of 33 adverse effects by age, gender, ethnicity, smoking status and admission type three months before and after the patients started Clozapine treatment. We compared the prevalence of adverse effects with those reported in the Side Effects Resource (SIDER) where possible. Results: Sedation, fatigue, agitation, dizziness, hypersalivation, weight gain, tachycardia, headache, constipation and confusion were amongst the most frequently recorded Clozapine adverse effects in the three months following the start of treatment. Higher percentages of all adverse effects were found in the first month of Clozapine therapy. At a significance level of p < 0.05, our chi-square tests show a significant association between most of the ADRs and smoking status and hospital admission, and between some ADRs and gender and age group. Further, the data from the three trusts were combined, and chi-square tests were applied to estimate the average effect of ADRs in each monthly interval. Conclusion: A better understanding of how the drug works in the real world can complement clinical trials and precision medicine.
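
The chi-square tests of association mentioned in this abstract can be illustrated on a 2x2 table, for example an adverse effect (present/absent) against smoking status (smoker/non-smoker). The sketch below is an assumption-laden illustration rather than the study's analysis: it computes the Pearson statistic without continuity correction, the function name is hypothetical, and the counts are invented toy values that do not come from the paper.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    No continuity (Yates) correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of the chi-square distribution, 1 degree of freedom,
# significance level 0.05.
CRITICAL_VALUE_DF1_P05 = 3.841

# Toy counts invented for illustration only (not study data):
# rows = smokers / non-smokers, columns = effect recorded / not recorded.
statistic = chi_square_2x2(30, 70, 15, 85)
significant = statistic > CRITICAL_VALUE_DF1_P05
```

With these toy counts the statistic exceeds the critical value, so the association would be called significant at p < 0.05; real analyses would normally rely on a statistics library and report exact p-values.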

Directory of Open Access Journals

arXiv.org e-Print Archive

Efficient Reuse of Natural Language Processing Models for Phenotype-Mention Identification in Free-text Electronic Medical Records: A Phenotype Embedding Approach.

Author: Dobson Richard J.B.
Dyson Sue
Hodgson Karen
Ibrahim Zina M.
Iqbal Ehtesham
Morley Katherine I.
Stewart Robert
Sudlow Cathie
Wu Honghan
Publication venue: 'JMIR Publications Inc.'
Publication date: 01/01/2019

Background: Much effort has been put into the use of automated approaches, such as natural language processing (NLP), to mine or extract data from free-text medical records to construct comprehensive patient profiles for delivering better health care. Reusing NLP models in new settings, however, remains cumbersome, requiring validation and/or retraining on new data iteratively to achieve convergent results. Objective: The aim of this work is to minimize the effort involved in reusing NLP models on free-text medical records. Methods: We formally define and analyse the model adaptation problem in phenotype-mention identification tasks. We identify "duplicate waste" and "imbalance waste", which collectively impede efficient model reuse. We propose a phenotype embedding based approach to minimize these sources of waste without the need for labelled data from new settings. Results: We conduct experiments on data from a large mental health registry to reuse NLP models in four phenotype-mention identification tasks. The proposed approach can choose the best model for a new task, identifying up to 76% of phenotype mentions as needing no validation or model retraining (duplicate waste), with very good performance (93-97% accuracy). It can also provide guidance for validating and retraining the selected model for novel language patterns in new tasks, saving around 80% of the effort required in "blind" model-adaptation approaches (imbalance waste). Conclusions: Adapting pre-trained NLP models for new tasks can be more efficient and effective if the language pattern landscapes of old settings and new settings can be made explicit and comparable. Our experiments show that the phenotype-mention embedding approach is an effective way to model language patterns for phenotype-mention identification tasks and that its use can guide efficient NLP model reuse.

SemEHR: A general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research

Author: Abhyankar
Amos Folarin
Angus Roberts
Asha Agrawal
Auer
Bilal
Bodenreider
Botsis
Clive Stringer
Cresswell
Cresswell
Darren Gale
Genevieve Gorrell
Giulia Toti
Hebbring
Honghan Wu
Iqbal
Ismail Kartoglu
Jackson
Jackson
Jackson MSc
Johnson
Kadra
Katherine I Morley
Lindberg
Mathias
Matthew Broadbent
Moseley
Nair
Pawloski
Richard Jackson
Richard JB Dobson
Robert Stewart
Scheurwegs
Stewart
Vrandečić
Warner
Wu
Zina M Ibrahim
Publication venue: 'Oxford University Press (OUP)'
Publication date: 08/01/2018

OBJECTIVE: Unlocking the data contained within both structured and unstructured components of electronic health records (EHRs) has the potential to provide a step change in data available for secondary research use, generation of actionable medical insights, hospital management, and trial recruitment. To achieve this, we implemented SemEHR, an open source semantic search and analytics tool for EHRs. METHODS: SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying contextualized mentions of a wide range of biomedical concepts within EHRs. Natural language processing annotations are further assembled at the patient level and extended with EHR-specific knowledge to generate a timeline for each patient. The semantic data are serviced via ontology-based search and analytics interfaces. RESULTS: SemEHR has been deployed at a number of UK hospitals, including the Clinical Record Interactive Search, an anonymized replica of the EHR of the UK South London and Maudsley National Health Service Foundation Trust, one of Europe's largest providers of mental health services. In 2 Clinical Record Interactive Search-based studies, SemEHR achieved 93% (hepatitis C) and 99% (HIV) F-measure results in identifying true positive patients. At King's College Hospital in London, as part of the CogStack program (github.com/cogstack), SemEHR is being used to recruit patients into the UK Department of Health 100 000 Genomes Project (genomicsengland.co.uk). The validation study suggests that the tool can validate previously recruited cases and is very fast at searching phenotypes; time for recruitment criteria checking was reduced from days to minutes. Validated on open intensive care EHR data, Medical Information Mart for Intensive Care III, the vital signs extracted by SemEHR can achieve around 97% accuracy. 
CONCLUSION: Results from the multiple case studies demonstrate SemEHR's efficiency: weeks or months of work can be done within hours or minutes in some cases. SemEHR provides a more comprehensive view of patients, bringing in more and unexpected insight compared to study-oriented bespoke IE systems. SemEHR is open source, available at https://github.com/CogStack/SemEHR

University of Melbourne Institutional Repository

The relative vertex clustering value - a new criterion for the fast discovery of functional modules in protein interaction networks

Author: A Clauset
Alioune Ngom
C Pizzuti
CV Mering
D Huang
E Becker
E Hartuv
F Brucker
F Luo
F Radicchi
G Dennis
HN Chua
I Xenarios
JP Bagrow
LH Hartwell
M Girvan
M Li
MS Rahman
N Zaki
P Pei
R-S Wang
S Chen
S Fortunato
S Yook
TV Laarhoven
V Spirin
VD Blondel
XL Li
Zina M Ibrahim
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Detecting epistasis in the presence of linkage disequilibrium: A focused comparison

Author: Dobson Richard
Ibrahim Zina M
Newhouse Stephen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013