Interactive exploration of population scale pharmacoepidemiology datasets
Population-scale drug prescription data linked with adverse drug reaction
(ADR) data supports the fitting of models large enough to detect drug use and
ADR patterns that are not detectable using traditional methods on smaller
datasets. However, detecting ADR patterns in large datasets requires tools for
scalable data processing, machine learning for data analysis, and interactive
visualization. To our knowledge no existing pharmacoepidemiology tool supports
all three requirements. We have therefore created a tool for interactive
exploration of patterns in prescription datasets with millions of samples. We
use Spark to preprocess the data for machine learning and for analyses using
SQL queries. We have implemented models in Keras and the scikit-learn
framework. The model results are visualized and interpreted using live Python
coding in Jupyter. We apply our tool to explore a dataset of 384 million
prescriptions from the Norwegian Prescription Database, combined with 62 million
prescriptions for elderly patients who were hospitalized. We preprocess the data
in two minutes, train models in seconds, and plot the results in milliseconds.
Our results show the value of combining computational power, short computation
times, and ease of use for analysis of population-scale pharmacoepidemiology
datasets. The code is open source and available at:
https://github.com/uit-hdl/norpd_prescription_analyse
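The preprocessing step described in this abstract turns raw prescription records into model-ready features. The following is a minimal sketch of that idea only, written in plain Python rather than Spark so that it runs anywhere. The record layout (patient ID plus ATC drug code), the function name `build_feature_matrix`, and all values are assumptions made for illustration and are not taken from the authors' repository.

```python
from collections import Counter, defaultdict

# Hypothetical records: (patient_id, ATC drug code). All values are invented.
prescriptions = [
    (1, "C10AA01"), (1, "C10AA01"), (1, "N02BE01"),
    (2, "N02BE01"),
    (3, "A10BA02"), (3, "C10AA01"),
]

def build_feature_matrix(records):
    """Return (patient_ids, drug_codes, matrix) of per-patient prescription counts."""
    counts = defaultdict(Counter)
    for patient_id, code in records:
        counts[patient_id][code] += 1
    patient_ids = sorted(counts)
    drug_codes = sorted({code for _, code in records})
    # One row per patient, one column per drug code; a missing pair counts as 0.
    matrix = [[counts[p][c] for c in drug_codes] for p in patient_ids]
    return patient_ids, drug_codes, matrix

patient_ids, drug_codes, matrix = build_feature_matrix(prescriptions)
```

A matrix of this shape could then be passed to a scikit-learn or Keras model. At the scale reported in the abstract, the same grouping would more plausibly be expressed as a Spark aggregation.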
Adverse Drug Reaction Classification With Deep Neural Networks
We study the problem of detecting sentences describing adverse drug reactions (ADRs) and frame the problem as binary classification. We investigate different neural network (NN) architectures for ADR classification. In particular, we propose two new neural network models: the Convolutional Recurrent Neural Network (CRNN), formed by concatenating convolutional neural networks with recurrent neural networks, and the Convolutional Neural Network with Attention (CNNA), formed by adding attention weights to convolutional neural networks. We evaluate various NN architectures on a Twitter dataset containing informal language and on an Adverse Drug Effects (ADE) dataset constructed by sampling from MEDLINE case reports. Experimental results show that, on both datasets, all the NN architectures considerably outperform traditional maximum entropy classifiers trained on n-grams with different weighting strategies. On the Twitter dataset, all the NN architectures perform similarly. On the ADE dataset, however, the plain CNN performs better than the more complex CNN variants. Nevertheless, CNNA allows the visualisation of the attention weights of words when making classification decisions and hence is more appropriate for extracting word subsequences that describe ADRs.
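To illustrate why attention weights lend themselves to visualisation, here is a minimal sketch of softmax attention pooling over per-word features. It is not the CNNA model itself: the abstract does not give the exact formulation, and the feature vectors, relevance scores, and function names below are invented for illustration. In a trained network both the features and the scores would be learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, scores):
    """Weight each word's feature vector by its attention weight and sum them."""
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[j] for w, f in zip(weights, features)) for j in range(dim)]
    return weights, pooled

words = ["this", "drug", "gave", "me", "headaches"]
# Invented 2-dimensional word features and unnormalised relevance scores.
features = [[0.1, 0.0], [0.4, 0.2], [0.3, 0.1], [0.0, 0.1], [0.9, 0.8]]
scores = [0.0, 1.0, 0.5, 0.0, 2.0]

weights, pooled = attention_pool(features, scores)
top_word = words[weights.index(max(weights))]
```

Because the weights are non-negative and sum to one, they can be plotted directly over the sentence to show which words contributed most to the decision.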
PRACTICAL IMPLICATIONS OF SPONTANEOUS ADVERSE DRUG REACTION REPORTING SYSTEM IN HOSPITALS-AN OVERVIEW
Adverse drug reactions (ADRs) are a global problem of major concern that leads to morbidity and mortality. They affect 30% of hospitalized patients and lead to 2-6% of all medical admissions. Spontaneous reporting of ADRs is the cornerstone of pharmacovigilance and is essential for maintaining patient safety. The necessity of a spontaneous ADR surveillance system is addressed by many authorities, such as the World Health Organization, the Food and Drug Administration, Joint Commission International, and the Uppsala Monitoring Centre. However, existing postmarketing surveillance systems rely heavily on spontaneous reports of ADRs, which suffer from serious underreporting, latency, and inconsistent reporting. Studies estimate that only 6-10% of all ADRs are reported in hospitals. This percentage is low enough to warrant a closer analysis of its causes and of how to resolve the underlying factors. Researchers have shown that knowledge, attitudes, and false perceptions about ADRs are the major challenges in spontaneous reporting; these include personal, professional, system-related, and organization-related conflicts. Most of them can be addressed through interventions targeted at the system and at individuals. Identifying, analyzing, and working on these issues can improve ADR surveillance in hospitals and thereby patient safety. Understanding pharmacovigilance; identifying and resolving the obstacles to spontaneous reporting through an efficient pharmacovigilance department; continuous educational interventions; patient-centered surveillance programs; health care teamwork directed at detecting ADRs; and computer-assisted or personnel-assisted ADR trigger tool programs can together produce a successful pharmacovigilance system in hospitals and thereby a good-quality health care system.
Key words: Spontaneous reporting system, adverse drug reaction, pharmacovigilance, patient safety
Medical Concept and Patient Representation Learning Using Deep Neural Networks and Its Application to Medical Problems
Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2022. Advisor: Kyomin Jung.
This dissertation proposes deep neural network-based methods for learning medical concept and patient representations from medical claims data, and applies them to two healthcare tasks: clinical outcome prediction and post-marketing adverse drug reaction (ADR) signal detection. First, we propose SAF-RNN, a Recurrent Neural Network (RNN)-based model that learns a deep patient representation from clinical sequences and patient characteristics. The model fuses different types of patient records using feature-based gating and self-attention. We demonstrate that it effectively extracts high-level associations between two heterogeneous records, achieving state-of-the-art performance in predicting the risk probability of cardiovascular disease (CVD). Second, based on the observation that distributed medical code embeddings represent temporal proximity between medical codes, we introduce a graph structure to enhance the code embeddings with such temporal information. We construct a graph using the distributed code embeddings and statistical information from the claims data. We then propose Graph Neural Network (GNN)-based representation learning for post-marketing ADR detection. Our model shows competitive performance and provides valid ADR candidates. Finally, rather than using patient records alone, we utilize a knowledge graph to augment the patient representation with prior medical knowledge. Using SAF-RNN and GNN, the deep patient representation is learned from the clinical sequences and the personalized medical knowledge. It is then used to predict clinical outcomes, namely next diagnosis prediction and CVD risk prediction, resulting in state-of-the-art performance.
1 Introduction
2 Background
2.1 Medical Concept Embedding
2.2 Encoding Sequential Information in Clinical Records
3 Deep Patient Representation with Heterogeneous Information
3.1 Related Work
3.2 Problem Statement
3.3 Method
3.3.1 RNN-based Disease Prediction Model
3.3.2 Self-Attentive Fusion (SAF) Encoder
3.4 Dataset and Experimental Setup
3.4.1 Dataset
3.4.2 Experimental Design
3.4.3 Implementation Details
3.5 Experimental Results
3.5.1 Evaluation of CVD Prediction
3.5.2 Sensitivity Analysis
3.5.3 Ablation Studies
3.6 Further Investigation
3.6.1 Case Study: Patient-Centered Analysis
3.6.2 Data-Driven CVD Risk Factors
3.7 Conclusion
4 Graph-Enhanced Medical Concept Embedding
4.1 Related Work
4.2 Problem Statement
4.3 Method
4.3.1 Code Embedding Learning with Skip-gram Model
4.3.2 Drug-disease Graph Construction
4.3.3 A GNN-based Method for Learning Graph Structure
4.4 Dataset and Experimental Setup
4.4.1 Dataset
4.4.2 Experimental Design
4.4.3 Implementation Details
4.5 Experimental Results
4.5.1 Evaluation of ADR Detection
4.5.2 Newly-Described ADR Candidates
4.6 Conclusion
5 Knowledge-Augmented Deep Patient Representation
5.1 Related Work
5.1.1 Incorporating Prior Medical Knowledge for Clinical Outcome Prediction
5.1.2 Inductive KGC based on Subgraph Learning
5.2 Method
5.2.1 Extracting Personalized KG
5.2.2 KA-SAF: Knowledge-Augmented Self-Attentive Fusion Encoder
5.2.3 KGC as a Pre-training Task
5.2.4 Subgraph Infomax: SGI
5.3 Dataset and Experimental Setup
5.3.1 Clinical Outcome Prediction
5.3.2 Next Diagnosis Prediction
5.4 Experimental Results
5.4.1 Cardiovascular Disease Prediction
5.4.2 Next Diagnosis Prediction
5.4.3 KGC on SemMed KG
5.5 Conclusion
6 Conclusion
Abstract (In Korean)
Acknowledgement
Pharmacovigilance in the European Union: Practical Implementation across Member States
Comparative Politics; Political Economy; European Union Politics; Drug Safety and Pharmacovigilance
Knowledge graph prediction of unknown adverse drug reactions and validation in electronic health records
Unknown adverse reactions to drugs available on the market present a significant health risk and limit accurate judgement of the cost/benefit trade-off for medications. Machine learning has the potential to predict unknown adverse reactions from current knowledge. We constructed a knowledge graph containing four types of node: drugs, protein targets, indications and adverse reactions. Using this graph, we developed a machine learning algorithm based on a simple enrichment test and first demonstrated that this method performs extremely well at classifying known causes of adverse reactions (AUC 0.92). A cross-validation scheme in which 10% of drug-adverse reaction edges were systematically deleted per fold showed that the method correctly predicts 68% of the deleted edges on average. Next, a subset of adverse reactions that could be reliably detected in anonymised electronic health records from South London and Maudsley NHS Foundation Trust was used to validate predictions from the model that are not currently known in public databases. High-confidence predictions were validated in electronic records significantly more frequently than random models, and outperformed standard methods (logistic regression, decision trees and support vector machines). This approach has the potential to improve patient safety by predicting adverse reactions that were not observed during randomised trials.
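The abstract does not specify which enrichment test the authors use. As a hedged illustration of the general idea, the sketch below applies a one-sided hypergeometric test to a toy graph: it asks whether drugs that share a given neighbouring node (for example a protein target) are over-represented among the drugs already known to cause an adverse reaction. The drug names, targets, and function names are all hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

# Toy, invented graph: each drug maps to its neighbouring nodes (here, targets).
drug_features = {
    "drug_a": {"target_1", "target_2"},
    "drug_b": {"target_1"},
    "drug_c": {"target_1", "target_3"},
    "drug_d": {"target_3"},
    "drug_e": {"target_2"},
    "drug_f": {"target_3"},
}
# Drugs with a known edge to the adverse reaction under study (invented).
known_causes = {"drug_a", "drug_b", "drug_c"}

def feature_enrichment(feature):
    """p-value that `feature` is over-represented among the known causes."""
    N = len(drug_features)
    K = len(known_causes)
    with_feature = {d for d, feats in drug_features.items() if feature in feats}
    n = len(with_feature)
    k = len(with_feature & known_causes)
    return hypergeom_upper_tail(k, K, n, N)

p_target_1 = feature_enrichment("target_1")
p_target_3 = feature_enrichment("target_3")
```

In this toy graph `target_1` is shared by all three known causes and by no other drug, so its p-value is small, whereas `target_3` shows no enrichment. A drug without a known edge to the reaction could then be ranked by the enrichment of the features it carries.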