Medical Concept and Patient Representation Learning Using Deep Neural Networks and Its Application to Healthcare Problems
Thesis (Ph.D.) -- Seoul National University Graduate School : College of Engineering, Department of Electrical and Computer Engineering, 2022. 8. Kyomin Jung.
This dissertation proposes deep neural network-based methods for learning medical concept and patient representations from medical claims data, and applies them to two healthcare tasks: clinical outcome prediction and post-marketing adverse drug reaction (ADR) signal detection. First, we propose SAF-RNN, a Recurrent Neural Network (RNN)-based model that learns a deep patient representation from clinical sequences and patient characteristics. The model fuses different types of patient records using feature-based gating and self-attention. We demonstrate that it effectively extracts high-level associations between the two heterogeneous records, achieving state-of-the-art performance in predicting the risk of cardiovascular disease (CVD). Second, based on the observation that distributed medical code embeddings capture temporal proximity between medical codes, we introduce a graph structure to enhance the code embeddings with such temporal information. We construct a graph using the distributed code embeddings and statistical information from the claims data, and then propose Graph Neural Network (GNN)-based representation learning for post-marketing ADR detection. Our model shows competitive performance and provides valid ADR candidates. Finally, rather than using patient records alone, we use a knowledge graph to augment the patient representation with prior medical knowledge. Using SAF-RNN and a GNN, the deep patient representation is learned from the clinical sequences and the personalized medical knowledge. It is then used for clinical outcome prediction, i.e., next diagnosis prediction and CVD risk prediction, resulting in state-of-the-art performance.
1 Introduction 1
2 Background 8
2.1 Medical Concept Embedding 8
2.2 Encoding Sequential Information in Clinical Records 11
3 Deep Patient Representation with Heterogeneous Information 14
3.1 Related Work 16
3.2 Problem Statement 19
3.3 Method 20
3.3.1 RNN-based Disease Prediction Model 20
3.3.2 Self-Attentive Fusion (SAF) Encoder 23
3.4 Dataset and Experimental Setup 24
3.4.1 Dataset 24
3.4.2 Experimental Design 26
3.4.3 Implementation Details 27
3.5 Experimental Results 28
3.5.1 Evaluation of CVD Prediction 28
3.5.2 Sensitivity Analysis 28
3.5.3 Ablation Studies 31
3.6 Further Investigation 32
3.6.1 Case Study: Patient-Centered Analysis 32
3.6.2 Data-Driven CVD Risk Factors 32
3.7 Conclusion 33
4 Graph-Enhanced Medical Concept Embedding 40
4.1 Related Work 42
4.2 Problem Statement 43
4.3 Method 44
4.3.1 Code Embedding Learning with Skip-gram Model 44
4.3.2 Drug-disease Graph Construction 45
4.3.3 A GNN-based Method for Learning Graph Structure 47
4.4 Dataset and Experimental Setup 49
4.4.1 Dataset 49
4.4.2 Experimental Design 50
4.4.3 Implementation Details 52
4.5 Experimental Results 53
4.5.1 Evaluation of ADR Detection 53
4.5.2 Newly-Described ADR Candidates 54
4.6 Conclusion 55
5 Knowledge-Augmented Deep Patient Representation 57
5.1 Related Work 60
5.1.1 Incorporating Prior Medical Knowledge for Clinical Outcome Prediction 60
5.1.2 Inductive KGC based on Subgraph Learning 61
5.2 Method 61
5.2.1 Extracting Personalized KG 61
5.2.2 KA-SAF: Knowledge-Augmented Self-Attentive Fusion Encoder 64
5.2.3 KGC as a Pre-training Task 68
5.2.4 Subgraph Infomax: SGI 69
5.3 Dataset and Experimental Setup 72
5.3.1 Clinical Outcome Prediction 72
5.3.2 Next Diagnosis Prediction 72
5.4 Experimental Results 73
5.4.1 Cardiovascular Disease Prediction 73
5.4.2 Next Diagnosis Prediction 73
5.4.3 KGC on SemMed KG 73
5.5 Conclusion 74
6 Conclusion 77
Abstract (In Korean) 90
Acknowledgement 92