People on Drugs: Credibility of User Statements in Health Communities
Online health communities are a valuable source of information for patients
and physicians. However, such user-generated resources are often plagued by
inaccuracies and misinformation. In this work we propose a method for
automatically establishing the credibility of user-generated medical statements
and the trustworthiness of their authors by exploiting linguistic cues and
distant supervision from expert sources. To this end we introduce a
probabilistic graphical model that jointly learns user trustworthiness,
statement credibility, and language objectivity. We apply this methodology to
the task of extracting rare or unknown side-effects of medical drugs --- this
being one of the problems where large scale non-expert data has the potential
to complement expert medical knowledge. We show that our method can reliably
extract side-effects and filter out false statements, while identifying
trustworthy users that are likely to contribute valuable medical information.
MR-GNN: Multi-Resolution and Dual Graph Neural Network for Predicting Structured Entity Interactions
Predicting interactions between structured entities lies at the core of
numerous tasks such as drug regimen and new material design. In recent years,
graph neural networks have become attractive. They represent structured
entities as graphs and then extract features from each individual graph using
graph convolution operations. However, these methods have some limitations: i)
their networks only extract features from a fixed-size subgraph structure (i.e.,
a fixed-size receptive field) of each node, and ignore features in substructures
of different sizes, and ii) features are extracted by considering each entity
independently, which may not effectively reflect the interaction between two
entities. To resolve these problems, we present MR-GNN, an end-to-end graph
neural network with the following features: i) it uses a multi-resolution based
architecture to extract node features from different neighborhoods of each
node, and ii) it uses dual graph-state long short-term memory networks
(LSTMs) to summarize local features of each graph and extract the interaction
features between pairwise graphs. Experiments conducted on real-world datasets
show that MR-GNN improves on the predictions of state-of-the-art methods. Comment: Accepted by IJCAI 201
DHLP 1&2: Giraph based distributed label propagation algorithms on heterogeneous drug-related networks
Background and Objective: Heterogeneous complex networks are large graphs
consisting of different types of nodes and edges. Extracting knowledge from
these networks is complicated. Moreover, the scale of these networks is
steadily increasing. Thus, scalable methods are required. Methods: In this
paper, two distributed label propagation algorithms for heterogeneous networks,
namely DHLP-1 and DHLP-2, are introduced. Biological networks are one type
of heterogeneous complex network. As a case study, we measured the
efficiency of the proposed DHLP-1 and DHLP-2 algorithms on a biological network
consisting of drugs, diseases, and targets. The task studied on this
network is drug repositioning, but our algorithms can be used as general methods
for heterogeneous networks other than biological ones. Results: We
compared the proposed algorithms with their non-distributed counterparts,
MINProp and Heter-LP. The experiments showed that the algorithms perform well
in terms of running time and accuracy. Comment: Source code available for Apache Giraph on Hadoo
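As a rough illustration of the label propagation idea this abstract refers to, the sketch below spreads a seed score from one disease node across a tiny drug/target/disease graph on a single machine. It is not DHLP-1, DHLP-2, MINProp, or Heter-LP, and it does not use Giraph; the function name `propagate_labels`, the `alpha` damping parameter, and the node names are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def propagate_labels(
    edges: List[Tuple[str, str]],
    seeds: Dict[str, float],
    alpha: float = 0.5,
    iterations: int = 50,
) -> Dict[str, float]:
    """Generic iterative label propagation on an undirected graph.

    Each round, a node's score becomes a mix of the average score of its
    neighbors (weight alpha) and its own seed label (weight 1 - alpha).
    Node types (drug, target, disease) are not treated differently here,
    which is a simplification compared with heterogeneous-network methods.
    """
    # Build an undirected adjacency list.
    neighbors: Dict[str, List[str]] = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    for node in seeds:
        neighbors.setdefault(node, [])

    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            spread = sum(scores[m] for m in nbrs) / len(nbrs) if nbrs else 0.0
            updated[node] = alpha * spread + (1 - alpha) * seeds.get(node, 0.0)
        scores = updated
    return scores


# Hypothetical toy network: drug A reaches disease D1 through target T1,
# while drug B sits in a separate component.
demo_edges = [
    ("drug:A", "target:T1"),
    ("target:T1", "disease:D1"),
    ("drug:B", "target:T2"),
]
demo_scores = propagate_labels(demo_edges, seeds={"disease:D1": 1.0})
```

In this toy setting, a drug that ends up with a nonzero score for the seeded disease would be a repositioning candidate; a drug with no path to the disease keeps a score of zero.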
A Study on Extracting Clinical Information from Unstructured Text for Pharmacovigilance
Doctoral dissertation -- Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Department of Applied Bioengineering, 2023. 2. 이형기.
Pharmacovigilance is a scientific activity to detect, evaluate, and understand
the occurrence of adverse drug events or other problems related to drug safety.
However, concerns have been raised over the quality of drug safety information
for pharmacovigilance, and there is also a need to secure new data sources for
acquiring drug safety information. Meanwhile, the rise of pre-trained language
models based on the transformer architecture has accelerated the application of
natural language processing (NLP) techniques in diverse domains. In this
context, I define two problems in pharmacovigilance as NLP tasks and provide
baseline models for them: 1) extracting comprehensive drug safety information
from adverse drug event narratives reported through a spontaneous reporting
system (SRS) and 2) extracting drug-food interaction information from abstracts
of biomedical articles. I developed annotation guidelines and performed manual
annotation, demonstrating that strong NLP models can be trained to extract
clinical information from unstructured free text by fine-tuning
transformer-based language models on a high-quality annotated corpus. Finally,
I discuss issues to consider when developing annotation guidelines for
extracting clinical information related to pharmacovigilance. The annotated
corpora and the NLP models in this dissertation can streamline
pharmacovigilance activities by enhancing the data quality of reported drug
safety information and expanding the data sources.
Chapter 1
1.1 Contributions of this dissertation
1.2 Overview of this dissertation
1.3 Other works
Chapter 2
2.1 Pharmacovigilance
2.2 Biomedical NLP for pharmacovigilance
2.2.1 Pre-trained language models
2.2.2 Corpora to extract clinical information for pharmacovigilance
Chapter 3
3.1 Motivation
3.2 Proposed Methods
3.2.1 Data source and text corpus
3.2.2 Annotation of ADE narratives
3.2.3 Quality control of annotation
3.2.4 Pretraining KAERS-BERT
3.2.6 Named entity recognition
3.2.7 Entity label classification and sentence extraction
3.2.8 Relation extraction
3.2.9 Model evaluation
3.2.10 Ablation experiment
3.3 Results
3.3.1 Annotated ICSRs
3.3.2 Corpus statistics
3.3.3 Performance of NLP models to extract drug safety information
3.3.4 Ablation experiment
3.4 Discussion
3.5 Conclusion
Chapter 4
4.1 Motivation
4.2 Proposed Methods
4.2.1 Data source
4.2.2 Annotation
4.2.3 Quality control of annotation
4.2.4 Baseline model development
4.3 Results
4.3.1 Corpus statistics
4.3.2 Annotation Quality
4.3.3 Performance of baseline models
4.3.4 Qualitative error analysis
4.4 Discussion
4.5 Conclusion
Chapter 5
5.1 Issues around defining a word entity
5.2 Issues around defining a relation between word entities
5.3 Issues around defining entity labels
5.4 Issues around selecting and preprocessing annotated documents
Chapter 6
6.1 Dissertation summary
6.2 Limitation and future works
6.2.1 Development of end-to-end information extraction models from free-texts to database based on existing structured information
6.2.2 Application of in-context learning framework in clinical information extraction
Chapter 7
7.1 Annotation Guideline for "Extraction of Comprehensive Drug Safety Information from Adverse Event Narratives Reported through Spontaneous Reporting System"
7.2 Annotation Guideline for "Extraction of Drug-Food Interactions from the Abstracts of Biomedical Articles"