
3 research outputs found

Knowledge Graph Construction Using Named Entity Normalization Techniques (고유명사 정규화 기법을 이용한 지식 그래프 구축)

Author: 전성환
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/02/2023

Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, February 2023. 조성준.

Text mining aims to extract information from documents to derive valuable insights. A knowledge graph, one way of representing the information in documents, provides richer information drawn from various documents. Earlier work addressed this need by building technology trees or concept networks from the bibliographic information of documents, or by relying on text mining techniques to extract keywords and phrases. In this paper, we propose a framework for building a knowledge graph using named entities. The framework satisfies the following conditions: (1) extracting named entities in their complete form, which is easy for people to understand; (2) building datasets on which named entity normalization (NEN) models can be trained and evaluated in domains such as finance and technical documents (semiconductor-related patents), in addition to bioinformatics, where existing NEN research has been active; (3) creating a better-performing named entity normalization model; and (4) constructing the knowledge graph by grouping named entities that share the same meaning but appear in various forms.

Chapter 1 Introduction
Chapter 2 Literature review
  2.1 Named entity normalization dataset
  2.2 Named entity normalization
  2.3 Knowledge graph construction
Chapter 3 Dictionary construction for named entity normalization
  3.1 Background
  3.2 Dictionary construction methods
    3.2.1 Finance named entity normalization dataset
    3.2.2 Patent named entity normalization dataset
  3.3 Chapter summary
Chapter 4 Named entity normalization model using edge weight updating neural network
  4.1 Background
  4.2 Proposed model
    4.2.1 Ground truth entity graph construction
    4.2.2 Similarity-based entity graph construction
    4.2.3 Edge weight updating neural network training
    4.2.4 Edge weight updating neural network inferencing
  4.3 Experiment results
    4.3.1 Datasets
    4.3.2 Experiment settings: named entity normalization in bioinformatics
    4.3.3 Experiment settings: named entity normalization in finance
  4.4 Results
    4.4.1 Quantitative analysis: bioinformatics
    4.4.2 Quantitative analysis: finance
    4.4.3 Qualitative analysis
  4.5 Chapter summary
Chapter 5 Building knowledge graph using named entity recognition and normalization models
  5.1 Background
  5.2 Proposed model
    5.2.1 Named entity normalization
    5.2.2 Construction of the semiconductor-related patent knowledge graph
  5.3 Experiment results
    5.3.1 Comparison models
    5.3.2 Parameter settings
  5.4 Results
    5.4.1 Quantitative evaluations
    5.4.2 Qualitative evaluations
    5.4.3 Knowledge graph visualization and exemplary investigation
  5.5 Chapter summary
Chapter 6 Conclusion
  6.1 Contributions
  6.2 Future work
Bibliography
Abstract in Korean
Acknowledgements

SNU Open Repository and Archive

Using a Human Drug Network for generating novel hypotheses about drugs

Author: Bender Andreas
Blockeel Hendrik
Rahmani Hossein
Publication venue: IOS Press
Publication date: 01/01/2016

Analyzing different drugs for various purposes is an important problem in computational biology. We categorize previous computational studies into Individual and Network approaches. While the Individual approach focuses on one specific drug without considering its relationships with other drugs, the Network approach also considers the relationships among drugs. In this paper, we apply to drug data a Network approach previously used to discover relationships among diseases. We construct a Human Drug Network (HDN) for 200 different drugs based on functional and structural information available in the protein-protein interaction (PPI) network. To evaluate the proposed HDN, we first analyzed the literature to show that the HDN is biologically meaningful. Second, we used the HDN to augment the initial prior knowledge about different drugs. As an example of prior knowledge, we considered the initial seed proteins of each drug (a set of proteins previously known to be drug targets). We clustered the HDN nodes using the Markov Clustering algorithm (MCL) and then augmented the seed proteins of each drug based on the cluster it belongs to. We conclude that the proposed HDN enables us to generate novel hypotheses (in terms of potential drug target proteins) and to produce results complementary to those of existing methods.

Lirias

Maastricht University Research Portal

Leiden University Scholarly Publications

Using a Human Drug Network for generating novel hypotheses about drugs

Author: Andreas Bender
Hendrik Blockeel
Hossein Rahmani
Publication venue: IOS Press

Crossref