7 research outputs found

    A Study on Drug Repositioning for Rare Diseases based on Biological Pathways

    Master's thesis, Seoul National University Graduate School, School of Dentistry, Department of Dental Science, August 2020. 김홍기.
    Introduction: This study aims to use biological pathway data for rare disease drug repositioning. More than 7,000 rare diseases exist worldwide, but treatments exist for only about 5% of them. The need for treatments is great, yet traditional drug development is very time-consuming and costly. For rare diseases, drug repositioning, which looks for new medical uses of already approved drugs, can be a quicker and cheaper alternative. Biological pathway data describe the interactions between biological elements in detail and allow gene data to be analyzed from a wider perspective, so they are hypothesized to be suitable for drug repositioning. In this study, a list of common biological pathways is generated from drug-related and rare disease-related gene data in order to find new drug candidates for rare diseases. A rare disease-drug candidate list is then generated from the similarity between rare diseases and drugs computed over the common pathways.
    Methods: 309 rare diseases from the Genomics England PanelApp were used together with their significantly associated genes. 1,888 approved drugs and their related gene information were taken from DrugBank. Using the analysis tools provided by Reactome, the biological pathways relevant to each rare disease gene list and each drug gene list were collected based on FDR values. Among the collected pathways, 1,883 are associated with both the rare diseases and the drugs, and these are used to calculate the similarity between rare diseases and drugs. The similarity is calculated as the Euclidean distance between vectors of FDR values.
    Results: This method produced a rare disease-drug candidate list. In the list, a smaller value between a rare disease and a drug means the two lie closer together and are therefore more similar. The more similar a rare disease and a drug are, the more likely the drug is to be a repositioning candidate for that disease. For evaluation, the results were compared with approved drugs already used to treat rare diseases. Lomitapide is a drug used to treat homozygous familial hypercholesterolemia. In the drug-rare disease list it has a similarity value of 2.8 with its PanelApp equivalent, "Familial hypercholesterolaemia targeted panel", which ranks 34th out of 309 rare diseases. The rare disease-pathway-drug results were also compared with the rare disease-gene-drug results for the drug Thalidomide. In the rare disease-pathway-drug results, rare diseases could be ordered by their relevance to Thalidomide, and "Bladder cancer pertinent cancer susceptibility" was the closest, which is consistent with existing studies.
    Discussion: The results confirm that the rare disease-drug list is consistent with existing rare disease treatments and that the priority list shows how strong each association is. In addition, rare disease-pathway-drug associations were found to offer more drug repositioning potential than rare disease-gene-drug associations. Finally, biological pathways extracted from disease-related gene information are expected to be usable for ranking drug repositioning candidates not only for rare diseases but also for other diseases.
    Contents: I. Introduction (1. Need for the study; 2. Drug repositioning; 3. Research objective). II. Materials and methods (1. Materials: 1) rare disease database, 2) drug database, 3) biological pathway database; 2. Methods: 1) preprocessing, 2) common biological pathways of rare diseases and drugs, 3) distance calculation). III. Conclusion (1. Rare disease-drug similarity; 2. Drug repositioning candidates). IV. Discussion. References. Abstract.
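The similarity step described in the abstract (FDR values vectorized over the common pathways, then compared by Euclidean distance) can be illustrated with a minimal sketch. This is not the thesis code: the pathway names, the FDR values, the helper names `fdr_vector` and `rank_diseases_for_drug`, and the choice to fill a missing pathway with an FDR of 1.0 are all assumptions made for the example.

```python
import numpy as np

def fdr_vector(fdr_by_pathway, pathways, missing=1.0):
    """Turn a {pathway: FDR} mapping into a vector over a fixed pathway order.

    Assumption: a pathway that was not reported for an entity gets FDR 1.0
    (i.e. "not enriched"). The abstract does not say how gaps are handled.
    """
    return np.array([fdr_by_pathway.get(p, missing) for p in pathways], dtype=float)

def rank_diseases_for_drug(drug_fdr, disease_fdrs, pathways):
    """Return (disease, distance) pairs sorted from closest to farthest."""
    drug_vec = fdr_vector(drug_fdr, pathways)
    distances = {
        name: float(np.linalg.norm(fdr_vector(fdr, pathways) - drug_vec))
        for name, fdr in disease_fdrs.items()
    }
    return sorted(distances.items(), key=lambda item: item[1])

# Hypothetical data: three shared pathways, one drug, two diseases.
pathways = ["pathway_1", "pathway_2", "pathway_3"]
drug = {"pathway_1": 0.01, "pathway_2": 0.50}
diseases = {
    "disease_a": {"pathway_1": 0.02, "pathway_2": 0.40},
    "disease_b": {"pathway_3": 0.05},
}

ranking = rank_diseases_for_drug(drug, diseases, pathways)
for name, distance in ranking:
    print(f"{name}: {distance:.4f}")
```

A smaller distance places a disease higher in the list, which matches the reading given in the Results: the closest entries are the strongest repositioning candidates.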

    Strategies for Managing Linked Enterprise Data

    Data, information, and knowledge have become key assets of the 21st-century economy. As a result, data and knowledge management have become key tasks for sustainable development and business success. Often, knowledge is not explicitly represented: it resides in the minds of people or is scattered among a variety of data sources. Knowledge is inherently associated with semantics that convey its meaning to a human or machine agent. The Linked Data concept facilitates the semantic integration of heterogeneous data sources. However, we still lack an effective knowledge integration strategy applicable to enterprise scenarios, one that balances large amounts of data stored in legacy information systems and data lakes against tailored domain-specific ontologies that formally describe real-world concepts. In this thesis we investigate strategies for managing linked enterprise data, analyzing how actionable knowledge can be derived from enterprise data by leveraging knowledge graphs. Actionable knowledge provides valuable insights, supports decision makers with clear, interpretable arguments, and keeps its inference processes explainable. The benefits of employing actionable knowledge and a coherent management strategy range from a holistic semantic representation layer of enterprise data, i.e., representing numerous data sources as one consistent and integrated knowledge source, to unified interaction mechanisms with other systems that can leverage such actionable knowledge effectively and efficiently. Pursuing this goal requires addressing several challenges on different conceptual levels: means for representing knowledge, semantic data integration of raw data sources and subsequent knowledge extraction, communication interfaces, and implementation. To tackle those challenges we present the concept of Enterprise Knowledge Graphs (EKGs) and describe their characteristics and their advantages over existing approaches. 
We study each challenge with regard to using EKGs and demonstrate their efficiency. In particular, EKGs are able to reduce the semantic data integration effort when processing large-scale heterogeneous datasets. Then, having built a consistent logical integration layer that hides the heterogeneity behind the scenes, EKGs unify query processing and enable effective communication interfaces for other enterprise systems. The achieved results allow us to conclude that strategies for managing linked enterprise data based on EKGs exhibit reasonable performance, comply with enterprise requirements, and ensure integrated data and knowledge management throughout the data life cycle.

    A Framework for Semantic Similarity Measures to enhance Knowledge Graph Quality

    Precisely determining similarity values among real-world entities is a building block for data-driven tasks, e.g., ranking, relation discovery, or integration. Semantic Web and Linked Data initiatives have promoted the publication of large semi-structured datasets in the form of knowledge graphs. Knowledge graphs encode semantics that describe resources in terms of several aspects or resource characteristics, e.g., neighbors, class hierarchies, or attributes. Existing similarity measures take these aspects into account in isolation, which may prevent them from delivering accurate similarity values. In this thesis, the resource characteristics relevant to accurately determining similarity values are identified and considered cumulatively in a framework of four similarity measures. Additionally, the impact of considering these resource characteristics during the computation of similarity values is analyzed in three data-driven tasks for the enhancement of knowledge graph quality. First, based on the identified resource characteristics, new similarity measures able to combine two or more of them are described. In total, four similarity measures are presented in evolutionary order. While the first three similarity measures, OnSim, IC-OnSim, and GADES, combine the resource characteristics according to a human-defined aggregation function, the last one, GARUM, uses a machine learning regression approach to determine the relevance of each resource characteristic during the computation of the similarity. Second, the suitability of each measure for real-time applications is studied by means of a theoretical and an empirical comparison. The theoretical comparison consists of a study of the worst-case computational complexity of each similarity measure. The empirical comparison is based on the execution times of the different similarity measures in two third-party benchmarks involving the comparison of semantically annotated entities. 
Finally, the impact of the described similarity measures is shown in three data-driven tasks for the enhancement of knowledge graph quality: relation discovery, dataset integration, and evolution analysis of annotation datasets. Empirical results show that the relation discovery and dataset integration tasks obtain better results when they consider the semantics encoded in semantic similarity measures. Further, using semantic similarity measures in the evolution analysis task allows for defining new informative metrics that give an overview of the evolution of the whole annotation set, rather than of individual annotations as state-of-the-art evolution analysis frameworks do.
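The general idea of combining several resource characteristics through a hand-defined aggregation function can be sketched as follows. This is only an illustration of the principle, not the definition of OnSim, IC-OnSim, GADES, or GARUM: the per-aspect measure (Jaccard overlap), the weighted mean, the aspect names, and the example entities are all assumptions.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections; two empty collections count as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def aggregated_similarity(x, y, weights):
    """Weighted mean of per-aspect similarities.

    `x` and `y` map an aspect name (e.g. "neighbors") to a set of values.
    `weights` is the hand-chosen relevance of each aspect; a learned model
    could supply these weights instead.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(
        weight * jaccard(x.get(aspect, ()), y.get(aspect, ()))
        for aspect, weight in weights.items()
    )
    return score / total

# Hypothetical entities described by three aspects.
entity_x = {"neighbors": {"n1", "n2", "n3"}, "classes": {"Drug"}, "attributes": {"a1", "a2"}}
entity_y = {"neighbors": {"n2", "n3", "n4"}, "classes": {"Drug"}, "attributes": {"a2"}}

equal_weights = {"neighbors": 1.0, "classes": 1.0, "attributes": 1.0}
similarity = aggregated_similarity(entity_x, entity_y, equal_weights)
print(round(similarity, 4))
```

Considering a single aspect in isolation corresponds to putting all the weight on that aspect, which is the situation the abstract argues can yield less accurate similarity values.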