9 research outputs found

    High Performance Computing of Gene Regulatory Networks using a Message-Passing Model

    Gene regulatory network reconstruction is a fundamental problem in computational biology. We recently developed an algorithm, called PANDA (Passing Attributes Between Networks for Data Assimilation), that integrates multiple sources of 'omics data and estimates regulatory network models. This approach was initially implemented in the C++ programming language and has since been applied to a number of biological systems. In our current research we are beginning to expand the algorithm to incorporate larger and more diverse data sets, to reconstruct networks that contain increasing numbers of elements, and to build not only single network models, but sets of networks. To accomplish these "Big Data" applications, it has become critical that we increase the computational efficiency of the PANDA implementation. In this paper we show how to recast PANDA's similarity equations as matrix operations. This allows us to implement a highly readable version of the algorithm using the MATLAB/Octave programming language. We find that the resulting M-code is much shorter (103 compared to 1128 lines) and more easily modifiable for potential future applications. The new implementation also runs significantly faster, with increasing efficiency as the network models increase in size. Tests comparing the C-code and M-code versions of PANDA demonstrate that this speed-up is on the order of 20-80 times for networks of similar dimensions to those we find in current biological applications.
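The idea of recasting a pairwise similarity as matrix operations can be illustrated with a small sketch. The abstract does not state PANDA's equations, so the Tanimoto-style similarity used here, the function names `tanimoto_loop` and `tanimoto_matrix`, and the use of Python with NumPy (rather than the MATLAB/Octave of the paper) are all assumptions made for illustration only.

```python
import numpy as np

def tanimoto_loop(X, Y):
    """Pairwise similarity between rows of X and columns of Y, one pair at a time."""
    n, m = X.shape[0], Y.shape[1]
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            x, y = X[i, :], Y[:, j]
            dot = x @ y
            # Assumed Tanimoto-style form: x.y / sqrt(|x|^2 + |y|^2 - |x.y|)
            out[i, j] = dot / np.sqrt(x @ x + y @ y - abs(dot))
    return out

def tanimoto_matrix(X, Y):
    """Same quantity from a single matrix product plus broadcasting."""
    dot = X @ Y                                    # all pairwise dot products at once
    x_sq = np.sum(X ** 2, axis=1, keepdims=True)   # shape (n, 1): squared row norms
    y_sq = np.sum(Y ** 2, axis=0, keepdims=True)   # shape (1, m): squared column norms
    return dot / np.sqrt(x_sq + y_sq - np.abs(dot))

# Hypothetical data: 5 regulators x 8 shared features, 8 features x 6 targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Y = rng.normal(size=(8, 6))
assert np.allclose(tanimoto_loop(X, Y), tanimoto_matrix(X, Y))
```

Both functions compute the same values; the matrix form replaces the nested loops with one product and two reductions, which is the kind of restructuring that tends to pay off more as the matrices grow.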

    Estimating sample-specific regulatory networks

    Biological systems are driven by intricate interactions among the complex array of molecules that comprise the cell. Many methods have been developed to reconstruct network models of those interactions. These methods often draw on large numbers of samples with measured gene expression profiles to infer connections between genes (or gene products). The result is an aggregate network model representing a single estimate for the likelihood of each interaction, or "edge," in the network. While informative, aggregate models fail to capture the heterogeneity that is represented in any population. Here we propose a method to reverse engineer sample-specific networks from aggregate network models. We demonstrate the accuracy and applicability of our approach in several data sets, including simulated data, microarray expression data from synchronized yeast cells, and RNA-seq data collected from human lymphoblastoid cell lines. We show that these sample-specific networks can be used to study changes in network topology across time and to characterize shifts in gene regulation that may not be apparent in expression data. We believe the ability to generate sample-specific networks will greatly facilitate the application of network methods to the increasingly large, complex, and heterogeneous multi-omic data sets that are currently being generated, and will ultimately support the emerging field of precision network medicine.
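The abstract does not give the estimation formula, so the following is only one possible reading of "sample-specific networks from aggregate network models". It assumes a leave-one-out scheme in which the network for sample q is N * (net_all - net_without_q) + net_without_q, and it uses Pearson correlation as a stand-in aggregate method. The function names and the random data are hypothetical.

```python
import numpy as np

def aggregate_network(expr):
    """Stand-in aggregate method: gene-by-gene Pearson correlation.
    expr has shape (genes, samples)."""
    return np.corrcoef(expr)

def sample_specific_networks(expr):
    """Return an array of shape (samples, genes, genes), one network per sample.
    Assumed scheme: compare the aggregate network built on all samples with
    the one built after leaving sample q out, then scale the difference."""
    n_genes, n_samples = expr.shape
    full = aggregate_network(expr)
    nets = np.empty((n_samples, n_genes, n_genes))
    for q in range(n_samples):
        without_q = aggregate_network(np.delete(expr, q, axis=1))
        nets[q] = n_samples * (full - without_q) + without_q
    return nets

# Hypothetical expression matrix: 4 genes measured in 10 samples.
rng = np.random.default_rng(1)
expr = rng.normal(size=(4, 10))
nets = sample_specific_networks(expr)
print(nets.shape)
```

Each slice `nets[q]` is an edge-weight matrix for a single sample, so changes across time or between groups can be examined edge by edge rather than only in the aggregate.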

    ๋ง์ดˆํ˜ˆ์•ก ๋‹จํ•ต๊ตฌ์—์„œ ์œ ์ „์ž ๋ฐœํ˜„์„ ํ†ตํ•œ ์ฒœ์‹ ๋ณ‘ํƒœ์ƒ๋ฆฌ์˜ ์ดํ•ด

    Doctoral thesis, Seoul National University Graduate School, College of Medicine, Department of Medicine, 2021. 2. 박흥우.

    Abstract (translated from Korean): Corticosteroids are an important drug in the treatment of asthma. However, the therapeutic effect of corticosteroids is known to vary considerably from patient to patient. In particular, low responsiveness to corticosteroids may be associated with severe asthma or frequent acute exacerbations. Although many genomic studies have been carried out, the pathophysiology of asthma associated with steroid hyporesponsiveness has not yet been studied sufficiently. Analyzing changes in gene expression patterns following in vitro dexamethasone treatment may therefore help in studying the mechanisms of acute asthma exacerbation associated with steroid hyporesponsiveness. This study sought to explore the pathophysiological mechanisms and biological pathways associated with steroid hyporesponsiveness by analyzing gene expression patterns in peripheral blood mononuclear cells from asthma patients and their changes following in vitro dexamethasone treatment. The study was conducted in two parts. The first part used Weighted Gene Co-expression Network Analysis (WGCNA) to explore whether a gene module associated with acute exacerbation is observed in common in childhood and adult asthma patients, and how gene expression in that module changes following in vitro dexamethasone treatment. Gene expression patterns were analyzed in immortalized lymphoblastoid cell lines from 107 childhood asthma patients and in peripheral blood mononuclear cells from 29 adult asthma patients. Acute asthma exacerbation was defined as taking systemic steroids for 3 or more days, or an emergency visit or hospitalization due to asthma.

    A gene module associated with acute exacerbation, composed of a total of 77 genes, was found in common in the childhood and adult asthma groups. Expression of EIF2AK2 and NOL11 in this module decreased significantly upon dexamethasone treatment in both the childhood and adult asthma groups. For 64 genes in the module, expression did not change significantly upon dexamethasone treatment; these genes were associated with, among others, the protein repair pathway. Among the genes related to the protein repair pathway, MSRA and MSRB2 are known to play an important role in regulating oxidative stress. The second part of the study used gene regulatory networks to explore factors affecting responsiveness to inhaled steroids in adult asthma patients. The top five transcription factors showing differential transcription factor expression upon in vitro dexamethasone treatment, between patients who responded to inhaled steroids and those who did not, were GATA1, JUN, NFKB1, SPI1, and RELA. These transcription factors were connected to different genes, across various biological pathways, in the responder and non-responder groups. In the group that responded well to inhaled steroids, TBX4 was connected to the transcription factor NFKB1, which is associated with the inflammatory response.

    The novel genes and biological pathways identified in this study helped in understanding the genetic traits associated with steroid hyporesponsiveness, and may contribute to the development of new therapeutics or biomarkers based on the diverse pathophysiology of asthma.

    Abstract (English): Asthma is a chronic inflammatory airway disease characterized by bronchial hyperresponsiveness and reversible airway obstruction. Corticosteroids are known to be the most effective treatment for asthma. However, there is substantial variability in response to corticosteroids in asthma patients. Ineffective response to corticosteroids may result in exacerbation of asthma. Although many genetic studies have been conducted, the mechanisms of asthma pathogenesis and steroid insensitivity in asthma have not been fully elucidated. A gene expression profile represents the complete set of RNA transcripts that are produced by the genome under specific circumstances or in a specific cell. High-throughput methods such as microarrays, together with recent advances in network-based biostatistics, provide a quick and effective way of identifying novel genes and pathways related to asthma. This study aimed to understand the pathogenesis and steroid insensitivity in asthma using gene expression profiles of blood cells from asthma patients. To obtain a comprehensive picture of the gene expression in these cells, we used network-based approaches. The study was divided into two separate parts. In the first part of the study, important genetic signatures of acute exacerbation (AE) in asthma were identified using weighted gene co-expression network analysis (WGCNA) in peripheral blood mononuclear cells (PBMCs) from 29 adult asthma patients and lymphoblastoid cell lines (LCLs) from 107 childhood asthma patients.

    An AE-associated gene module composed of 77 genes was identified from childhood asthma patients, and the conservation of this gene module structure was validated in adult asthma patients. The identified module was found to be conserved in terms of the gene expression profile and associated with AE in both childhood and adult asthma patients, and thus it was defined as an AE-associated common gene module. Changes in the expression of genes in the AE-associated common gene module following in vitro dexamethasone (Dex) treatment were examined to better understand the mechanisms associated with steroid insensitivity. The differential gene expression profiles were classified into two classes according to Dex-induced changes in childhood asthma patients. Thirteen genes showed significant Dex-induced differential expression and were categorized as the A gene set. Sixty-four genes that were not significantly altered by Dex were categorized as the B gene set. In the A gene set, the expression of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) showed significant Dex-induced differential expression in adult asthma patients as well. In addition, the basal (pre-Dex) expression of EIF2AK2 was significantly higher in asthma patients with AE than in those without AE in both childhood and adult asthma. In the B gene set, based on a pathway-based approach, the protein repair pathway was found to be significantly enriched. Among the genes that belong to this pathway, the basal expression of methionine sulfoxide reductase A (MSRA) and methionine sulfoxide reductase B2 (MSRB2) was significantly lower in asthma patients with AE than in those without AE in both childhood and adult asthma. These findings suggest that alternate treatment options, apart from corticosteroids, may be needed to prevent AE in asthma. Expression of EIF2AK2, MSRA, and MSRB2 in blood cells may help us to identify AE-susceptible asthma patients and adjust treatments to prevent AE events.

    In the second part of the study, gene regulatory networks identified from gene expression profiles of PBMCs from 23 adult asthma patients were assessed to elucidate the differences in responsiveness to inhaled corticosteroids (ICSs). Among these, the top five transcription factors (TFs; top-5 TFs: GATA1, JUN, NFκB1, SPI1, and RELA) showing differential connections between good responders (GRs) and poor responders (PRs) were identified. Interestingly, GATA1 and JUN also showed differential connections in the gene regulatory networks identified from gene expression profiles of LCLs from 107 childhood asthma patients in a previous study. The top-5 TFs and their connected genes were significantly enriched in distinct biological pathways associated with asthma. Among the genes connected to the top-5 TFs, the expression of TBX4, which is regulated by the TF NFκB1, may be helpful in identifying GRs to ICS treatment. In conclusion, the novel genes and biological pathways identified in this study may deepen our understanding of asthma pathophysiology and steroid insensitivity in asthma.
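For readers unfamiliar with the weighted gene co-expression network analysis mentioned in the first part of this thesis, the core construction can be sketched briefly. This is a minimal illustration of the standard unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, and not the thesis's actual pipeline. The value beta = 6, the zeroed diagonal, the function names, and the toy data are assumptions chosen for the example.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency from an expression matrix (genes x samples).
    Raising |correlation| to the power beta suppresses weak correlations
    without cutting them off at a hard threshold."""
    cor = np.corrcoef(expr)
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)  # illustrative choice: ignore self-connections
    return adj

def connectivity(adj):
    """Per-gene connectivity: the sum of a gene's edge weights."""
    return adj.sum(axis=1)

# Toy data: genes 0 and 1 are perfectly correlated, gene 2 is not.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, -1.0, 1.0, -1.0],
])
adj = soft_threshold_adjacency(expr)
print(np.round(adj, 3))
print(np.round(connectivity(adj), 3))
```

Modules such as the 77-gene module described above are then typically found by clustering genes on a dissimilarity derived from this adjacency; that step is omitted here.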

    Network Medicine in the Age of Biomedical Big Data

    Network medicine is an emerging area of research dealing with molecular and genetic interactions, network biomarkers of disease, and therapeutic target discovery. Large-scale biomedical data generation offers a unique opportunity to assess the effect and impact of cellular heterogeneity and environmental perturbations on the observed phenotype. Marrying the two, network medicine with biomedical data provides a framework to build meaningful models and extract impactful results at a network level. In this review, we survey existing network types and biomedical data sources. More importantly, we delve into ways in which the network medicine approach, aided by phenotype-specific biomedical data, can be gainfully applied. We provide three paradigms, mainly dealing with three major biological network archetypes: protein-protein interaction, expression-based, and gene regulatory networks. For each of these paradigms, we give a broad overview of the philosophies under which various network methods work. We also provide a few examples in each paradigm as test cases of its successful application. Finally, we delineate several opportunities and challenges in the field of network medicine. We hope this review provides a lexicon that helps researchers from the biological sciences and network theory get on the same page when working on research areas that require interdisciplinary expertise. Taken together, the understanding gained from combining biomedical data with networks can be useful for characterizing disease etiologies and identifying therapeutic targets, which, in turn, will lead to better preventive medicine with translational impact on personalized healthcare.