8,875 research outputs found

    Supporting cognition in systems biology analysis: findings on users' processes and design implications

    Background: Current usability studies of bioinformatics tools suggest that tools for exploratory analysis support some tasks related to finding relationships of interest but not the deep causal insights necessary for formulating plausible and credible hypotheses. To better understand design requirements for gaining these causal insights in systems biology analyses, a longitudinal field study of 15 biomedical researchers was conducted. Researchers interacted with the same protein-protein interaction tools to discover possible disease mechanisms for further experimentation. Results: Findings reveal patterns in scientists' exploratory and explanatory analysis and show that tools positively supported a number of well-structured query and analysis tasks. But for several of scientists' more complex, higher order ways of knowing and reasoning, the tools did not offer adequate support. Results show that for a better fit with scientists' cognition for exploratory analysis, systems biology tools need to better match scientists' processes for validating, for making a transition from classification to model-based reasoning, and for engaging in causal mental modelling. Conclusion: As the next great frontier in bioinformatics usability, tool designs for exploratory systems biology analysis need to move beyond the successes already achieved in supporting formulaic query and analysis tasks and now reduce current mismatches with several of scientists' higher order analytical practices. The implications of results for tool designs are discussed. http://deepblue.lib.umich.edu/bitstream/2027.42/134554/1/13009_2008_Article_29.pd

    The genome of Romanomermis culicivorax: revealing fundamental changes in the core developmental genetic toolkit in Nematoda

    Background: The genetics of development in the nematode Caenorhabditis elegans has been described in exquisite detail. The phylum Nematoda has two classes: Chromadorea (which includes C. elegans) and the Enoplea. While the development of many chromadorean species resembles closely that of C. elegans, enoplean nematodes show markedly different patterns of early cell division and cell fate assignment. Embryogenesis of the enoplean Romanomermis culicivorax has been studied in detail, but the genetic circuitry underpinning development in this species has not been explored. Results: We generated a draft genome for R. culicivorax and compared its gene content with that of C. elegans, a second enoplean, the vertebrate parasite Trichinella spiralis, and a representative arthropod, Tribolium castaneum. This comparison revealed that R. culicivorax has retained components of the conserved ecdysozoan developmental gene toolkit lost in C. elegans. T. spiralis has independently lost even more of this toolkit than has C. elegans. However, the C. elegans toolkit is not simply depauperate, as many novel genes essential for embryogenesis in C. elegans are not found in, or have only extremely divergent homologues in R. culicivorax and T. spiralis. Our data imply fundamental differences in the genetic programmes not only for early cell specification but also others such as vulva formation and sex determination. Conclusions: Despite the apparent morphological conservatism, major differences in the molecular logic of development have evolved within the phylum Nematoda. R. culicivorax serves as a tractable system to contrast C. elegans and understand how divergent genomic and thus regulatory backgrounds nevertheless generate a conserved phenotype. The R. culicivorax draft genome will promote use of this species as a research model

    Scientists’ sense making when hypothesizing about disease mechanisms from expression data and their needs for visualization support

    A common class of biomedical analysis is to explore expression data from high throughput experiments for the purpose of uncovering functional relationships that can lead to a hypothesis about mechanisms of a disease. We call this analysis expression-driven, -omics hypothesizing. In it, scientists use interactive data visualizations and read deeply in the research literature. Little is known, however, about the actual flow of reasoning and behaviors (sense making) that scientists enact in this analysis, end-to-end. Understanding this flow is important because if bioinformatics tools are to be truly useful they must support it. Sense making models of visual analytics in other domains have been developed and used to inform the design of useful and usable tools. We believe they would be helpful in bioinformatics. To characterize the sense making involved in expression-driven, -omics hypothesizing, we conducted an in-depth observational study of one scientist as she engaged in this analysis over six months. From findings, we abstracted a preliminary sense making model. Here we describe its stages and suggest guidelines for developing visualization tools that we derived from this case. A single case cannot be generalized. But we offer our findings, sense making model and case-based tool guidelines as a first step toward increasing interest and further research in the bioinformatics field on scientists’ analytical workflows and their implications for tool design. http://deepblue.lib.umich.edu/bitstream/2027.42/109495/1/12859_2012_Article_6377.pd

    A cognitive task analysis of a visual analytic workflow: Exploring molecular interaction networks in systems biology

    Background: Bioinformatics visualization tools are often not robust enough to support biomedical specialists’ complex exploratory analyses. Tools need to accommodate the workflows that scientists actually perform for specific translational research questions. To understand and model one of these workflows, we conducted a case-based, cognitive task analysis of a biomedical specialist’s exploratory workflow for the question: What functional interactions among gene products of high throughput expression data suggest previously unknown mechanisms of a disease? Results: From our cognitive task analysis four complementary representations of the targeted workflow were developed. They include: usage scenarios, flow diagrams, a cognitive task taxonomy, and a mapping between cognitive tasks and user-centered visualization requirements. The representations capture the flows of cognitive tasks that led a biomedical specialist to inferences critical to hypothesizing. We created representations at levels of detail that could strategically guide visualization development, and we confirmed this by making a trial prototype based on user requirements for a small portion of the workflow. Conclusions: Our results imply that visualizations should make available to scientific users “bundles of features” consonant with the compositional cognitive tasks purposefully enacted at specific points in the workflow. We also highlight certain aspects of visualizations that: (a) need more built-in flexibility; (b) are critical for negotiating meaning; and (c) are necessary for essential metacognitive support.

    Informatics methods for exploring the transcriptome space from genes to pathways to phenotypes using biological networks

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Natural Sciences, Interdisciplinary Program in Bioinformatics, August 2019. Sun Kim. Transcriptome data, the genome-wide measurement of transcripts, have been used to significantly increase our understanding of biological processes at the transcription level. Analysis of transcriptome data involves a series of steps, from identification of differentially expressed genes (DEGs) to pathway enrichment analysis to association with phenotypes. Each step has several hurdles that need to be addressed with state-of-the-art bioinformatics techniques. For example, the complex nature of living organisms can be represented as a network where the nodes are the interacting entities, such as genes or pathways, and the edges are the interactions between the nodes. Network analysis is crucial in that it can reveal hidden associations between transcriptome data and phenotypes. In addition, network propagation has emerged as a technique to measure the influence of nodes in a network. Network propagation has demonstrated its utility in biological contexts in many studies and has contributed to valuable discoveries in the biological and medical sciences. In my doctoral study, I explored and analyzed the transcriptome at various levels using machine learning, network information and network propagation techniques. My thesis consists of three studies. The first study was to develop an accurate and stable method for determining differentially expressed genes using machine learning techniques. The second study was to develop a novel method to investigate interactions among biological pathways using explicit gene expression information from RNA-seq. The last study was to analyze xenotransplant transcriptome data using various methods, including the network propagation technique. In the first study, MLDEG, a machine learning approach to identify DEGs using network property and network propagation, was developed.
Currently available DEG detection methods have been widely used and have contributed to new biological discoveries. Most of the methods use their own models to define DEGs. However, because the traits of transcriptome data vary significantly depending on the experimental design and sequencing technology, a single model can hardly fit all transcriptome data with different traits. In addition, setting cutoff values for p-values and fold change is arbitrary. Thus, the results yielded by the methods are often inconsistent and heterogeneous. MLDEG addresses these issues by building a model that uses network information and network propagation results as features. The goal of MLDEG is to train a model on network-based features extracted from the genes most likely to be true and false DEGs, and to use the model to classify the genes that cannot be clearly defined as DEGs by existing methods. Tested on 10 high-throughput RNA-seq datasets, MLDEG performed better than the competing methods. In the second study, I developed a Pathway INTeraction network construction method (PINTnet) that can construct a condition-specific pathway interaction network by computing shortest paths on protein-protein interaction (PPI) networks. Because pathways usually function in a coordinated and cooperative fashion, understanding interactions, or crosstalk, between pathways becomes as important as identifying a single perturbed pathway. However, existing methods do not take topological features into account, treating each pathway just as a set of genes. To solve the problem, PINTnet computes shortest paths on PPI networks mapped to each pair of pathways and creates subnetworks using the shortest paths. It then measures the activation status of pathway interaction using the product of closeness centrality and explicit gene expression quantity.
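
The PINTnet steps described above (shortest paths between two pathways on a PPI network, a subnetwork built from those paths, then closeness centrality multiplied by expression) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the unweighted breadth-first search, the function names, the toy network, the expression values, and the choice to sum the per-gene products into one score are all assumptions made for illustration.

```python
from collections import deque
from itertools import product


def bfs_tree(graph, source):
    """Breadth-first search: hop distance and parent for every reachable node."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                parent[nbr] = node
                queue.append(nbr)
    return dist, parent


def shortest_path(graph, source, target):
    """Return one shortest path as a list of nodes, or None if unreachable."""
    _, parent = bfs_tree(graph, source)
    if target not in parent:
        return None
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def pathway_interaction(graph, pathway_a, pathway_b, expression):
    """Score the interaction between two pathways (hypothetical aggregation)."""
    # 1. Collect the genes lying on shortest paths between the two pathways.
    nodes = set()
    for a, b in product(sorted(pathway_a), sorted(pathway_b)):
        path = shortest_path(graph, a, b)
        if path is not None:
            nodes.update(path)
    # 2. Induce the subnetwork on those genes.
    sub = {n: [m for m in graph[n] if m in nodes] for n in nodes}
    # 3. Sum closeness centrality times expression over the subnetwork.
    score = 0.0
    for n in nodes:
        dist, _ = bfs_tree(sub, n)
        total = sum(dist.values())
        closeness = (len(dist) - 1) / total if total else 0.0
        score += closeness * expression.get(n, 0.0)
    return nodes, score


# Toy PPI network (adjacency lists) and expression values, all hypothetical.
ppi = {
    "A1": ["X"], "A2": ["X"], "X": ["A1", "A2", "B1"],
    "B1": ["X", "B2"], "B2": ["B1"], "Z": [],
}
expr = {"A1": 3.0, "A2": 1.0, "X": 2.0, "B1": 4.0, "B2": 0.5}
genes, score = pathway_interaction(ppi, {"A1", "A2"}, {"B1", "B2"}, expr)
print(sorted(genes), round(score, 3))
```

In this toy network the bridging gene `X` gets the highest closeness, so its expression contributes most per unit to the score. A weighted PPI network would need Dijkstra's algorithm in place of the breadth-first search.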
The performance of PINTnet was evaluated using three high-throughput RNA-seq datasets, and it successfully reproduced the findings reported in the original papers for those data. In the last study, I participated in a xenotransplantation study to elucidate the cause of chronic-phase islet graft loss. Clinical islet transplantation is one of the promising options for type 1 diabetes, but the long-term outcome of graft function is not yet satisfactory. To reveal the mechanism of graft loss in the chronic phase, I carried out pathway interaction network analysis using PINTnet on time-series RNA-seq data from rhesus monkeys transplanted with porcine islets and identified the activation of the T cell receptor signaling pathway. The analysis results were supported by a liver biopsy showing that CD3+ T cells had heavily infiltrated the porcine islets. Additionally, I carried out gene prioritization using network propagation to verify five graft loss-relevant scenarios. The result suggested that T cell-mediated long-term graft loss was the most probable scenario. In summary, my doctoral study used network information, network property, and network propagation to identify DEGs and predict pathway interactions. In addition, I participated in a xenotransplantation study and carried out pathway interaction network analysis and network propagation to reveal the possible cause of chronic-phase islet graft loss. Utilizing network information and network propagation was very effective for discovering the relationships among biological entities and analyzing complex phenotypes.

Contents:
Abstract
Chapter 1. Introduction
  1.1 Background
    1.1.1 An introduction to network theory and its application to the fields of biology
    1.1.2 An introduction to machine learning
  1.2 Three problems in my doctoral study
    1.2.1 Problem 1: DEG detection
    1.2.2 Problem 2: Pathway interaction analysis
    1.2.3 Problem 3: Analysis of transcriptome from pig-to-nonhuman primate islet xenotransplantation
  1.3 My network-based approaches to three research problems
  1.4 Outline of thesis
Chapter 2. A machine learning approach to identify differentially expressed genes using network property and network propagation
  2.1 Background of differential expression analysis methods
    2.1.1 Motivation
    2.1.2 My machine learning approach
  2.2 Methods
    2.2.1 Training and Test Data
    2.2.2 Features
    2.2.3 Network Property
    2.2.4 Network Propagation
    2.2.5 Machine Learning Algorithm
  2.3 Results and Discussion
    2.3.1 Experimental Data Description
    2.3.2 Performance of Network Information Features
    2.3.3 Performance Evaluation and Discussion
  2.4 Conclusion
Chapter 3. Construction of condition-specific pathway interaction network by computing shortest paths on weighted PPI
  3.1 Background of pathway interaction network construction
    3.1.1 The importance of finding perturbed interaction between pathways
    3.1.2 Challenges in pathway interaction network construction
  3.2 Methods
    3.2.1 Preparation of PPI and pathway information
    3.2.2 Defining edges in the pathway network
  3.3 Results
    3.3.1 Data description
    3.3.2 Evaluation criteria
    3.3.3 Performance comparison to other methods
  3.4 Discussion
  3.5 Conclusion
Chapter 4. Bioinformatics analyses with peripheral blood RNA-sequencing unveiled the cause of the graft loss after pig-to-nonhuman primate islet xenotransplantation model
  4.1 Background
  4.2 Results
    4.2.1 Peripheral blood RNA sequencing
    4.2.2 Graft loss period-related activated pathways (GLPAPs) defined by TRAP (Time-series RNA-seq analysis package)
    4.2.3 Pathway interaction network analysis
    4.2.4 Hypothesis evaluation using network propagation
  4.3 Discussion
Chapter 5. Conclusion
Abstract (in Korean)
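
The thesis abstract above uses network propagation to measure node influence and to prioritize genes, but does not say which formulation it uses. As a hedged illustration, the sketch below implements one common variant, random walk with restart, in plain Python. The function name, the restart probability, and the three-node toy network are assumptions, not details taken from the thesis.

```python
def propagate(graph, seeds, restart=0.5, tol=1e-12, max_iter=1000):
    """Random walk with restart on an undirected graph given as adjacency lists.

    Iterates p <- (1 - restart) * W p + restart * p0, where W spreads each
    node's score evenly over its neighbours and p0 is uniform over the seeds.
    """
    nodes = list(graph)
    p0 = {n: 0.0 for n in nodes}
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            nbrs = graph[n]
            walk_mass = (1.0 - restart) * p[n]
            if not nbrs:
                nxt[n] += walk_mass  # isolated node keeps its own mass
                continue
            for m in nbrs:
                nxt[m] += walk_mass / len(nbrs)
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain a - b - c, with gene "a" as the only seed.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = propagate(chain, seeds=["a"])
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

Genes closer to the seed end up with higher scores, which is the property a prioritization step relies on. A larger restart probability keeps more of the score near the seeds.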

    Pathway Bridge Based Multiobjective Optimization Approach for Lurking Pathway Prediction

    Ovarian carcinoma immunoreactive antigen-like protein 2 (OCIAD2) is a protein with unknown function. OCIAD2 has been observed to be frequently methylated or downregulated in several kinds of tumors, and TGFβ signaling has been shown to induce the expression of OCIAD2. However, current pathway analysis tools do not cover genes without reported interactions, such as OCIAD2, and also miss some significant genes with relatively low expression. To investigate the potential biological milieu of OCIAD2, especially in the cancer microenvironment, a novel approach, pbMOO, was created to find potential pathways from TGFβ to OCIAD2 by searching on the pathway bridge, which consists of cancer-enriched looping patterns from the complicated entire protein interaction network. The pbMOO approach was further applied to study the modulator of ligand TGFβ1, receptor TGFβR1, intermediate transfer proteins, transcription factor, and signature OCIAD2. Verified by literature and public databases, the pathway TGFβ1-TGFβR1-SMAD2/3-SMAD4/AR-OCIAD2 was detected, which included the androgen receptor (AR), a possible transcription factor of OCIAD2 in TGFβ signaling. This pathway explained well the mechanism of TGFβ-induced OCIAD2 expression in the cancer microenvironment, thereby providing an important clue for future functional analysis of OCIAD2 in tumor pathogenesis.

    Identification of 64 Novel Genetic Loci Provides an Expanded View on the Genetic Architecture of Coronary Artery Disease

    Rationale: Coronary artery disease (CAD) is a complex phenotype driven by genetic and environmental factors. Ninety-seven genetic risk loci have been identified to date, but the identification of additional susceptibility loci might be important to enhance our understanding of the genetic architecture of CAD. Objective: To expand the number of genome-wide significant loci, catalog functional insights, and enhance our understanding of the genetic architecture of CAD. Methods and Results: We performed a genome-wide association study in 34541 CAD cases and 261984 controls of UK Biobank resource followed by replication in 88192 cases and 162544 controls from CARDIoGRAMplusC4D. We identified 75 loci that replicated and were genome-wide significant (P Conclusions: We identified 64 novel genetic risk loci for CAD and performed fine mapping of all 161 risk loci to obtain a credible set of causal variants. The large expansion of reconstituted gene sets argues in favor of an expanded omnigenic model view on the genetic architecture of CAD