
    A module based approach for identifying driver genes and expanding pathways from integrated biological networks

    Each gene or protein has its own function which, when combined with others, allows the group to perform more complex behaviors, e.g. carry out a particular cellular task (functional module) or affect a particular disease phenotype (disease module). One of the major challenges in systems biology is to reveal the roles of genes or proteins in functional modules or disease modules. In the first part of the dissertation, I present a data-driven method, Correlation Set Analysis (CSA), for comprehensively detecting active regulators in disease populations by integrating co-expression analysis and specific types of literature-derived causal relationships. Instead of investigating the co-expression level between regulators and their targets, I focus on the coherence of the regulatees of a regulator, e.g. the downstream targets of a transcription factor. Using simulated datasets, I show that my method can reach high true positive and true negative rates (>80%) even when the regulatory relationships are weak (only 20% of regulatees are co-expressed). Using three separate real biological datasets, I was able to recover well-known and as-yet undescribed active regulators for each disease population. In the second part of the dissertation, I develop and apply a new computational algorithm for detecting modules of functionally related genes that are likely to drive malignant transformation. The algorithm takes as input the identity and locations of a small number of known oncogenes (a seed set) on a human genome functional linkage network (FLN). It then searches for a boundary surrounding a gene set encompassing the seed, such that the magnitude of the difference in linkage weights between interior-interior gene pairs and interior-exterior gene pairs is maximized. Starting with small seed sets for breast and ovarian cancer, I successfully identify known and novel drivers in both cancer types.
In the third part of the dissertation, I propose a module-based approach for expanding manually curated functional modules. I use the KEGG pathway database as an example, and the results show that my approach can successfully suggest both validated pathway members (genes that are assigned to a particular pathway by other manually curated pathway databases) and novel candidate pathway genes.
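The boundary search described for the second part of this dissertation can be illustrated with a small sketch. It assumes the score is the mean linkage weight over interior-interior pairs minus the mean over interior-exterior pairs, and it grows the module greedily from the seed. Both choices, the function names `module_score` and `expand_module`, and the toy weights are illustrative assumptions, not the dissertation's actual algorithm or data.

```python
from itertools import combinations


def module_score(weights, module, all_genes):
    """Mean interior-interior weight minus mean interior-exterior weight."""
    module = set(module)
    exterior = set(all_genes) - module

    def w(a, b):
        # Missing pairs are treated as weight 0 (an assumption of this sketch).
        return weights.get(frozenset((a, b)), 0.0)

    inner = [w(a, b) for a, b in combinations(sorted(module), 2)]
    cross = [w(a, b) for a in module for b in exterior]
    mean_inner = sum(inner) / len(inner) if inner else 0.0
    mean_cross = sum(cross) / len(cross) if cross else 0.0
    return mean_inner - mean_cross


def expand_module(weights, seed, all_genes):
    """Greedily add the gene that most improves the score; stop when none does."""
    module = set(seed)
    best = module_score(weights, module, all_genes)
    while True:
        candidates = sorted(set(all_genes) - module)
        scored = [(module_score(weights, module | {g}, all_genes), g) for g in candidates]
        if not scored:
            break
        top_score, top_gene = max(scored)
        if top_score <= best:
            break
        module.add(top_gene)
        best = top_score
    return module, best


# Toy functional linkage network with made-up weights (hypothetical genes A-E).
genes = ["A", "B", "C", "D", "E"]
fln = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.85,
    frozenset(("C", "D")): 0.1,
    frozenset(("D", "E")): 0.7,
}
module, score = expand_module(fln, seed={"A", "B"}, all_genes=genes)
print(sorted(module), round(score, 3))
```

In this toy network the seed {A, B} absorbs C, which is strongly linked to both seed genes, and then stops because adding D or E would lower the score. A greedy search is only one of many ways to look for such a boundary and can get stuck in local optima.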

    Role of network topology based methods in discovering novel gene-phenotype associations

    The cell is governed by the complex interactions among various types of biomolecules. Coupled with environmental factors, variations in DNA can cause alterations in normal gene function and lead to a disease condition. Often, such disease phenotypes involve coordinated dysregulation of multiple genes that implicate inter-connected pathways. Towards a better understanding and characterization of mechanisms underlying human diseases, here, I present GUILD, a network-based disease-gene prioritization framework. GUILD associates genes with diseases using the global topology of the protein-protein interaction network and an initial set of genes known to be implicated in the disease. Furthermore, I investigate the mechanistic relationships between disease-genes and explain the robustness emerging from these relationships. I also introduce GUILDify, an online and user-friendly tool which prioritizes genes for their association to any user-provided phenotype. Finally, I describe current state-of-the-art systems-biology approaches where network modeling has helped extend our view on complex diseases such as cancer.

    Computational Labeling, Partitioning, and Balancing of Molecular Networks

    Recent advances in high throughput techniques enable large-scale molecular quantification with high accuracy, including mRNAs, proteins and metabolites. Differential expression of these molecules in case and control samples provides a way to select phenotype-associated molecules with statistically significant changes. However, given the significance ranking list of molecular changes, how those molecules work together to drive phenotype formation is still unclear. In particular, the changes in molecular quantities are insufficient to interpret the changes in their functional behavior. My study is aimed at answering this question by integrating molecular network data to systematically model and estimate the changes of molecular functional behaviors. We build three computational models to label, partition, and balance molecular networks using modern machine learning techniques. (1) Due to the incompleteness of protein functional annotation, we develop AptRank, an adaptive PageRank model for protein function prediction on bilayer networks. By integrating Gene Ontology (GO) hierarchy with protein-protein interaction network, our AptRank outperforms four state-of-the-art methods in a comprehensive evaluation using benchmark datasets. (2) We next extend our AptRank into a network partitioning method, BioSweeper, to identify functional network modules in which molecules share similar functions and also densely connect to each other. Compared to traditional network partitioning methods using only network connections, BioSweeper, which integrates the GO hierarchy, can automatically identify functionally enriched network modules. (3) Finally, we conduct a differential interaction analysis, namely difFBA, on protein-protein interaction networks by simulating protein fluxes using flux balance analysis (FBA). 
We test difFBA using quantitative proteomic data from colon cancer, and demonstrate that difFBA offers more insights into functional changes in molecular behavior than do protein quantity changes alone. We conclude that our integrative network model increases the observational dimensions of complex biological systems, and enables us to more deeply understand the causal relationships between genotypes and phenotypes.
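The abstract builds on PageRank for label propagation over a protein network. As a rough illustration of that underlying idea only, the sketch below runs plain personalized PageRank by power iteration, restarting at proteins already annotated with a function. It is not AptRank (which is adaptive and works on a bilayer network with the GO hierarchy); the function name, the damping value, and the toy path graph are assumptions made for this example.

```python
def personalized_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Power iteration for PageRank with restarts concentrated on `seeds`.

    adj:   dict mapping every node to a list of its neighbours
    seeds: set of nodes that receive the restart probability
    """
    nodes = sorted(adj)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            nbrs = adj[n]
            if not nbrs:
                # Dangling node: send its mass back to the restart distribution.
                for m in nodes:
                    nxt[m] += alpha * rank[n] * restart[m]
                continue
            share = alpha * rank[n] / len(nbrs)
            for m in nbrs:
                nxt[m] += share
        rank = nxt
    return rank


# Toy protein-protein interaction path P1 - P2 - P3 - P4 - P5 (hypothetical).
ppi = {
    "P1": ["P2"],
    "P2": ["P1", "P3"],
    "P3": ["P2", "P4"],
    "P4": ["P3", "P5"],
    "P5": ["P4"],
}
# Suppose only P1 is annotated with the function of interest.
scores = personalized_pagerank(ppi, seeds={"P1"})
for protein in sorted(scores, key=scores.get, reverse=True):
    print(protein, round(scores[protein], 4))
```

Unannotated proteins that sit closer to the annotated seed in the network end up with higher scores, which is the intuition behind using random-walk scores as function predictions.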

    A bioinformatics potpourri

    © 2018 The Author(s). The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles covering an assortment of topics (algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics) are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.

    Modular architecture in biological networks

    Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, 2007. Includes bibliographical references (p. 201-207). In the past decade, biology has been revolutionized by an explosion in the availability of data. Translating this new wealth of information into meaningful biological insights and clinical breakthroughs will require a complete overhaul both in the questions being asked, and the methodologies used to answer them. One of the largest challenges in organizing and understanding the data coming from genome sequencing, microarray experiments, and other high-throughput measurements, will be the ability to find large-scale structure in biological systems. Ideally, this would lead to a simplified representation, wherein the thousands of genes in an organism can be viewed as a much smaller number of dynamic modules working in concert to accomplish cellular functions. Toward demonstrating the importance of higher-level, modular structure in biological systems, we have performed the following analyses: 1. Using computational techniques and pre-existing protein-protein interaction (PPI) data, we have developed general tools to find and validate modular structure. We have applied these approaches to the PPI networks of yeast, fly, worm, and human. 2. Utilizing a modular scaffold, we have generated predictions that attempt to explain existing system-wide experiments as well as predict the function of otherwise uncharacterized proteins. 3. Following the example of comparative genomics, we have aligned biological networks at the modular level to elucidate principles of how modules evolve. We show that conserved modular structure can further aid in functional annotation across the proteome. In addition to the detection and use of modular structure for computational analyses, experimental techniques must be adapted to support top-down strategies, and the targeting of entire modules with combinations of small molecules.
With this in mind, we have designed experimental strategies to find sets of small molecules capable of perturbing functional modules through a variety of distinct, but related, mechanisms. As a first test, we have looked for classes of small molecules targeting growth signaling through the phosphatidyl-inositol-3-kinase (PI3K) pathway. This provides a platform for developing new screening techniques in the setting of biology relevant to diabetes and cancer. In combination, these investigations provide an extensible computational approach to finding and utilizing modular structure in biological networks, and experimental approaches to bring them toward clinical endpoints. By Gopal Ramachandran. Ph.D.

    Systematic prediction of feedback regulatory network motifs

    Understanding the intricate wiring of cellular regulation remains a most formidable challenge. Fundamental insights into the wiring and functioning of the protein homeostasis network will help to better understand how protein homeostasis fails in diseases and how the regulatory patterns of the protein homeostasis network can be targeted for therapeutic intervention. The study aims at developing and applying a novel computational methodology for the systematic identification and characterization of feedback systems in protein homeostasis. The proposed research combines ideas and approaches from protein science, yeast systems biology, computational biology, and network biology. The difficulty of incorporating multi-platform multi-omics data is amplified by the large network of inter-connected genes, proteins and metabolites that come together to perform a specific function. For my master's thesis, I developed a path-based pattern finding (PBPF) algorithm, which searches and enumerates network motifs of a required topology. It is a graph-theory-based algorithm which combines depth-first traversal and breadth-first search to identify the required network sub-graph topologies. Further, the functioning of the algorithm has been demonstrated in the realm of protein homeostasis in Saccharomyces cerevisiae. A systematic approach to integrating systems biology data has been orchestrated, which shows the systematic identification of known regulatory feedback motifs in protein homeostasis. The approach claims the unique ability to identify novel, pervasive regulatory feedback motifs. The application of the algorithm can be extended to other biological systems, for example, to identify cell-state-specific feedback motifs in stem cells.
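To make the idea of enumerating feedback motifs concrete, the sketch below lists directed cycles of bounded length in a small regulatory graph using a plain depth-first search. It is not the PBPF algorithm from the thesis (whose details are not given in the abstract); the function name `find_feedback_loops`, the `max_len` parameter, and the toy edges between hypothetical nodes A to D are assumptions for illustration.

```python
def find_feedback_loops(edges, max_len=4):
    """Enumerate directed cycles with at most `max_len` nodes.

    edges: iterable of (regulator, target) pairs
    Each cycle is reported once, starting from its smallest node name.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    loops = []

    def dfs(start, node, path):
        for nxt in graph[node]:
            if nxt == start:
                # Closed a path back to the start: a feedback loop.
                loops.append(list(path))
            elif nxt > start and nxt not in path and len(path) < max_len:
                # Only visit nodes larger than `start` so that each cycle
                # is found exactly once, from its smallest member.
                dfs(start, nxt, path + [nxt])

    for start in sorted(graph):
        dfs(start, start, [start])
    return loops


# Toy regulatory network (hypothetical nodes, not real yeast interactions).
regulation = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "D")]
feedback_loops = find_feedback_loops(regulation)
print(feedback_loops)
```

Here the two-node loop A -> B -> A and the three-node loop A -> B -> C -> A are found, while D, which has no outgoing edges, takes part in no loop. Bounding the cycle length keeps the search tractable on large networks, at the cost of missing longer feedback paths.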

    Identifying aging-related genes in mouse hippocampus using gateway nodes

    BACKGROUND: High-throughput studies continue to produce volumes of metadata representing valuable sources of information to better guide biological research. With a stronger focus on data generation, analysis models that can readily identify actual signals have not received the same level of attention. This is due in part to high levels of noise and data heterogeneity, along with a lack of sophisticated algorithms for mining useful information. Networks have emerged as a powerful tool for modeling high-throughput data because they are capable of representing not only individual biological elements but also different types of relationships en masse. Moreover, well-established graph theoretic methodology can be applied to network models to increase efficiency and speed of analysis. In this project, we propose a network model that examines temporal data from mouse hippocampus at the transcriptional level via correlation of gene expression. Using this model, we formally define the concept of “gateway” nodes, loosely defined as nodes representing genes co-expressed in multiple states. We show that the proposed network model allows us to identify target genes implicated in hippocampal aging-related processes. RESULTS: By mining gateway genes related to hippocampal aging from networks made from gene expression in young and middle-aged mice, we provide a proof-of-concept of existence and importance of gateway nodes. Additionally, these results highlight how network analysis can act as a supplement to traditional statistical analysis of differentially expressed genes. Finally, we use the gateway nodes identified by our method as well as functional databases and literature to propose new targets for study of aging in the mouse hippocampus. CONCLUSIONS: This research highlights the need for methods of temporal comparison using network models and provides a systems biology approach to extract information from correlation networks of gene expression. 
Our results identify a number of genes previously implicated in the aging mouse hippocampus related to synaptic plasticity and apoptosis. Additionally, this model identifies a novel set of aging genes previously uncharacterized in the hippocampus. This research can be viewed as a first step toward identifying the processes behind comparative experiments in aging, and the approach is applicable to any type of temporal multi-state network.
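The loose definition of gateway nodes given above (genes co-expressed in multiple states) can be sketched as follows. This is an assumption-laden illustration, not the paper's formal definition: it builds one thresholded Pearson correlation network per state and flags genes that have at least one co-expression edge in both. The function names, the 0.9 threshold, and the expression values are all made up for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_network(expr, threshold=0.9):
    """Edges between genes whose absolute correlation meets the threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def gateway_candidates(net_a, net_b):
    """Genes with at least one co-expression edge in each of the two networks."""
    nodes_a = {gene for edge in net_a for gene in edge}
    nodes_b = {gene for edge in net_b for gene in edge}
    return nodes_a & nodes_b


# Made-up expression profiles (4 samples per state) for three hypothetical genes.
young = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 1, 3, 2]}
middle_aged = {"g1": [1, 2, 3, 4], "g2": [3, 1, 4, 2], "g3": [2, 3, 4, 5]}

young_net = correlation_network(young)
middle_net = correlation_network(middle_aged)
gateways = gateway_candidates(young_net, middle_net)
print(young_net, middle_net, gateways)
```

In this toy data g1 is co-expressed with g2 in the young state and with g3 in the middle-aged state, so it is the only gene flagged as linking the two networks.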