
    Tracking Cancer Genetic Evolution using OncoTrack

    It is difficult for existing methods to quantify and track the constant evolution of cancers because of the high heterogeneity of mutations. However, structural variations associated with nucleotide number changes show repeatable patterns in localized regions of the genome. Here we introduce SPKMG, which generalizes nucleotide-number-based properties of genes, in statistical terms, at the genome-wide scale. It is measured from the normalized amount of aligned NGS reads in exonic regions of a gene. SPKMG values are calculated within OncoTrack. Because SPKMG values are continuous numeric variables, they provide a statistical metric to track DNA-level changes. We show that SPKMG measures of cancer DNA show a normative pattern at the genome-wide scale. The analysis leads to the discovery of core cancer genes and also provides novel dynamic insights into the stage of cancer, including cancer development, progression, and metastasis. This technique will allow exome data to also be used for quantitative LOH/CNV analysis for tracking tumour progression and evolution with a higher efficiency. The final version of this article, as published in Scientific Reports, can be viewed online at: https://www.nature.com/articles/srep2964
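The abstract does not give the SPKMG formula, so the following is only a minimal sketch of one common way to normalize aligned read counts in exonic regions: an RPKM-style value (reads per kilobase of exon per million aligned reads). The function name, the choice of normalization, and the example numbers are assumptions for illustration, not the OncoTrack implementation.

```python
def normalized_exonic_reads(exonic_reads, exonic_length_bp, total_aligned_reads):
    """RPKM-style normalization (an assumed stand-in, not the published SPKMG formula).

    exonic_reads        -- reads aligned to the exons of one gene
    exonic_length_bp    -- summed exon length of that gene, in base pairs
    total_aligned_reads -- all aligned reads in the sample
    """
    if exonic_length_bp <= 0 or total_aligned_reads <= 0:
        raise ValueError("exon length and total read count must be positive")
    # Scale to "per kilobase of exon" (1e3) and "per million reads" (1e6).
    return exonic_reads * 1e9 / (exonic_length_bp * total_aligned_reads)


# Hypothetical counts for one gene in a tumour sample and a matched normal sample.
tumour_value = normalized_exonic_reads(500, 2000, 10_000_000)
normal_value = normalized_exonic_reads(1000, 2000, 10_000_000)

# A ratio well below 1 would hint at a copy-number loss under this toy model.
tumour_to_normal = tumour_value / normal_value
```

Because the result is a continuous value per gene, it can be compared across samples or time points, which is the property the abstract relies on for tracking DNA-level changes.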

    Mechanisms of mutational robustness in transcriptional regulation

    Robustness is the invariance of a phenotype in the face of environmental or genetic change. The phenotypes produced by transcriptional regulatory circuits are gene expression patterns that are to some extent robust to mutations. Here we review several causes of this robustness. They include robustness of individual transcription factor binding sites, homotypic clusters of such sites, redundant enhancers, transcription factors, redundant transcription factors, and the wiring of transcriptional regulatory circuits. Such robustness can either be an adaptation by itself, a byproduct of other adaptations, or the result of biophysical principles and non-adaptive forces of genome evolution. The potential consequences of such robustness include complex regulatory network topologies that arise through neutral evolution, as well as cryptic variation, i.e., genotypic divergence without phenotypic divergence. On the longest evolutionary timescales, the robustness of transcriptional regulation has helped shape life as we know it, by facilitating evolutionary innovations that helped organisms such as flowering plants and vertebrates diversify.

    Functional and Topological Properties in Hepatocellular Carcinoma Transcriptome

    Hepatocellular carcinoma (HCC) is a leading cause of global cancer mortality. However, little is known about the precise molecular mechanisms involved in tumor formation and pathogenesis. The primary goal of this study was to elucidate genome-wide molecular networks involved in the development of HCC with multiple etiologies by exploring high-quality microarray data. We undertook a comparative network analysis across 264 human microarray profiles monitoring transcript changes in healthy liver, liver cirrhosis, and HCC with viral and alcoholic etiologies. Gene co-expression profiling was used to derive a consensus gene relevance network of HCC progression that consisted of 798 genes and 2,012 links. The HCC interactome was further confirmed to be phenotype-specific and non-random. Additionally, we confirmed that co-expressed genes are more likely to share biological function, but not sub-cellular localization. Analysis of individual HCC genes revealed that they are topologically central in a human protein-protein interaction network. We used quantitative RT-PCR in a cohort of normal liver tissue (n = 8), hepatitis C virus (HCV)-induced chronic liver disease (n = 9), and HCC (n = 7) to validate co-expressions of several well-connected genes, namely ASPM, CDKN3, NEK2, RACGAP1, and TOP2A. We show that HCC is a heterogeneous disorder, underpinned by complex cross talk between immune response, cell cycle, and mRNA translation pathways. Our work provides a systems-wide resource for deeper understanding of molecular mechanisms in HCC progression and may be used further to define novel targets for efficient treatment or diagnosis of this disease.
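The abstract does not specify how the co-expression network was derived, so the following is a minimal sketch of one generic approach: link two genes when the absolute Pearson correlation of their expression profiles meets a threshold. The gene labels, expression values, and the 0.9 cutoff are hypothetical, and this is not the study's consensus procedure.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression values across four samples.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(profiles)
```

With these toy values only GENE_A and GENE_B are linked, since their profiles are proportional while GENE_C correlates weakly with both. A consensus network, as described in the abstract, would additionally require combining evidence across datasets, which this sketch does not attempt.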

    Darwin throws dice: modelling stochastic processes of molecular evolution

    The availability of protein and DNA sequences in the second half of the 20th century revolutionised evolutionary biology. For the first time, it was possible to quantify genetic variation among individuals and populations. Using molecular data to understand past demography and natural selection became an attainable goal. In the era of whole-genome sequences, application of early theoretical results proved to be challenging. The stochastic nature of evolutionary processes acting on DNA sequences makes it hard to distinguish signal from noise. Although progress has been made, models of molecular evolution are still lagging behind the availability of sequence data. In this thesis I contribute to bridging this gap, even if slightly. My main result is the development of the integrated sequentially Markovian coalescent (iSMC), a novel framework that jointly models the effects of ancestral demography, recombination heterogeneity (Chapter 1) and mutation heterogeneity (Chapter 2) in shaping genetic diversity along the genome. This principled approach represents a step towards more realistic models of population genetics. The consequences of intracellular stochasticity extend beyond DNA sequences, however. Due to randomness in the diffusion of key molecules, isogenic cells differ in their gene expression patterns, and hence in their phenotypes, even in homogeneous environments. To avoid chaos, intracellular stochasticity must be tamed by natural selection. In Chapter 3, I leverage single-cell transcriptomics data to disentangle the factors that constrain gene expression noise. Although selection against elevated noise acts at different levels of organisation, I show that it responds primarily to the architecture of molecular networks. This result may impact our understanding of the genotype-phenotype-fitness map.

    Genotype-Phenotype Maps in Complex Living Systems

    Immunoinformatics: towards an understanding of species-specific protein evolution using phylogenomics and network theory

    In immunology, the mouse is unquestionably the predominant model organism. However, an increasing number of reports suggest that mouse models do not always mimic human innate immunology. To better understand this discordance at the molecular level, we are investigating two mechanisms of gene evolution: positive selection and gene remodeling by introgression/domain shuffling. We began by creating a bioinformatic pipeline for large-scale evolutionary analyses. We next investigated bowhead genomic data to test our pipeline and to determine if there is lineage-specific positive selection in particular whale lineages. Positive selection is a molecular signature of adaptation and, therefore, of potential protein functional divergence. Once we had troubleshot the pipeline using the low-quality bowhead data, we moved on to test our innate immune dataset for lineage-specific selective pressures. When possible, we applied population genomics theory to identify potential false positives and date putative positive selection events in humans. The final phase of our analysis uses network (graph) theory to identify genes remodeled by domain shuffling/introgression and to identify species-specific introgressive events. Introgressive events potentially impart novel function and may also alter interactions within a protein network. By identifying genes displaying evidence of positive selection or introgression, we may begin to understand the molecular underpinnings of phenotypic discordance between human and mouse immune systems.

    Computationally Comparing Biological Networks and Reconstructing Their Evolution

    Biological networks, such as protein-protein interaction, regulatory, or metabolic networks, provide information about biological function, beyond what can be gleaned from sequence alone. Unfortunately, most computational problems associated with these networks are NP-hard. In this dissertation, we develop algorithms to tackle numerous fundamental problems in the study of biological networks. First, we present a system for classifying the binding affinity of peptides to a diverse array of immunoglobulin antibodies. Computational approaches to this problem are integral to virtual screening and modern drug discovery. Our system is based on an ensemble of support vector machines and exhibits state-of-the-art performance. It placed 1st in the 2010 DREAM5 competition. Second, we investigate the problem of biological network alignment. Aligning the biological networks of different species allows for the discovery of shared structures and conserved pathways. We introduce an original procedure for network alignment based on a novel topological node signature. The pairwise global alignments of biological networks produced by our procedure, when evaluated under multiple metrics, are both more accurate and more robust to noise than those of previous work. Next, we explore the problem of ancestral network reconstruction. Knowing the state of ancestral networks allows us to examine how biological pathways have evolved, and how pathways in extant species have diverged from that of their common ancestor. We describe a novel framework for representing the evolutionary histories of biological networks and present efficient algorithms for reconstructing either a single parsimonious evolutionary history, or an ensemble of near-optimal histories. Under multiple models of network evolution, our approaches are effective at inferring the ancestral network interactions. Additionally, the ensemble approach is robust to noisy input, and can be used to impute missing interactions in experimental data. Finally, we introduce a framework, GrowCode, for learning network growth models. While previous work focuses on developing growth models manually, or on procedures for learning parameters for existing models, GrowCode learns fundamentally new growth models that match target networks in a flexible and user-defined way. We show that models learned by GrowCode produce networks whose target properties match those of real-world networks more closely than existing models.

    From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics

    Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models and synthetic biology approaches being notable examples. This work takes a critical and constructive look at our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.

    Living with noise: The evolution of gene expression noise in gene regulatory networks

    One of the keystones of evolutionary biology is the study of how organismal traits change in time. Technological advancements in the past twenty years have enabled us to study the variation of an important trait, gene expression level, at single-cell resolution. One of the sources of gene expression level variation is gene expression noise, a result of the innate stochasticity of the gene expression process. Gene expression noise is gene-specific and can be tuned by selection, but what drives the evolution of gene-specific expression noise remains an open question. In this thesis, I explore the selective pressure and evolvability of gene-specific expression noise in gene regulatory networks. I use evolutionary simulations, applying rounds of mutation, recombination and reproduction to populations of model gene regulatory networks in different selection scenarios. In the first chapter, I investigate the response of gene-specific expression noise in gene regulatory networks in constant environments, which impose stabilizing selection on gene expression level. The probability of responding to selection and the strength of the selective response were affected by local network centrality metrics. Furthermore, global network features, such as network diameter, centralization and average degree, affected the average expression variance and average selective pressure acting on constituent genes. In the second chapter, I investigate the response of mean gene expression level and gene-specific expression noise in isolated genes and genes in gene regulatory networks in changing environments. Gene-specific expression noise increased under fluctuating selection, indicating the evolution of a bet-hedging strategy. Under directional selection, gene-specific expression noise transiently increased, showing that expression noise plays a role in the adaptation process towards a new mean expression optimum.