
    The context-dependence of mutations: a linkage of formalisms

    Defining the extent of epistasis - the non-independence of the effects of mutations - is essential for understanding the relationship of genotype, phenotype, and fitness in biological systems. The applications cover many areas of biological research, including biochemistry, genomics, protein and systems engineering, medicine, and evolutionary biology. However, the quantitative definitions of epistasis vary among fields, and its analysis beyond pairwise effects remains obscure in general. Here, we show that different definitions of epistasis are versions of a single mathematical formalism - the weighted Walsh-Hadamard transform. We argue that one of the definitions, the background-averaged epistasis, is the most informative when the goal is to uncover the general epistatic structure of a biological system, a description that can be rather different from the local epistatic structure of specific model systems. Key issues are the choice of effective ensembles for averaging and the practical challenge of contending with the vast combinatorial complexity of mutations. In this regard, we discuss possible approaches for optimally learning the epistatic structure of biological systems. Comment: 6 pages, 3 figures, supplementary information
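    To make the Walsh-Hadamard idea concrete, the sketch below applies a plain (unweighted) fast Walsh-Hadamard transform to a vector of phenotypes measured on all 2^n genotypes of n biallelic sites. The function name `walsh_hadamard`, the genotype ordering (index read as a binary string of site states), the division by 2^n, and the resulting sign convention are assumptions made for illustration; the specific weighting that distinguishes the definitions discussed in the abstract is not reproduced here.

```python
def walsh_hadamard(phenotypes):
    """Unweighted Walsh-Hadamard transform of a phenotype vector.

    Assumption: phenotypes[g] is the phenotype of genotype g, where the
    bits of g give the state (0 = wild type, 1 = mutant) of each site.
    Output entry k sums the phenotypes with sign (-1)^popcount(k & g),
    then divides by the number of genotypes.
    """
    n = len(phenotypes)
    if n == 0 or n & (n - 1):
        raise ValueError("need a complete set of 2^n genotypes")

    out = [float(v) for v in phenotypes]
    half = 1
    while half < n:
        # Butterfly step: combine entries that differ in one site.
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b
        half *= 2
    return [v / n for v in out]


# Two sites, genotypes ordered 00, 01, 10, 11.
additive = [0, 1, 2, 3]      # effects simply add up
epistatic = [0, 1, 2, 5]     # double mutant deviates from additivity

print(walsh_hadamard(additive))
print(walsh_hadamard(epistatic))
```

    Under these assumed conventions, the last coefficient (index 3, both sites involved) is zero for the additive landscape and non-zero for the epistatic one, which is the sense in which the transform separates pairwise interaction from single-site effects.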

    Variation of the adaptive substitution rate between species and within genomes

    The importance of adaptive mutations in molecular evolution is extensively debated. Recent developments in population genomics allow inferring rates of adaptive mutations by fitting a distribution of fitness effects to the observed patterns of polymorphism and divergence at sites under selection and sites assumed to evolve neutrally. Here, we summarize the current state-of-the-art of these methods and review the factors that affect the molecular rate of adaptation. Several studies have reported extensive cross-species variation in the proportion of adaptive amino-acid substitutions (α) and predicted that species with larger effective population sizes undergo less genetic drift and higher rates of adaptation. Disentangling the rates of positive and negative selection, however, revealed that mutations with deleterious effects are the main driver of this population size effect and that adaptive substitution rates vary comparatively little across species. Conversely, rates of adaptive substitution have been documented to vary substantially within genomes. On a genome-wide scale, gene density, recombination and mutation rate were observed to play a role in shaping molecular rates of adaptation, as predicted under models of linked selection. At the gene level, it has been reported that the gene functional category and the macromolecular structure substantially impact the rate of adaptive mutations. Here, we deliver a comprehensive review of methods used to infer the molecular adaptive rate, the potential drivers of adaptive evolution and how positive selection shapes molecular evolution within genes, across genes within species and between species.
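    As a rough illustration of what α measures, the sketch below uses the simple McDonald-Kreitman-style point estimate, α = 1 - (D_s P_n) / (D_n P_s), computed from counts of non-synonymous and synonymous divergence (D_n, D_s) and polymorphism (P_n, P_s). This is a simpler technique than the distribution-of-fitness-effects fitting described in the abstract, and the function name `mk_alpha` and the example counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Point estimate of the proportion of adaptive substitutions.

    dn, ds: non-synonymous and synonymous fixed differences (divergence)
    pn, ps: non-synonymous and synonymous polymorphisms
    Assumption: synonymous sites are neutral and segregating deleterious
    mutations are ignored; the estimate is biased when that does not hold.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, for illustration only.
print(mk_alpha(dn=100, ds=200, pn=50, ps=200))
```

    Because slightly deleterious polymorphisms inflate P_n, this simple estimator tends to understate α, which is one motivation for the model-based methods the review covers.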

    Protein function annotation using protein domain family resources

    As a result of the genome sequencing and structural genomics initiatives, we have a wealth of protein sequence and structural data. However, only about 1% of these proteins have experimental functional annotations. As a result, computational approaches that can predict protein functions are essential in bridging this widening annotation gap. This article reviews the current approaches to protein function prediction using structure- and sequence-based classification of protein domain family resources, with a special focus on functional families in the CATH-Gene3D resource.

    Unravelling the determinants of the rate of adaptive evolution at the molecular level

    Ever since Darwin presented natural selection as a driver of evolution, evolutionary biologists have striven to understand how beneficial mutations shape species adaptation to their environment. Studying adaptation, however, requires an understanding of the complex dynamics between nucleotides, sequences, proteins, organisms, populations, and species. In other words, it requires assessing the interplay of evolutionary processes across systems. Here, I studied adaptation in such a way by exploring the frequency and nature of adaptive mutations within genes, within genomes, and between species. At the intramolecular level, this project revealed that the residue's solvent accessibility acts as the primary determinant of rates of adaptive substitutions both in animals and in plants, where adaptive mutations are more frequent at the protein surface. These analyses further showed higher rates of adaptation for genes encoding proteins with central cellular functions, which are the ones usually targeted by pathogens during host infection. These findings, therefore, suggested that protein adaptive evolution proceeds through interactions between molecules, particularly at the interspecific level, where host-pathogen coevolution likely plays a central role. By taking a step back and looking at adaptation at different time-scales within the genome, this thesis revealed the role of young genes in adaptive evolution. As these genes are further away from their fitness optimum, these findings suggested that proteins adapt in an "adaptive walk" manner. This project further highlighted that the distribution of adaptive mutations across time follows a pattern of diminishing returns. Looking at an even broader scale by studying adaptation at the species level and considering the effect of intramolecular variation across several animal species, this thesis demonstrated a negative correlation between rates of adaptive substitutions and the effective population size (N_e).
    Despite the relatively weak signal, these findings contradict initial population genetics theory. Instead, they seem to agree with theoretical expectations in phenotypic space. In turn, the results regarding negative selection confirm the N_e hypothesis, where the efficiency of selection is stronger in large-N_e species. This effect was well depicted in the differences of the distribution of fitness effects between buried and exposed residues, where the former accumulate comparatively more mild-effect mutations in low-N_e species. This project further expanded our findings at the intramolecular level, by revealing the strong influence of the protein's macromolecular structure on rates of molecular adaptation across several taxa. By assessing the interplay of adaptive mutations across distinct organizational levels, this thesis provided a more profound understanding of rates of adaptive evolution at the molecular level, thus delivering a comprehensive view of the molecular basis of adaptation.

    The EM Algorithm and the Rise of Computational Biology

    In the past decade computational biology has grown from a cottage industry with a handful of researchers to an attractive interdisciplinary field, catching the attention and imagination of many quantitatively-minded scientists. Of interest to us is the key role played by the EM algorithm during this transformation. We survey the use of the EM algorithm in a few important computational biology problems surrounding the "central dogma" of molecular biology: from DNA to RNA and then to proteins. Topics of this article include sequence motif discovery, protein sequence alignment, population genetics, evolutionary models and mRNA expression microarray data analysis. Comment: Published at http://dx.doi.org/10.1214/09-STS312 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Revealing evolutionary constraints on proteins through sequence analysis

    Statistical analysis of alignments of large numbers of protein sequences has revealed "sectors" of collectively coevolving amino acids in several protein families. Here, we show that selection acting on any functional property of a protein, represented by an additive trait, can give rise to such a sector. As an illustration of a selected trait, we consider the elastic energy of an important conformational change within an elastic network model, and we show that selection acting on this energy leads to correlations among residues. For this concrete example and more generally, we demonstrate that the main signature of functional sectors lies in the small-eigenvalue modes of the covariance matrix of the selected sequences. However, secondary signatures of these functional sectors also exist in the extensively studied large-eigenvalue modes. Our simple, general model leads us to propose a principled method to identify functional sectors, along with the magnitudes of mutational effects, from sequence data. We further demonstrate the robustness of these functional sectors to various forms of selection, and the robustness of our approach to the identification of multiple selected traits. Comment: 37 pages, 28 figures

    Protein co-evolution, co-adaptation and interactions

    Co-evolution has an important function in the evolution of species and it is clearly manifested in certain scenarios such as host–parasite and predator–prey interactions, symbiosis and mutualism. The extrapolation of the concepts and methodologies developed for the study of species co-evolution at the molecular level has prompted the development of a variety of computational methods able to predict protein interactions through the characteristics of co-evolution. Particularly successful have been those methods that predict interactions at the genomic level based on the detection of pairs of protein families with similar evolutionary histories (similarity of phylogenetic trees: mirrortree). Future advances in this field will require a better understanding of the molecular basis of the co-evolution of protein families. Thus, it will be important to decipher the molecular mechanisms underlying the similarity observed in phylogenetic trees of interacting proteins, distinguishing direct specific molecular interactions from other general functional constraints. In particular, it will be important to separate the effects of physical interactions within protein complexes ('co-adaptation') from other forces that, in a less specific way, can also create general patterns of co-evolution.
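    The tree-similarity idea behind mirrortree-style methods can be sketched as a correlation between two inter-species distance matrices, one per protein family. The version below is a minimal illustration, assuming both matrices are symmetric, cover the same species in the same order, and are compared with a plain Pearson correlation over the upper triangle; the function name `tree_similarity` and the toy matrices are hypothetical, and published methods add corrections (for example for the shared species phylogeny) that are not shown.

```python
from math import sqrt


def tree_similarity(dist_a, dist_b):
    """Pearson correlation between the upper triangles of two
    distance matrices defined over the same ordered set of species."""
    n = len(dist_a)
    if n != len(dist_b) or n < 3:
        raise ValueError("need two matrices over the same >= 3 species")

    # Collect the pairwise distances (i < j) from each matrix.
    xs = [dist_a[i][j] for i in range(n) for j in range(i + 1, n)]
    ys = [dist_b[i][j] for i in range(n) for j in range(i + 1, n)]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("distances must not all be identical")
    return cov / sqrt(var_x * var_y)


# Toy matrices: family B evolves twice as fast as family A everywhere.
family_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
family_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
print(tree_similarity(family_a, family_b))
```

    A score near 1 indicates similar evolutionary histories; as the abstract notes, such similarity alone does not separate direct physical co-adaptation from more general shared constraints.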