1,297 research outputs found

    Selective Constraints on Amino Acids Estimated by a Mechanistic Codon Substitution Model with Multiple Nucleotide Changes

    Empirical substitution matrices represent the average tendencies of substitutions over various protein families by sacrificing gene-level resolution. We develop a codon-based model in which the mutational tendencies of codons, a genetic code, and the strength of selective constraints against amino acid replacements can be tailored to a given gene. First, selective constraints averaged over proteins are estimated by maximizing the likelihood of each 1-PAM matrix of empirical amino acid (JTT, WAG, and LG) and codon (KHG) substitution matrices. Then, selective constraints specific to given proteins are approximated as a linear function of those estimated from the empirical substitution matrices. Akaike information criterion (AIC) values indicate that a model allowing multiple nucleotide changes fits the empirical substitution matrices significantly better. Also, the ML estimates of transition-transversion bias obtained from these empirical matrices are not as large as previously estimated. The selective constraints are characteristic of proteins rather than species. However, their relative strengths among amino acid pairs can be approximated as depending mainly on the amino acid pair rather than on the protein family, because the present model, in which selective constraints are approximated as a linear function of those estimated from the JTT/WAG/LG/KHG matrices, provides a good fit to other empirical substitution matrices, including cpREV for chloroplast proteins and mtREV for vertebrate mitochondrial proteins.
The present codon-based model, with the ML estimates of selective constraints and with adjustable nucleotide mutation rates, would be useful as a simple substitution model in ML and Bayesian inference of molecular phylogenetic trees, and enables us to obtain biologically meaningful information at both the nucleotide and amino acid levels from codon and protein sequences. Comment: Table 9 in this article includes corrections for errata in the Table 9 published in 10.1371/journal.pone.0017244. Supporting information is attached at the end of the article, and a computer-readable dataset of the ML estimates of selective constraints is available from 10.1371/journal.pone.001724
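The AIC comparison mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition AIC = 2k - 2 ln L; the function names, model labels, log-likelihoods, and parameter counts below are invented for illustration and are not taken from the paper.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def delta_aic(models: dict) -> dict:
    """Map each model name to its AIC difference from the best model.

    `models` maps a name to (log_likelihood, n_params).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical values only: a model restricted to single nucleotide changes
# versus one that also allows multiple nucleotide changes per codon.
example = {
    "single_changes": (-1000.0, 10),
    "multiple_changes": (-980.0, 14),
}
differences = delta_aic(example)
```

In this made-up case the richer model pays for four extra parameters but still has the lower AIC, which is the kind of outcome the abstract describes.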

    Sequencing and Analysis of Plastid Genome in Mycoheterotrophic Orchid Neottia nidus-avis

    Plastids are semiautonomous organelles that possess their own genome, inherited from a cyanobacterial ancestor. The primary function of plastids is photosynthesis, so the structure and evolution of plastid genomes have been studied extensively in photosynthetic plants. In contrast, little is known about the plastomes of nonphotosynthetic species. In higher plants, plastid genome sequences are available for only three strictly nonphotosynthetic species: the liverwort Aneura mirabilis and two flowering plants, Epifagus virginiana and Rhizanthella gardneri. We report here the complete sequence of the plastid genome of the nonphotosynthetic mycoheterotrophic orchid Neottia nidus-avis, determined using 454 pyrosequencing technology. It was found to be reduced in both genome size and gene content; this reduction is, however, not as drastic as in the other nonphotosynthetic orchid, R. gardneri. The Neottia plastome lacks all genes encoding photosynthetic proteins and RNA polymerase subunits but retains most genes of the translational apparatus. The genes that are retained have an increased rate of both synonymous and nonsynonymous substitutions but do not exhibit relaxation of purifying selection in either Neottia or Rhizanthella

    Genotype–phenotype associations: substitution models to detect evolutionary associations between phenotypic variables and genotypic evolutionary rate

    Motivation: Mapping between genotype and phenotype is one of the primary goals of evolutionary genetics but one that has received little attention at the interspecies level. Recent developments in phylogenetics and statistical modelling have typically been used to examine molecular and phenotypic evolution separately. We have used this background to develop phylogenetic substitution models to test for associations between evolutionary rate of genotype and phenotype. We do this by creating hybrid rate matrices between genotype and phenotype

    Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints

    BACKGROUND: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding and development of practical heuristics for dealing with the computational complexity of the algorithm. RESULTS: We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. CONCLUSION: Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm – this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN – have comparable overall performance with different strengths and weaknesses
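The idea of choosing pins from alignment posterior probabilities can be sketched as follows. This is an illustrative greedy selection under stated assumptions, not the Consan implementation: the function name `select_pins`, the 0.95 threshold, and the greedy strategy are all hypothetical.

```python
def select_pins(posterior, threshold=0.95):
    """Pick confidently aligned position pairs (pins) from a posterior matrix.

    posterior[i][j] is assumed to be the posterior probability that position i
    of the first sequence aligns to position j of the second sequence.
    Pins are accepted greedily from highest to lowest probability and must be
    collinear: no two pins may cross or share a row or column.
    """
    candidates = [
        (p, i, j)
        for i, row in enumerate(posterior)
        for j, p in enumerate(row)
        if p >= threshold
    ]
    candidates.sort(reverse=True)  # most confident pairs first

    pins = []
    for _, i, j in candidates:
        # (i - a) * (j - b) > 0 holds only when both coordinates lie strictly
        # on the same side of an already accepted pin (a, b).
        if all((i - a) * (j - b) > 0 for a, b in pins):
            pins.append((i, j))
    return sorted(pins)


# Toy 3x3 posterior matrix (made-up numbers).
toy = [
    [0.99, 0.00, 0.00],
    [0.00, 0.50, 0.00],
    [0.00, 0.00, 0.97],
]
toy_pins = select_pins(toy)
```

Each accepted pin splits the dynamic programming problem into independent regions on either side of it, which is why constraining on pins reduces runtime and memory.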

    Genome-Wide Modeling of Transcription Preinitiation Complex Disassembly Mechanisms using ChIP-chip Data

    Apparent occupancy levels of proteins bound to DNA in vivo can now be routinely measured on a genomic scale. A challenge in relating these occupancy levels to assembly mechanisms that are defined with biochemically isolated components lies in the veracity of assumptions made regarding the in vivo system. Assumptions regarding the behavior of molecules in vivo can be proven neither true nor false, and thus are necessarily subjective. Nevertheless, within those confines, connecting in vivo protein-DNA interaction observations with defined biochemical mechanisms is an important step towards fully defining and understanding assembly/disassembly mechanisms in vivo. To this end, we have developed a computational program, PathCom, that models in vivo protein-DNA occupancy data as biochemical mechanisms under the assumption that occupancy levels can be related to binding duration and explicitly defined assembly/disassembly reactions. We exemplify the process with the assembly of the general transcription factors (TBP, TFIIB, TFIIE, TFIIF, TFIIH, and RNA polymerase II) at the genes of the budding yeast Saccharomyces. Within the assumptions inherent in the system, our modeling suggests that TBP occupancy at promoters is rather transient compared to other general factors, despite the importance of TBP in nucleating assembly of the preinitiation complex. PathCom is suitable for modeling any assembly/disassembly pathway, given that all the proteins (or species) come together to form a complex

    The atm-1 gene is required for genome stability in Caenorhabditis elegans

    The Ataxia-telangiectasia-mutated (ATM) gene in humans was identified as the basis of a rare autosomal disorder leading to cancer susceptibility and is now well known as an important signal transducer in response to DNA damage. An approach to understanding the conserved functions of this gene is provided by the model system, Caenorhabditis elegans. In this paper we describe the structure and loss of function phenotype of the ortholog atm-1. Using bioinformatic and molecular analysis we show that the atm-1 gene was previously misannotated. We find that the transcript is in fact a product of three gene predictions, Y48G1BL.2 (atm-1), K10E9.1, and F56C11.4 that together make up the complete coding region of ATM-1. We also characterize animals that are mutant for two available knockout alleles, gk186 and tm5027. As expected, atm-1 mutant animals are sensitive to ionizing radiation. In addition, however, atm-1 mutants also display phenotypes associated with genomic instability, including low brood size, reduced viability and sterility. We document several chromosomal fusions arising from atm-1 mutant animals. This is the first time a mutator phenotype has been described for atm-1 in C. elegans. Finally we demonstrate the use of a balancer system to screen for and capture atm-1-derived mutational events. Our study establishes C. elegans as a model for the study of ATM as a mutator potentially leading to the development of screens to identify therapeutic targets in humans

    Probabilistic Phylogenetic Inference with Insertions and Deletions

    A fundamental task in sequence analysis is to calculate the probability of a multiple alignment given a phylogenetic tree relating the sequences and an evolutionary model describing how sequences change over time. However, the most widely used phylogenetic models only account for residue substitution events. We describe a probabilistic model of a multiple sequence alignment that accounts for insertion and deletion events in addition to substitutions, given a phylogenetic tree, using a rate matrix augmented by the gap character. Starting from a continuous Markov process, we construct a non-reversible generative (birth–death) evolutionary model for insertions and deletions. The model assumes that insertion and deletion events occur one residue at a time. We apply this model to phylogenetic tree inference by extending the program dnaml in phylip. Using standard benchmarking methods on simulated data and a new “concordance test” benchmark on real ribosomal RNA alignments, we show that the extended program dnamlε improves accuracy relative to the usual approach of ignoring gaps, while retaining the computational efficiency of the Felsenstein peeling algorithm
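To make the peeling computation concrete, here is a minimal sketch of Felsenstein's algorithm for one alignment column over a five-state alphabet in which the gap is treated as an extra character. It is a deliberately simplified stand-in: it assumes an equal-rates (Jukes-Cantor-like) reversible model with a uniform root distribution, not the non-reversible birth-death model of the paper, and the tree encoding and all names are hypothetical.

```python
import math

STATES = "ACGT-"  # four nucleotides plus the gap character as a fifth state


def transition_prob(i, j, t, alpha=1.0):
    """P(state j after time t | state i) under an equal-rates k-state model.

    Assumption: every off-diagonal rate equals alpha, which gives a closed form.
    """
    k = len(STATES)
    decay = math.exp(-k * alpha * t)
    if i == j:
        return 1.0 / k + (k - 1.0) / k * decay
    return 1.0 / k - decay / k


def partial(node, column):
    """Conditional likelihoods of the data below `node` for each state.

    A node is a leaf name (str) or a tuple (left, t_left, right, t_right).
    `column` maps leaf names to the observed character in this column.
    """
    if isinstance(node, str):
        observed = column[node]
        return [1.0 if s == observed else 0.0 for s in STATES]
    left, t_left, right, t_right = node
    below_left = partial(left, column)
    below_right = partial(right, column)
    n = len(STATES)
    result = []
    for i in range(n):
        sum_left = sum(transition_prob(i, j, t_left) * below_left[j] for j in range(n))
        sum_right = sum(transition_prob(i, j, t_right) * below_right[j] for j in range(n))
        result.append(sum_left * sum_right)
    return result


def column_likelihood(tree, column):
    """Likelihood of one column, assuming a uniform distribution at the root."""
    return sum(p / len(STATES) for p in partial(tree, column))


# Three-taxon example with made-up branch lengths; leaf "z" carries a gap.
tree = (("x", 0.1, "y", 0.1), 0.2, "z", 0.3)
likelihood = column_likelihood(tree, {"x": "A", "y": "A", "z": "-"})
```

The recursion visits each node once, so the cost per column grows linearly with the number of taxa, which is the efficiency property the abstract refers to.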

    Defective Peripheral Nerve Development Is Linked to Abnormal Architecture and Metabolic Activity of Adipose Tissue in Nscl-2 Mutant Mice

    BACKGROUND: In mammals the interplay between the peripheral nervous system (PNS) and adipose tissue is largely unexplored. We have employed mice that develop adult-onset obesity due to the lack of the neuron-specific transcription factor Nscl-2 to investigate the interplay between the nervous system and white adipose tissue (WAT). METHODOLOGY: Changes in the architecture and innervation of WAT were compared between wildtype, Nscl2-/-, ob/ob and Nscl2-/-//ob/ob mice using morphological methods, immunohistochemistry and flow cytometry. Metabolic alterations in mutant mice and in isolated cells were investigated under basal and stimulated conditions. PRINCIPAL FINDINGS: We found that Nscl-2 mutant mice show a massive reduction of innervation of white epididymal and paired subcutaneous inguinal fat tissue, including sensory and autonomic nerves, as demonstrated by peripherin and neurofilament staining. The reduction of innervation was accompanied by defects in the formation of the microvasculature, accumulation of cells of the macrophage/preadipocyte lineage, a bimodal distribution of the size of fat cells, and metabolic defects of isolated adipocytes. Despite a relative insulin resistance of white adipose tissue and isolated Nscl-2 mutant adipocytes, the serum level of insulin in Nscl-2 mutant mice was only slightly increased. CONCLUSIONS: We conclude that the reduction of the innervation and vascularization of WAT in Nscl-2 mutant mice leads to an increase of preadipocyte/macrophage-like cells, a bimodal distribution of the size of adipocytes in WAT and an altered metabolic activity of adipocytes