113 research outputs found

    Prodigal: prokaryotic gene recognition and translation initiation site identification

    BACKGROUND: The quality of automated gene prediction in microbial organisms has improved steadily over the past decade, but there is still room for improvement. Increasing the number of correct identifications, both of genes and of the translation initiation sites for each gene, and reducing the overall number of false positives, are all desirable goals. RESULTS: With our years of experience in manually curating genomes for the Joint Genome Institute, we developed a new gene prediction algorithm called Prodigal (PROkaryotic DYnamic programming Gene-finding ALgorithm). With Prodigal, we focused specifically on the three goals of improved gene structure prediction, improved translation initiation site recognition, and reduced false positives. We compared the results of Prodigal to existing gene-finding methods to demonstrate that it met each of these objectives. CONCLUSION: We built a fast, lightweight, open source gene prediction program called Prodigal (http://compbio.ornl.gov/prodigal/). Prodigal achieved good results compared to existing methods, and we believe it will be a valuable asset to automated microbial annotation pipelines.

    A Novel Secretory Poly-Cysteine and Histidine-Tailed Metalloprotein (Ts-PCHTP) from Trichinella spiralis (Nematoda)

    BACKGROUND: Trichinella spiralis is an unusual parasitic intracellular nematode causing dedifferentiation of the host myofiber. Trichinella proteomic analyses have identified proteins that act at the interface between the parasite and the host and are probably important for the infection and pathogenesis. Many parasitic proteins, including a number of metalloproteins, are unique to the nematodes and trichinellids and therefore present good targets for future therapeutic developments. Furthermore, detailed information on such proteins and their function in the nematode organism would provide better understanding of the parasite-host interactions. METHODOLOGY/PRINCIPAL FINDINGS: In this study we report the identification, biochemical characterization and localization of a novel poly-cysteine and histidine-tailed metalloprotein (Ts-PCHTP). The native Ts-PCHTP was purified from T. spiralis muscle larvae that were isolated from infected rats as a model system. The sequence analysis showed no homology with other proteins. Two unique poly-cysteine domains were found in the amino acid sequence of Ts-PCHTP. This protein is also the first reported natural histidine-tailed protein. It was suggested that Ts-PCHTP has metal binding properties. Total Reflection X-ray Fluorescence (TXRF) assay revealed that it binds significant concentrations of iron, nickel and zinc at a protein:metal ratio of about 1:2. Immunohistochemical analysis showed that the Ts-PCHTP is localized in the cuticle and in all tissues of the larvae, but that it is not excreted outside the parasite. CONCLUSIONS/SIGNIFICANCE: Our data suggest that Ts-PCHTP is the first described member of a novel nematode poly-cysteine protein family and its function could be metal storage and/or transport. Since this protein family is unique to parasites from the superfamily Trichinelloidea, its potential applications in diagnostics and treatment could be exploited in the future.

    Widespread Over-Expression of the X Chromosome in Sterile F1 Hybrid Mice

    The X chromosome often plays a central role in hybrid male sterility between species, but it is unclear if this reflects underlying regulatory incompatibilities. Here we combine phenotypic data with genome-wide expression data to directly associate aberrant expression patterns with hybrid male sterility between two species of mice. We used a reciprocal cross in which F1 males are sterile in one direction and fertile in the other direction, allowing us to associate expression differences with sterility rather than with other hybrid phenotypes. We found evidence of extensive over-expression of the X chromosome during spermatogenesis in sterile but not in fertile F1 hybrid males. Over-expression was most pronounced in genes that are normally expressed after meiosis, consistent with an X chromosome-wide disruption of expression during the later stages of spermatogenesis. This pattern was not a simple consequence of faster evolutionary divergence on the X chromosome, because X-linked expression was highly conserved between the two species. Thus, transcriptional regulation of the X chromosome during spermatogenesis appears particularly sensitive to evolutionary divergence between species. Overall, these data provide evidence for an underlying regulatory basis to reproductive isolation in house mice and underscore the importance of transcriptional regulation of the X chromosome to the evolution of hybrid male sterility

    HMMerThread: Detecting Remote, Functional Conserved Domains in Entire Genomes by Combining Relaxed Sequence-Database Searches with Fold Recognition

    Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
    The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de

    Cholera- and Anthrax-Like Toxins Are among Several New ADP-Ribosyltransferases

    Chelt, a cholera-like toxin from Vibrio cholerae, and Certhrax, an anthrax-like toxin from Bacillus cereus, are among six new bacterial protein toxins we identified and characterized using in silico and cell-based techniques. We also uncovered medically relevant toxins from Mycobacterium avium and Enterococcus faecalis. We found agriculturally relevant toxins in Photorhabdus luminescens and Vibrio splendidus. These toxins belong to the ADP-ribosyltransferase family that has conserved structure despite low sequence identity. Therefore, our search for new toxins combined fold recognition with rules for filtering sequences – including a primary sequence pattern – to reduce reliance on sequence identity and identify toxins using structure. We used computers to build models and analyzed each new toxin to understand features including: structure, secretion, cell entry, activation, NAD+ substrate binding, intracellular target binding and the reaction mechanism. We confirmed activity using a yeast growth test. In this era where an expanding protein structure library complements abundant protein sequence data – and we need high-throughput validation – our approach provides insight into the newest toxin ADP-ribosyltransferases