112 research outputs found

    Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments

    EVidenceModeler (EVM) is an automated annotation tool that predicts protein-coding regions, alternatively spliced transcripts and untranslated regions of eukaryotic genes

    Transcriptional Regulation of Chemical Diversity in Aspergillus fumigatus by LaeA

    Secondary metabolites, including toxins and melanins, have been implicated as virulence attributes in invasive aspergillosis. Although not definitively proved, this supposition is supported by the decreased virulence of an Aspergillus fumigatus strain, ΔlaeA, that is crippled in the production of numerous secondary metabolites. However, loss of a single LaeA-regulated toxin, gliotoxin, did not recapitulate the hypovirulent ΔlaeA pathotype, thus implicating other toxins whose production is governed by LaeA. Toward this end, a whole-genome comparison of the transcriptional profile of wild-type, ΔlaeA, and complemented control strains showed that genes in 13 of 22 secondary metabolite gene clusters, including several A. fumigatus–specific mycotoxin clusters, were expressed at significantly lower levels in the ΔlaeA mutant. LaeA influences the expression of at least 9.5% of the genome (943 of 9,626 genes in A. fumigatus) but positively controls expression of 20% to 40% of major classes of secondary metabolite biosynthesis genes such as nonribosomal peptide synthetases (NRPSs), polyketide synthases, and P450 monooxygenases. Tight regulation of NRPS-encoding genes was highlighted by quantitative real-time reverse-transcription PCR analysis. In addition, expression of a putative siderophore biosynthesis NRPS (NRPS2/sidE) was greatly reduced in the ΔlaeA mutant in comparison to controls under inducing iron-deficient conditions. Comparative genomic analysis showed that A. fumigatus secondary metabolite gene clusters constitute evolutionarily diverse regions that may be important for niche adaptation and virulence attributes. Our findings suggest that LaeA is a novel target for comprehensive modification of chemical diversity and pathogenicity

    Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release

    BACKGROUND: Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications. RESULTS: Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5). CONCLUSION: Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms

    The Aspergillus Genome Database, a curated comparative genomics resource for gene, protein and sequence information for the Aspergillus research community

    The Aspergillus Genome Database (AspGD) is an online genomics resource for researchers studying the genetics and molecular biology of the Aspergilli. AspGD combines high-quality manual curation of the experimental scientific literature examining the genetics and molecular biology of Aspergilli, cutting-edge comparative genomics approaches to iteratively refine and improve structural gene annotations across multiple Aspergillus species, and web-based research tools for accessing and exploring the data. All of these data are freely available at http://www.aspgd.org. We welcome feedback from users and the research community at [email protected]

    Comparative Genomics of Recent Shiga Toxin-Producing Escherichia coli O104:H4: Short-Term Evolution of an Emerging Pathogen

    The large outbreak of diarrhea and hemolytic uremic syndrome (HUS) caused by Shiga toxin-producing Escherichia coli O104:H4 in Europe from May to July 2011 highlighted the potential of a rarely identified E. coli serogroup to cause severe disease. Prior to the outbreak, there were very few reports of disease caused by this pathogen and thus little known of its diversity and evolution. The identification of cases of HUS caused by E. coli O104:H4 in France and Turkey after the outbreak and with no clear epidemiological links raises questions about whether these sporadic cases are derived from the outbreak. Here, we report genome sequences of five independent isolates from these cases and results of a comparative analysis with historical and 2011 outbreak isolates. These analyses revealed that the five isolates are not derived from the outbreak strain; however, they are more closely related to the outbreak strain and each other than to isolates identified prior to the 2011 outbreak. Over the short time scale represented by these closely related organisms, the majority of genome variation is found within their mobile genetic elements: none of the nine O104:H4 isolates compared here contain the same set of plasmids, and their prophages and genomic islands also differ. Moreover, the presence of closely related HUS-associated E. coli O104:H4 isolates supports the contention that fully virulent O104:H4 isolates are widespread and emphasizes the possibility of future food-borne E. coli O104:H4 outbreaks

    Comparative Genomics of Vancomycin-Resistant Staphylococcus aureus Strains and Their Positions within the Clade Most Commonly Associated with Methicillin-Resistant S. aureus Hospital-Acquired Infection in the United States

    Methicillin-resistant Staphylococcus aureus (MRSA) strains are leading causes of hospital-acquired infections in the United States, and clonal cluster 5 (CC5) is the predominant lineage responsible for these infections. Since 2002, there have been 12 cases of vancomycin-resistant S. aureus (VRSA) infection in the United States—all CC5 strains. To understand this genetic background and what distinguishes it from other lineages, we generated and analyzed high-quality draft genome sequences for all available VRSA strains. Sequence comparisons show unambiguously that each strain independently acquired Tn1546 and that all VRSA strains last shared a common ancestor over 50 years ago, well before the occurrence of vancomycin resistance in this species. In contrast to existing hypotheses on what predisposes this lineage to acquire Tn1546, the barrier posed by restriction systems appears to be intact in most VRSA strains. However, VRSA (and other CC5) strains were found to possess a constellation of traits that appears to be optimized for proliferation in precisely the types of polymicrobic infection where transfer could occur. They lack a bacteriocin operon that would be predicted to limit the occurrence of non-CC5 strains in mixed infection and harbor a cluster of unique superantigens and lipoproteins to confound host immunity. A frameshift in dprA, which in other microbes influences uptake of foreign DNA, may also make this lineage conducive to foreign DNA acquisition

    Refined annotation and assembly of the Tetrahymena thermophila genome sequence through EST analysis, comparative genomic hybridization, and targeted gap closure

    BACKGROUND: Tetrahymena thermophila, a widely studied model for cellular and molecular biology, is a binucleated single-celled organism with a germline micronucleus (MIC) and somatic macronucleus (MAC). The recent draft MAC genome assembly revealed low sequence repetitiveness, a result of the epigenetic removal of invasive DNA elements found only in the MIC genome. Such low repetitiveness makes complete closure of the MAC genome a feasible goal, which to achieve would require standard closure methods as well as removal of minor MIC contamination of the MAC genome assembly. Highly accurate preliminary annotation of Tetrahymena's coding potential was hindered by the lack of both comparative genomic sequence information from close relatives and significant amounts of cDNA evidence, thus limiting the value of the genomic information and also leaving unanswered certain questions, such as the frequency of alternative splicing. RESULTS: We addressed the problem of MIC contamination using comparative genomic hybridization with purified MIC and MAC DNA probes against a whole genome oligonucleotide microarray, allowing the identification of 763 genome scaffolds likely to contain MIC-limited DNA sequences. We also employed standard genome closure methods to essentially finish over 60% of the MAC genome. For the improvement of annotation, we have sequenced and analyzed over 60,000 verified EST reads from a variety of cellular growth and development conditions. Using this EST evidence, a combination of automated and manual reannotation efforts led to updates that affect 16% of the current protein-coding gene models. By comparing EST abundance, many genes showing apparent differential expression between these conditions were identified. Rare instances of alternative splicing and uses of the non-standard amino acid selenocysteine were also identified. CONCLUSION: We report here significant progress in genome closure and reannotation of Tetrahymena thermophila. Our experience to date suggests that complete closure of the MAC genome is attainable. Using the new EST evidence, automated and manual curation has resulted in substantial improvements to the over 24,000 gene models, which will be valuable to researchers studying this model organism as well as for comparative genomics purposes.

    New resources for functional analysis of omics data for the genus Aspergillus

    BACKGROUND: Detailed and comprehensive genome annotation can be considered a prerequisite for effective analysis and interpretation of omics data. As such, Gene Ontology (GO) annotation has become a well-accepted framework for functional annotation. The genus Aspergillus comprises fungal species that are important model organisms, plant and human pathogens as well as industrial workhorses. However, GO annotation based on both computational predictions and extended manual curation has so far only been available for one of its species, namely A. nidulans. RESULTS: Based on protein homology, we mapped 97% of the 3,498 GO annotated A. nidulans genes to at least one of seven other Aspergillus species: A. niger, A. fumigatus, A. flavus, A. clavatus, A. terreus, A. oryzae and Neosartorya fischeri. GO annotation files compatible with diverse publicly available tools have been generated and deposited online. To further improve their accessibility, we developed a web application for GO enrichment analysis named FetGOat and integrated GO annotations for all Aspergillus species with public genome sequences. Both the annotation files and the web application FetGOat are accessible via the Broad Institute's website (http://www.broadinstitute.org/fetgoat/index.html). To demonstrate the value of those new resources for functional analysis of omics data for the genus Aspergillus, we performed two case studies analyzing microarray data recently published for A. nidulans, A. niger and A. oryzae. CONCLUSIONS: We mapped A. nidulans GO annotation to seven other Aspergilli. By depositing the newly mapped GO annotation online as well as integrating it into the web tool FetGOat, we provide new, valuable and easily accessible resources for omics data analysis and interpretation for the genus Aspergillus. Furthermore, we have given a general example of how a well-annotated genome can help improve GO annotation of related species to subsequently facilitate the interpretation of omics data.