43 research outputs found

    Automated alignment-based curation of gene models in filamentous fungi

    Get PDF
    BACKGROUND: Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The wealth of available fungal genomes has not yet been exploited by an automated method that applies quality control of gene models in order to obtain more accurate genome annotations. RESULTS: We provide a novel method named alignment-based fungal gene prediction (ABFGP) that is particularly suitable for plastic genomes like those of fungi. It can assess gene models on a gene-by-gene basis making use of informant gene loci. Its performance was benchmarked on 6,965 gene models confirmed by full-length unigenes from ten different fungi. 79.4% of all gene models were correctly predicted by ABFGP. It improves the output of ab initio gene prediction software due to a higher sensitivity and precision for all gene model components. Applicability of the method was shown by revisiting the annotations of six different fungi, using gene loci from up to 29 fungal genomes as informants. Between 7,231 and 8,337 genes were assessed by ABFGP and for each genome between 1,724 and 3,505 gene model revisions were proposed. The reliability of the proposed gene models is assessed by an a posteriori introspection procedure of each intron and exon in the multiple gene model alignment. The total number and type of proposed gene model revisions in the six fungal genomes is correlated to the quality of the genome assembly, and to sequencing strategies used in the sequencing centre, highlighting different types of errors in different annotation pipelines. The ABFGP method is particularly successful in discovering sequence errors and/or disruptive mutations causing truncated and erroneous gene models. CONCLUSIONS: The ABFGP method is an accurate and fully automated quality control method for fungal gene catalogues that can be easily implemented into existing annotation pipelines. With the exponential release of new genomes, the ABFGP method will help decreasing the number of gene models that require additional manual curation

    High-throughput bioinformatics with the Cyrille2 pipeline system

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or <it>pipelines</it>. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible.</p> <p>Results</p> <p>We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1) a web based, graphical user interface (<it>GUI</it>) that enables a pipeline operator to manage the system; 2) the <it>Scheduler</it>, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution, and; 3) the <it>Executor</it>, which searches for scheduled jobs and executes these on a compute cluster.</p> <p>Conclusion</p> <p>The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high throughput, flexible bioinformatics pipelines.</p

    The Genomes of the Fungal Plant Pathogens Cladosporium fulvum and Dothistroma septosporum Reveal Adaptation to Different Hosts and Lifestyles But Also Signatures of Common Ancestry.

    Get PDF
    We sequenced and compared the genomes of the Dothideomycete fungal plant pathogensCladosporium fulvum (Cfu) (syn. Passalora fulva) and Dothistroma septosporum (Dse) that are closely related phylogenetically, but have different lifestyles and hosts. Although both fungi grow extracellularly in close contact with host mesophyll cells, Cfu is a biotroph infecting tomato, while Dse is a hemibiotroph infecting pine. The genomes of these fungi have a similar set of genes (70% of gene content in both genomes are homologs), but differ significantly in size (Cfu \u3e61.1-Mb; Dse 31.2-Mb), which is mainly due to the difference in repeat content (47.2% in Cfu versus 3.2% in Dse). Recent adaptation to different lifestyles and hosts is suggested by diverged sets of genes. Cfu contains an α-tomatinase gene that we predict might be required for detoxification of tomatine, while this gene is absent in Dse. Many genes encoding secreted proteins are unique to each species and the repeat-rich areas in Cfu are enriched for these species-specific genes. In contrast, conserved genes suggest common host ancestry. Homologs of Cfu effector genes, including Ecp2 and Avr4, are present in Dse and induce a Cf-Ecp2- and Cf-4-mediated hypersensitive response, respectively. Strikingly, genes involved in production of the toxin dothistromin, a likely virulence factor for Dse, are conserved in Cfu, but their expression differs markedly with essentially no expression by Cfu in planta. Likewise, Cfu has a carbohydrate-degrading enzyme catalog that is more similar to that of necrotrophs or hemibiotrophs and a larger pectinolytic gene arsenal than Dse, but many of these genes are not expressed in planta or are pseudogenized. Overall, comparison of their genomes suggests that these closely related plant pathogens had a common ancestral host but since adapted to different hosts and lifestyles by a combination of differentiated gene content, pseudogenization, and gene regulation

    Finished Genome of the Fungal Wheat Pathogen Mycosphaerella graminicola Reveals Dispensome Structure, Chromosome Plasticity, and Stealth Pathogenesis.

    Get PDF
    The plant-pathogenic fungus Mycosphaerella graminicola (asexual stage: Septoria tritici) causes septoria tritici blotch, a disease that greatly reduces the yield and quality of wheat. This disease is economically important in most wheat-growing areas worldwide and threatens global food production. Control of the disease has been hampered by a limited understanding of the genetic and biochemical bases of pathogenicity, including mechanisms of infection and of resistance in the host. Unlike most other plant pathogens, M. graminicola has a long latent period during which it evades host defenses. Although this type of stealth pathogenicity occurs commonly in Mycosphaerella and other Dothideomycetes, the largest class of plant-pathogenic fungi, its genetic basis is not known. To address this problem, the genome of M. graminicolawas sequenced completely. The finished genome contains 21 chromosomes, eight of which could be lost with no visible effect on the fungus and thus are dispensable. This eight-chromosome dispensome is dynamic in field and progeny isolates, is different from the core genome in gene and repeat content, and appears to have originated by ancient horizontal transfer from an unknown donor. Synteny plots of the M. graminicola chromosomes versus those of the only other sequenced Dothideomycete, Stagonospora nodorum, revealed conservation of gene content but not order or orientation, suggesting a high rate of intra-chromosomal rearrangement in one or both species. This observed “mesosynteny” is very different from synteny seen between other organisms. A surprising feature of the M. graminicolagenome compared to other sequenced plant pathogens was that it contained very few genes for enzymes that break down plant cell walls, which was more similar to endophytes than to pathogens. The stealth pathogenesis of M. graminicola probably involves degradation of proteins rather than carbohydrates to evade host defenses during the biotrophic stage of infection and may have evolved from endophytic ancestors

    At the origin of spliceosomal introns is multiplication of introner-like elements the main mechanism of intron gain in fungi

    Get PDF
    The recent discovery of introner-like elements (ILEs) in six fungal species shed new light on the origin of regular spliceosomal introns (RSIs) and the mechanism of intron gains. These novel spliceosomal introns are found in hundreds of copies, are longer than RSIs and harbor stable predicted secondary structures. Yet, they are prone to degeneration in sequence and length to become undistinguishable from RSIs, suggesting that ILEs are predecessors of most RSIs. In most fungi, other near-identical introns were found duplicated in lower numbers in the same gene or in unrelated genes, indicating that intron duplication is a widespread phenomenon. However, ILEs are associated with the majority of intron gains, suggesting that the other types of duplication are of minor importance to the overall gains of introns. Our data support the hypothesis that ILEs' multiplication corresponds to the main mechanism of intron gain in fungi

    <it>In silico </it>miRNA prediction in metazoan genomes: balancing between sensitivity and specificity

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>MicroRNAs (miRNAs), short ~21-nucleotide RNA molecules, play an important role in post-transcriptional regulation of gene expression. The number of known miRNA hairpins registered in the miRBase database is rapidly increasing, but recent reports suggest that many miRNAs with restricted temporal or tissue-specific expression remain undiscovered. Various strategies for <it>in silico </it>miRNA identification have been proposed to facilitate miRNA discovery. Notably support vector machine (SVM) methods have recently gained popularity. However, a drawback of these methods is that they do not provide insight into the biological properties of miRNA sequences.</p> <p>Results</p> <p>We here propose a new strategy for miRNA hairpin prediction in which the likelihood that a genomic hairpin is a true miRNA hairpin is evaluated based on statistical distributions of observed biological variation of properties (descriptors) of known miRNA hairpins. These distributions are transformed into a single and continuous outcome classifier called the <it>L </it>score. Using a dataset of known miRNA hairpins from the miRBase database and an exhaustive set of genomic hairpins identified in the genome of <it>Caenorhabditis elegans</it>, a subset of 18 most informative descriptors was selected after detailed analysis of correlation among and discriminative power of individual descriptors. We show that the majority of previously identified miRNA hairpins have high <it>L </it>scores, that the method outperforms miRNA prediction by threshold filtering and that it is more transparent than SVM classifiers.</p> <p>Conclusion</p> <p>The <it>L </it>score is applicable as a prediction classifier with high sensitivity for novel miRNA hairpins. The <it>L-</it>score approach can be used to rank and select interesting miRNA hairpin candidates for downstream experimental analysis when coupled to a genome-wide set of <it>in silico</it>-identified hairpins or to facilitate the analysis of large sets of putative miRNA hairpin loci obtained in deep-sequencing efforts of small RNAs. Moreover, the in-depth analyses of miRNA hairpins descriptors preceding and determining the <it>L </it>score outcome could be used as an extension to miRBase entries to help increase the reliability and biological relevance of the miRNA registry.</p

    Transcriptome and proteome analyses of proteases in biotroph fungal pathogen Cladosporium fulvum

    No full text
    Proteases are key components of the hydrolytic enzyme arsenal employed by fungal pathogens to invade their host plants. The recent advances in -omics era have facilitated identification of functional proteases involved in plant-fungus interactions. By comparison of the publically available sequences of fungal genomes we found that the number of protease genes present in the genome of Cladosporium fulvum, a biotrophic tomato pathogen, is comparable with that of hemibiotrophs. To identify host plant inducible protease genes and their products, we performed transcriptome and proteome analyses of C. fulvumin vitro and in planta by means of RNA-Seq/RT-qPCR and mass spectrometry. Transcriptome data showed that 14 out of the 59 predicted proteases are expressed during in vitro and in planta growth of C. fulvum, of which nine belong to serine proteases S8 and S10 and the rest belong to metallo- and aspartic proteases. Mass spectrometry confirmed the presence of six proteases at proteome level during plant infection. Expression of limited number of proteases by C. fulvum might sustain biotrophic growth and benefits its stealth pathogenesis

    Understanding the Effectiveness of Genomic Prediction in Tetraploid Potato

    Get PDF
    Use of genomic prediction (GP) in tetraploid is becoming more common. Therefore, we think it is the right time for a comparison of GP models for tetraploid potato. GP models were compared that contrasted shrinkage with variable selection, parametric vs. non-parametric models and different ways of accounting for non-additive genetic effects. As a complement to GP, association studies were carried out in an attempt to understand the differences in prediction accuracy. We compared our GP models on a data set consisting of 147 cultivars, representing worldwide diversity, with over 39 k GBS markers and measurements on four tuber traits collected in six trials at three locations during 2 years. GP accuracies ranged from 0.32 for tuber count to 0.77 for dry matter content. For all traits, differences between GP models that utilised shrinkage penalties and those that performed variable selection were negligible. This was surprising for dry matter, as only a few additive markers explained over 50% of phenotypic variation. Accuracy for tuber count increased from 0.35 to 0.41, when dominance was included in the model. This result is supported by Genome Wide Association Study (GWAS) that found additive and dominance effects accounted for 37% of phenotypic variation, while significant additive effects alone accounted for 14%. For tuber weight, the Reproducing Kernel Hilbert Space (RKHS) model gave a larger improvement in prediction accuracy than explicitly modelling epistatic effects. This is an indication that capturing the between locus epistatic effects of tuber weight can be done more effectively using the semi-parametric RKHS model. Our results show good opportunities for GP in 4x potato

    Functional analysis of the conserved transcriptional regulator CfWor1 in Cladosporium fulvum reveals diverse roles in the virulence of plant pathogenic fungi

    No full text
    Fungal Wor1-like proteins are conserved transcriptional regulators that are reported to regulate the virulence of several plant pathogenic fungi by affecting the expression of virulence genes. Here, we report the functional analysis of CfWor1, the homologue of Wor1 in Cladosporium fulvum. ¿cfwor1 mutants produce sclerotium-like structures and rough hyphae, which are covered with a black extracellular matrix. These mutants do not sporulate and are no longer virulent on tomato. A CE.CfWor1 transformant that constitutively expresses CfWor1 produces fewer spores with altered morphology and is also reduced in virulence. RNA-seq and RT-qrtPCR analyses suggest that reduced virulence of ¿cfwor1 mutants is due to global downregulation of transcription, translation and mitochondrial respiratory chain. The reduced virulence of the CE.CfWor1 transformant is likely due to downregulation of effector genes. Complementation of a non-virulent ¿fosge1 (Wor1-homologue) mutant of Fusarium oxysporum f. sp. lycopersici with CfWor1 restored expression of the SIX effector genes in this fungus, but not its virulence. Chimeric proteins of CfWor1/FoSge1 also only partially restored defects of the ¿fosge1 mutant, suggesting that these transcriptional regulators have functionally diverged. Altogether, our results suggest that CfWor1 primarily regulates development of C.¿fulvum, which indirectly affects the expression of a subset of virulence genes
    corecore