    The eukaryotic genome, its reads, and the unfinished assembly

    In recent years, readily affordable short-read sequences provided by next-generation sequencing (NGS) have become longer and more accurate. This has led to a jump in interest in the utility of NGS-only approaches for exploring eukaryotic genomes. The concept of a static, 'finished' genome assembly, which still appears to be a faraway goal for many eukaryotes, is yielding to new paradigms. Here we motivate an object-view concept in which the raw reads are the main, fixed object, and assemblies with their annotations take the role of dynamically changing, modifiable views of that object. © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

    The complex task of choosing a de novo assembly: Lessons from fungal genomes

    Selecting the values of parameters used by de novo genomic assembly programs, or choosing an optimal de novo assembly from several runs obtained with different parameters or programs, are tasks that can require complex decision-making. A key parameter that must be supplied to typical next-generation sequencing (NGS) assemblers is the k-mer length, i.e., the word size that determines which de Bruijn graph the program should map out and use. The topic of assembly selection criteria was recently revisited in the Assemblathon 2 study (Bradnam et al., 2013). Although no clear message was delivered with regard to optimal k-mer lengths, it was shown with examples that it is sometimes important to decide whether one is most interested in optimizing the sequences of protein-coding genes (the gene space) or in optimizing the whole genome sequence including the intergenic DNA, as what is best for one criterion may not be best for the other. In the present study, our aim was to better understand how the assembly of unicellular fungi (which are typically intermediate in size and complexity between prokaryotes and metazoan eukaryotes) can change as one varies the k-mer values over a wide range. We used two different de novo assembly programs (SOAPdenovo2 and ABySS), and simple assembly metrics that also focused on success in assembling the gene space and repetitive elements. A recent increase in Illumina read length to around 150 bp allowed us to attempt de novo assemblies with a larger range of k-mers, up to 127 bp. We applied these methods to Illumina paired-end sequencing read sets of fungal strains of Paracoccidioides brasiliensis and other species. By visualizing the results in simple plots, we were able to track the effect of changing k-mer size and assembly program, and to demonstrate how such plots can readily reveal discontinuities or other unexpected characteristics that assembly programs can present in practice, especially when they are used in a traditional molecular microbiology laboratory with a 'genomics corner'. Here we propose and apply a component of a first-pass validation methodology for benchmarking and understanding fungal genome de novo assembly processes. © 2014 Elsevier Ltd. All rights reserved.

    Draft genome sequences of two Sporothrix schenckii clinical isolates associated with human sporotrichosis in Colombia

    Sporothrix schenckii is a thermodimorphic fungal pathogen with high genetic diversity. In this work, we present the assembly and similarity analysis of the whole-genome sequences of two clinical isolates of S. schenckii sensu stricto from Colombia. © 2018 Gomez et al.
