236 research outputs found

    Genome Update. Let the consumer beware: Streptomyces genome sequence quality.

    A genome sequence assembly represents a model of a genome. This article explores some tools and methods for assessing the quality of an assembly, using publicly available data for Streptomyces species as the example. There is great variability in the quality of assemblies deposited in GenBank. Only in a small minority of these assemblies are the raw data available, enabling full appraisal of the assembly quality.

    Some (bacilli) like it hot: genomics of Geobacillus species.

    n/a. Biotechnology and Biological Sciences Research Council (BBSRC). Grant Numbers: BB/H016120/1, BB/I024631/1, BB/I025956/1, BB/K003240/2, BB/L012499/

    Recently published Streptomyces genome sequences

    This is the final version of the article. Available from Wiley via the DOI in this record. Introduction: Many readers of this journal will need no introduction to the bacterial genus Streptomyces, which includes several hundred species, many of which produce biotechnologically useful secondary metabolites. The last 2 years have seen numerous publications describing Streptomyces genome sequences (Table 1), mostly as short genome announcements restricted to just 500 words and therefore allowing little description and analysis. Our aim in this current manuscript is to survey these recent publications and to dig a little deeper where appropriate. The genus Streptomyces is now one of the most highly sequenced, with 19 finished genomic sequences (Table 2) and a further 125 draft assemblies available in the GenBank database as of 3rd of May 2014; by the time this is published, no doubt there will be more. The reasons given for sequencing this latest crop of Streptomyces include production of industrially important enzymes, degradation of lignin, antibiotic production, rapid ... James Harrison was supported by a PhD studentship from the Biotechnology and Biological Sciences Research Council.

    Draft Genome Sequences of Two Strains of Xanthomonas arboricola pv. celebensis Isolated from Banana Plants.

    Published online. We report here the annotated draft genome sequences of strains Xanthomonas arboricola pv. celebensis NCPPB 1832 and NCPPB 1630 (NCPPB, National Collection of Plant Pathogenic Bacteria), both isolated from Musa species in New Zealand. This will allow the comparison of genomes between phylogenetically distant xanthomonads that have independently converged on the ability to colonize banana plants. Biotechnology and Biological Sciences Research Council (BBSRC) provided funding to James Harrison. James Harrison was supported by a Ph.D. studentship from the Biotechnology and Biological Sciences Research Council (BBSRC). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

    Finding sRNA generative locales from high-throughput sequencing data with NiBLS.

    Journal Article. Copyright © 2010 MacLean et al; licensee BioMed Central Ltd. BACKGROUND: Next-generation sequencing technologies allow researchers to obtain millions of sequence reads in a single experiment. One important use of the technology is the sequencing of small non-coding regulatory RNAs and the identification of the genomic locales from which they originate. Currently, there is a paucity of methods for finding small RNA generative locales. RESULTS: We describe and implement an algorithm that can determine small RNA generative locales from high-throughput sequencing data. The algorithm creates a network, or graph, of the small RNAs by creating links between them depending on their proximity on the target genome. For each of the sub-networks in the resulting graph the clustering coefficient, a measure of the interconnectedness of the subnetwork, is used to identify the generative locales. We test the algorithm over a wide range of parameters using RFAM sequences as positive controls and demonstrate that the algorithm has good sensitivity and specificity in a range of Arabidopsis and mouse small RNA sequence sets and that the locales it generates are robust to differences in the choice of parameters. CONCLUSIONS: NiBLS is a fast, reliable and sensitive method for determining small RNA locales in high-throughput sequence data that is generally applicable to all classes of small RNA. Gatsby Charitable Foundation
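
    The graph-based idea in this abstract can be illustrated with a minimal sketch. It is not the NiBLS implementation: the read representation as (start, end) pairs on a single strand, the function names, the parameters max_gap, min_reads and min_cc, and the use of the mean local clustering coefficient with a fixed cut-off are all assumptions made for illustration.

```python
from itertools import combinations


def build_graph(reads, max_gap):
    """Link two reads when the gap between them is at most max_gap bases.

    reads: list of (start, end) positions on one strand of one sequence.
    A negative gap means the two reads overlap.
    """
    adj = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        (s1, e1), (s2, e2) = reads[i], reads[j]
        gap = max(s1, s2) - min(e1, e2)
        if gap <= max_gap:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def components(adj):
    """Return the connected sub-networks of the graph as sets of node ids."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def mean_clustering(adj, comp):
    """Mean local clustering coefficient over the nodes of one sub-network."""
    total = 0.0
    for node in comp:
        nbrs = adj[node]
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(comp)


def candidate_locales(reads, max_gap=10, min_reads=3, min_cc=0.6):
    """Report (start, end) spans of densely interconnected sub-networks."""
    adj = build_graph(reads, max_gap)
    locales = []
    for comp in components(adj):
        if len(comp) >= min_reads and mean_clustering(adj, comp) >= min_cc:
            span = (min(reads[i][0] for i in comp),
                    max(reads[i][1] for i in comp))
            locales.append(span)
    return sorted(locales)
```

    In this sketch a tight pile of mutually overlapping reads forms a fully connected sub-network and is reported, whereas a sparse chain of reads that only touch their immediate neighbours has a clustering coefficient of zero and is ignored.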

    Protein domains and architectural innovation in plant-associated Proteobacteria.

    Journal Article. Research Support, Non-U.S. Gov't. Copyright © 2005 Studholme et al; licensee BioMed Central Ltd. BACKGROUND: Evolution of new complex biological behaviour tends to arise by novel combinations of existing building blocks. The functional and evolutionary building blocks of the proteome are protein domains, the function of a protein being dependent on its constituent domains. We clustered completely-sequenced proteomes of prokaryotes on the basis of their protein domain content, as defined by Pfam (release 16.0). This revealed that, although there was a correlation between phylogeny and domain content, other factors also have an influence. This observation motivated an investigation of the relationship between an organism's lifestyle and the complement of domains and domain architectures found within its proteome. RESULTS: We took a census of all protein domains and domain combinations (architectures) encoded in the completely-sequenced proteobacterial genomes. Nine protein domain families were identified that are found in phylogenetically disparate plant-associated bacteria but are absent from non-plant-associated bacteria. Most of these are known to play a role in the plant-associated lifestyle, but they also included domain of unknown function DUF1427, which is found in plant symbionts and pathogens of the alpha-, beta- and gamma-Proteobacteria, but not known in any other organism. Further, several domains were identified as being restricted to phytobacteria and Eukaryotes. One example is the RolB/RolC glucosidase family, which is found only in Agrobacterium species and in plants. We identified the 0.5% of Pfam protein domain families that were most significantly over-represented in the plant-associated Proteobacteria with respect to the background frequencies in the whole set of available proteobacterial proteomes. These included guanylate cyclase, domains implicated in aromatic catabolism, cellulase and several domains of unknown function. We identified 459 unique domain architectures found in phylogenetically diverse plant pathogens and symbionts that were absent from non-pathogenic and non-symbiotic relatives. The vast majority of these were restricted to a single species or several closely related species and so their distributions could be better explained by phylogeny than by lifestyle. However, several architectures were found in two or more very distantly related phytobacteria but absent from non-plant-associated bacteria. Many of the proteins with these unique architectures are predicted to be secreted. In Pseudomonas syringae pathovar tomato, those genes encoding proteins with novel domain architectures tended to have atypical GC contents and were adjacent to insertion sequence elements and phage-like sequences, suggesting acquisition by horizontal transfer. CONCLUSIONS: By identifying domains and architectures unique to plant pathogens and symbionts, we highlighted candidate proteins for involvement in plant-associated bacterial lifestyles. Given that characterisation of novel gene products in vivo and in vitro is time-consuming and expensive, this computational approach may be useful for reducing experimental search space. Furthermore we discuss the biological significance of novel proteins highlighted by this study in the context of plant-associated lifestyles. Gatsby Charitable Foundation

    Draft Genome Sequence of Pseudomonas syringae pv. syringae ALF3 Isolated from Alfalfa.

    Published online. We report here the annotated draft genome sequence of Pseudomonas syringae pv. syringae strain ALF3, isolated in Wyoming. A comparison of this genome sequence with those of closely related strains of P. syringae adapted to other hosts will facilitate research into interactions between this pathogen and alfalfa. Biotechnology and Biological Sciences Research Council (BBSRC) provided funding to James Harrison. Funding was also provided by USDA-ARS CRIS project 5062-12210-002-00D.

    Draft genome sequences of pathotype strains for three pathovars belonging to three Xanthomonas species

    This is the final version. Available from American Society for Microbiology via the DOI in this record. We present here the draft genome sequences of type/pathotype strains for three Xanthomonas species and pathovars with different host specificities: the Hedera helix L. pathogen Xanthomonas hortorum pv. hederae WHRI 7744 (NCPPB 939T), the rice pathogen X. oryzae pv. oryzicola WHRI 5234 (NCPPB 1585), and the cotton pathogen X. citri subsp. malvacearum WHRI 5232 (NCPPB 633).

    Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects

    This is the final version of the article. Available from Nature Publishing Group via the DOI in this record. Sequence alignments form the basis for many comparative and population genomic studies. Alignment tools provide a range of accuracies dependent on the divergence between the sequences and the alignment methods. Despite widespread use, there is no standard method for assessing the accuracy of a dataset and alignment strategy after resequencing. We present a framework and tool for determining the overall accuracies of an input read dataset, alignment and SNP-calling method, provided that an isolate in that dataset has a corresponding, or closely related, reference sequence available. In addition to this tool for comparing False Discovery Rates (FDR), we include a method for determining homozygous and heterozygous positions from an alignment using binomial probabilities for an expected error rate. We benchmark this method against other SNP callers using our FDR method with three fungal genomes, finding that it was able to achieve a high level of accuracy. These tools are available at http://cfdr.sourceforge.net/. R.A.F. was funded by the Natural Environment Research Council (NERC). D.A.H. and M.C.F. were supported by the Wellcome Trust. No additional external funding received for this study.
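
    The two ideas named in this abstract, binomial genotype calls under an expected error rate and a false discovery rate for called SNPs, can be sketched as follows. This is an illustrative sketch only and not the code distributed at the URL above: the function names, the default error_rate and alpha values, and the decision rule are hypothetical choices.

```python
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def call_genotype(ref_count, alt_count, error_rate=0.01, alpha=0.001):
    """Classify one aligned position from its reference/alternate read counts.

    An allele is treated as real when the chance of seeing at least that
    many supporting reads from sequencing errors alone is below alpha
    (assumed rule, chosen for illustration).
    """
    n = ref_count + alt_count
    if n == 0:
        return "no_call"
    alt_real = binom_tail(alt_count, n, error_rate) < alpha
    ref_real = binom_tail(ref_count, n, error_rate) < alpha
    if alt_real and ref_real:
        return "heterozygous"
    if alt_real:
        return "homozygous_alt"
    if ref_real:
        return "homozygous_ref"
    return "no_call"  # too little coverage to support either allele


def false_discovery_rate(called, truth):
    """Fraction of called SNP positions that are absent from the truth set."""
    called, truth = set(called), set(truth)
    if not called:
        return 0.0
    return len(called - truth) / len(called)
```

    For example, under these assumed defaults a position covered by 15 reference and 15 alternate reads is called heterozygous, while a single stray alternate read among 30 is attributed to error and the position is called homozygous reference.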

    A highly specific tool for identification of Xanthomonas vasicola pv. musacearum based on five Xvm-specific coding sequences

    This is the final version. Available on open access from Elsevier via the DOI in this record. Xanthomonas vasicola pv. musacearum (Xvm) is a bacterial pathogen responsible for the economically important Xanthomonas wilt disease on banana and enset crops in Sub-Saharan Africa. Given that the symptoms are similar to those of other diseases, molecular diagnosis is essential to unambiguously identify this pathogen and distinguish it from closely related strains not pathogenic on these hosts. Currently, Xvm identification is based on polymerase chain reaction (PCR) with GspDm primers, targeting the gene encoding general secretory protein D. Experimental results and examination of genomic sequences revealed poor specificity of the GspDm PCR. Here, we present and validate five new Xvm-specific primers amplifying only Xvm strains. Agropolis Fondation