4 research outputs found

    The reach of the genome signature in prokaryotes

    BACKGROUND: With the increased availability of sequenced genomes there have been several initiatives to infer evolutionary relationships from whole-genome characteristics. One of these studies suggested good congruence between genome synteny, shared gene content, 16S ribosomal DNA identity, codon usage and the genome signature in prokaryotes. Here we rigorously test the phylogenetic signal of the genome signature, which consists of the genome-specific relative frequencies of dinucleotides, on 334 prokaryotic genome sequences. RESULTS: Intrageneric comparisons show that, in general, the genomic dissimilarity scores are higher than in intraspecific comparisons, in accordance with the suggested phylogenetic signal of the genome signature. Exceptions to this trend (Bartonella spp., Bordetella spp., Salmonella spp. and Yersinia spp.), which have low average intrageneric genomic dissimilarity scores, suggest that members of these genera might be considered the same species. On the other hand, high genomic dissimilarity values in intraspecific analyses suggest that in some cases (e.g. Prochlorococcus marinus, Pseudomonas fluorescens, Buchnera aphidicola and Rhodopseudomonas palustris) different strains of the same species may actually represent different species. Comparing 16S rDNA identity with genomic dissimilarity values corroborates the previously suggested trend in phylogenetic signal, although the dissimilarity values provide only low resolution. CONCLUSION: The genome signature has a distinct phylogenetic signal, independent of individual genetic marker genes. A reliable phylogenetic clustering cannot be based on dissimilarity values alone, as bootstrapping is not possible for this parameter. It can, however, be used to support or refute a given phylogeny and the resulting taxonomy.
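The abstract describes the genome signature only as genome-specific relative dinucleotide frequencies and does not give a formula. The following is a minimal sketch under stated assumptions: it uses the commonly cited odds-ratio form, rho_xy = f_xy / (f_x * f_y), counted over both strands, and takes the mean absolute difference of the 16 rho values as a dissimilarity score. Both choices, and the function names `genome_signature` and `dissimilarity`, are illustrative assumptions and not necessarily the exact procedure used in the study.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]
VALID_DINUCLEOTIDES = set(DINUCLEOTIDES)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement; non-ACGT symbols are left unchanged."""
    return seq.translate(COMPLEMENT)[::-1]


def genome_signature(seq):
    """Relative dinucleotide abundances rho_xy = f_xy / (f_x * f_y).

    Assumption: counts are pooled over the sequence and its reverse
    complement, and any dinucleotide containing a non-ACGT symbol is skipped.
    """
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, reverse_complement(seq)):
        mono.update(base for base in strand if base in BASES)
        pairs = (strand[i:i + 2] for i in range(len(strand) - 1))
        di.update(pair for pair in pairs if pair in VALID_DINUCLEOTIDES)

    mono_total = sum(mono.values())
    di_total = sum(di.values())
    if mono_total == 0 or di_total == 0:
        raise ValueError("sequence contains no usable A/C/G/T dinucleotides")

    signature = {}
    for pair in DINUCLEOTIDES:
        expected = (mono[pair[0]] / mono_total) * (mono[pair[1]] / mono_total)
        observed = di[pair] / di_total
        # A base that never occurs gives expected == 0; report 0.0 in that case.
        signature[pair] = observed / expected if expected > 0 else 0.0
    return signature


def dissimilarity(seq_a, seq_b):
    """Mean absolute difference between two signatures (assumed score)."""
    sig_a, sig_b = genome_signature(seq_a), genome_signature(seq_b)
    return sum(abs(sig_a[p] - sig_b[p]) for p in DINUCLEOTIDES) / len(DINUCLEOTIDES)


if __name__ == "__main__":
    # Toy sequences for illustration only; real input would be whole genomes.
    print(dissimilarity("ATGCGCGATATCGCGCTTAAGC", "ATATTTAAATATTATAAATTTA"))
```

Because counts are pooled over both strands, a sequence and its reverse complement yield the same signature, so the score does not depend on which strand was deposited.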

    Compositional discordance between prokaryotic plasmids and host chromosomes

    BACKGROUND: Most plasmids depend on the host replication machinery and possess partitioning genes. These properties confine plasmids to a limited range of hosts, yielding a close and presumably stable relationship between plasmid and host. Hence, it is anticipated that, due to amelioration, the dinucleotide composition of plasmids is similar to that of the genome of their hosts. However, plasmids are also thought to play a major role in horizontal gene transfer and thus are frequently exchanged between hosts, which suggests dinucleotide composition dissimilarity between plasmid and host genome. We compared the dinucleotide composition of a large collection of plasmids with that of their host genomes to shed more light on this enigma. RESULTS: The dinucleotide frequency, coined the genome signature, facilitates the identification of putative horizontally transferred DNA in complete genome sequences, since it was found to be typical for a given genome and similar between related species. By comparing the genome signature of 230 plasmid sequences with that of the genome of each respective host, we found that in general the genome signature of plasmids is dissimilar from that of their host genome. CONCLUSION: Our results show that the genome signature of plasmids does not resemble that of their host genome. This indicates either absence of amelioration or a less stable relationship between plasmids and their host. We propose an indiscriminate lifestyle for plasmids, which preserves the genome signature discordance between these episomes and host chromosomes.

    Initial steps towards a production platform for DNA sequence analysis on the grid

    ABSTRACT: BACKGROUND: Bioinformatics is confronted with a new data explosion due to the availability of high-throughput DNA sequencers. Data storage and analysis become a problem on local servers, making a switch to other IT infrastructures necessary. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. RESULTS: In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available to the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. CONCLUSIONS: The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next-generation sequencers. We currently use this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl