191 research outputs found

    Lateral Transfer of Genes and Gene Fragments in Prokaryotes

    Lateral genetic transfer (LGT) involves the movement of genetic material from one lineage into another and its subsequent incorporation into the new host genome via genetic recombination. Studies in individual taxa have indicated lateral origins for stretches of DNA of greatly varying length, from a few nucleotides to chromosome size. Here we analyze 1,462 sets of single-copy, putatively orthologous genes from 144 fully sequenced prokaryote genomes, asking to what extent complete genes and fragments of genes have been transferred and recombined in LGT. Using a rigorous phylogenetic approach, we find evidence for LGT in at least 476 (32.6%) of these 1,462 gene sets: 286 (19.6%) clearly show one or more “observable recombination breakpoints” within the boundaries of the open reading frame, while a further 190 (13.0%) yield trees that are topologically incongruent with the reference tree but do not contain a recombination breakpoint within the open reading frame. We refer to these gene sets as observable recombination breakpoint positive (ORB+) and negative (ORB−), respectively. The latter are prima facie instances of lateral transfer of an entire gene or beyond. We observe little functional bias between ORB+ and ORB− gene sets, but find that incorporation of entire genes is potentially more frequent in pathogens than in nonpathogens. As ORB+ gene sets are about 50% more common than ORB− sets in our data, the transfer of gene fragments has been relatively frequent, and the frequency of LGT may have been systematically underestimated in phylogenetic studies.
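The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only recomputes the percentages and the ORB+ to ORB− ratio from the reported numbers; the variable and function names are illustrative and do not come from the paper.

```python
# Counts as reported in the abstract.
total_gene_sets = 1462
orb_positive = 286   # gene sets with a breakpoint inside the open reading frame
orb_negative = 190   # incongruent trees, no breakpoint inside the open reading frame

# Gene sets with any evidence for LGT.
lgt_total = orb_positive + orb_negative


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# How much more common ORB+ sets are than ORB- sets, in percent.
orb_excess = round(100 * (orb_positive / orb_negative - 1))

print(lgt_total, pct(lgt_total, total_gene_sets))
print(pct(orb_positive, total_gene_sets), pct(orb_negative, total_gene_sets))
print(orb_excess)
```

The recomputed values agree with the abstract: 476 gene sets (32.6%), split into 19.6% ORB+ and 13.0% ORB−, with ORB+ sets roughly 50% more common than ORB− sets.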

    Recombination in Eukaryotic Single Stranded DNA Viruses

    Although single-stranded (ss) DNA viruses that infect humans and their domesticated animals do not generally cause major diseases, the arthropod-borne ssDNA viruses of plants do, and as a result seriously constrain food production in most temperate regions of the world. Besides the well-known plant- and animal-infecting ssDNA viruses, it has recently become apparent through metagenomic surveys of ssDNA molecules that there also exist large numbers of other diverse ssDNA viruses within almost all terrestrial and aquatic environments. The host ranges of these viruses probably span the tree of life and they are likely to be important components of global ecosystems. Various lines of evidence suggest that a pivotal evolutionary process during the generation of this global ssDNA virus diversity has probably been genetic recombination. High rates of homologous recombination, non-homologous recombination and genome component reassortment are known to occur within and between various ssDNA virus species, and we look here at the roles that these different types of recombination may play, both in the day-to-day biology and in the longer-term evolution of these viruses. We specifically focus on the ecological, biochemical and selective factors underlying patterns of genetic exchange detectable amongst the ssDNA viruses and discuss how these should all be considered when assessing the adaptive value of recombination during ssDNA virus evolution.

    Prevalence and Evolution of Core Photosystem II Genes in Marine Cyanobacterial Viruses and Their Hosts

    Cyanophages (cyanobacterial viruses) are important agents of horizontal gene transfer among marine cyanobacteria, the numerically dominant photosynthetic organisms in the oceans. Some cyanophage genomes carry and express host-like photosynthesis genes, presumably to augment the host photosynthetic machinery during infection. To study the prevalence and evolutionary dynamics of this phenomenon, 33 cultured cyanophages of known family and host range and viral DNA from field samples were screened for the presence of two core photosystem reaction center genes, psbA and psbD. Combining this expanded dataset with published data for nine other cyanophages, we found that 88% of the phage genomes contain psbA, and 50% contain both psbA and psbD. The psbA gene was found in all myoviruses and Prochlorococcus podoviruses, but could not be amplified from Prochlorococcus siphoviruses or Synechococcus podoviruses. Nearly all of the phages that encoded both psbA and psbD had broad host ranges. We speculate that the presence or absence of psbA in a phage genome may be determined by the length of the latent period of infection. Whether it also carries psbD may reflect constraints on coupling of viral- and host-encoded PsbA–PsbD in the photosynthetic reaction center across divergent hosts. Phylogenetic clustering patterns of these genes from cultured phages suggest that whole genes have been transferred from host to phage in a discrete number of events over the course of evolution (four for psbA, and two for psbD), followed by horizontal and vertical transfer between cyanophages. Clustering patterns of psbA and psbD from Synechococcus cells were inconsistent with other molecular phylogenetic markers, suggesting genetic exchanges involving Synechococcus lineages. Signatures of intragenic recombination, detected within the cyanophage gene pool as well as between hosts and phages in both directions, support this hypothesis. 
    The analysis of cyanophage psbA and psbD genes from field populations revealed significant sequence diversity, much of which is represented in our cultured isolates. Collectively, these findings show that photosynthesis genes are common in cyanophages and that significant genetic exchanges occur from host to phage, phage to host, and within the phage gene pool. This generates genetic diversity among the phage, which serves as a reservoir for their hosts, and in turn influences photosystem evolution.

    Whole-genome sequence analysis for pathogen detection and diagnostics

    This dissertation focuses on computational methods for improving the accuracy of commonly used nucleic acid tests for pathogen detection and diagnostics. Three specific biomolecular techniques are addressed: polymerase chain reaction, microarray comparative genomic hybridization, and whole-genome sequencing. These methods are potentially the future of diagnostics, but each requires sophisticated computational design or analysis to operate effectively. This dissertation presents novel computational methods that unlock the potential of these diagnostics by efficiently analyzing whole-genome DNA sequences. Improvements in the accuracy and resolution of each of these diagnostic tests promise more effective diagnosis of illness and rapid detection of pathogens in the environment. For designing real-time detection assays, an efficient data structure and search algorithm are presented to identify the most distinguishing sequences of a pathogen that are absent from all other sequenced genomes. Results are presented that show these "signature" sequences can be used to detect pathogens in complex samples and differentiate them from their non-pathogenic, phylogenetic near neighbors. For microarrays, novel pan-genomic design and analysis methods are presented for the characterization of unknown microbial isolates. To demonstrate the effectiveness of these methods, pan-genomic arrays are applied to the study of multiple strains of the foodborne pathogen Listeria monocytogenes, revealing new insights into the diversity and evolution of the species. Finally, multiple methods are presented for the validation of whole-genome sequence assemblies, which are capable of identifying assembly errors in even finished genomes. These validated assemblies provide the ultimate nucleic acid diagnostic, revealing the entire sequence of a genome.
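The idea of a "signature" sequence, one that is present in a target genome and absent from every other genome, can be illustrated with a naive k-mer set difference. This is only a toy sketch: the abstract does not describe the dissertation's data structure or search algorithm, so the set-based approach, the function names, and the short sequences below are all assumptions made for illustration. It also ignores reverse complements and sequencing errors.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(target, background, k):
    """Return k-mers found in `target` but in none of the `background` sequences.

    Hypothetical helper: a plain set difference, not an optimized index.
    """
    candidates = kmers(target, k)
    for genome in background:
        candidates -= kmers(genome, k)
    return candidates


# Made-up toy sequences standing in for a pathogen and its near neighbors.
pathogen = "ACGTACGGA"
near_neighbors = ["ACGTACGTT", "TTACGGTT"]

signatures = signature_kmers(pathogen, near_neighbors, k=4)
print(sorted(signatures))
```

At genome scale, holding every k-mer of every sequenced genome in memory as Python sets would not be practical, which is presumably why the dissertation emphasizes an efficient data structure for the search.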

    Genome wide evolutionary analyses reveal serotype specific patterns of positive selection in selected Salmonella serotypes

    Background: The bacterium Salmonella enterica includes a diversity of serotypes that cause disease in humans and different animal species. Some Salmonella serotypes show a broad host range, some are host restricted and exclusively associated with one particular host, and some are associated with one particular host species, but able to cause disease in other host species and are thus considered "host adapted". Five Salmonella genome sequences, representing a broad host range serotype (Typhimurium), two host restricted serotypes (Typhi [two genomes] and Paratyphi) and one host adapted serotype (Choleraesuis) were used to identify core genome genes that show evidence for recombination and positive selection. Results: Overall, 3323 orthologous genes were identified in all 5 Salmonella genomes analyzed. Use of four different methods to assess homologous recombination identified 270 genes that showed evidence for recombination with at least one of these methods (false discovery rate [FDR] <10%). After exclusion of genes with evidence for recombination, site and branch specific models identified 41 genes as showing evidence for positive selection (FDR <20%), including a number of genes with confirmed or likely roles in virulence and ompC, a gene encoding an outer membrane protein, which has also been found to be under positive selection in other bacteria. A total of 8, 16, 7, and 5 genes showed evidence for positive selection in Choleraesuis, Typhi, Typhimurium, and Paratyphi branch analyses, respectively. Sequencing and evolutionary analyses of four genes in an additional 42 isolates representing 23 serotypes confirmed branch specific positive selection and recombination patterns. Conclusion: Our data show that, among the four serotypes analyzed, (i) less than 10% of Salmonella genes in the core genome show evidence for homologous recombination, (ii) a number of Salmonella genes are under positive selection, including genes that appear to contribute to virulence, and (iii) branch specific positive selection contributes to the evolution of host restricted Salmonella serotypes.