
    Natural genetic engineering: intelligence & design in evolution?

    There are many things that I like about James Shapiro's new book "Evolution: A View from the 21st Century" (FT Press Science, 2011). He begins the book by saying that it is the creation of novelty, and not selection, that is important in the history of life. In the presence of heritable traits that vary, selection results in the evolution of a population towards an optimal composition of those traits. But selection can only act on changes - and where does this variation come from? Historically, the creation of novelty has been assumed to be the result of random chance or accident. And yet, organisms seem 'designed'. When one examines the data from sequenced genomes, the changes appear NOT to be random or accidental, but one observes that whole chunks of the genome come and go. These 'chunks' often contain functional units, encoding sets of genes that together can perform some specific function. Shapiro argues that what we see in genomes is 'Natural Genetic Engineering', or designed evolution: "Thinking about genomes from an informatics perspective, it is apparent that systems engineering is a better metaphor for the evolutionary process than the conventional view of evolution as a select-biased random walk through limitless space of possible DNA configurations" (page 6)

    CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats

    BACKGROUND: Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel type of direct repeat found in a wide range of bacteria and archaea. CRISPRs are beginning to attract attention because of their proposed mechanism; that is, defending their hosts against invading extrachromosomal elements such as viruses. Existing repeat detection tools do a poor job of identifying CRISPRs due to the presence of unique spacer sequences separating the repeats. In this study, a new tool, CRT, is introduced that rapidly and accurately identifies CRISPRs in large DNA strings, such as genomes and metagenomes. RESULTS: CRT was compared to CRISPR detection tools, Patscan and Pilercr. In terms of correctness, CRT was shown to be very reliable, demonstrating significant improvements over Patscan for measures precision, recall and quality. When compared to Pilercr, CRT showed improved performance for recall and quality. In terms of speed, CRT proved to be a huge improvement over Patscan. Both CRT and Pilercr were comparable in speed, however CRT was faster for genomes containing large numbers of repeats. CONCLUSION: In this paper a new tool was introduced for the automatic detection of CRISPR elements. This tool, CRT, showed some important improvements over current techniques for CRISPR identification. CRT's approach to detecting repetitive sequences is straightforward. It uses a simple sequential scan of a DNA sequence and detects repeats directly without any major conversion or preprocessing of the input. This leads to a program that is easy to describe and understand; yet it is very accurate, fast and memory efficient, being O(n) in space and O(nm/l) in time.
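
The idea of scanning a DNA string for short repeats separated by unique spacers can be illustrated with a minimal Python sketch. This is not the CRT algorithm, whose details the abstract does not give; the function name `find_spaced_repeats` and the parameters `k`, `min_gap`, `max_gap` and `min_copies` are hypothetical choices made only for illustration.

```python
from collections import defaultdict


def find_spaced_repeats(seq, k=8, min_gap=4, max_gap=60, min_copies=3):
    """Return {kmer: [start positions]} for exact k-mers that recur with
    a bounded spacer length between consecutive copies.

    Illustrative only: exact matching, fixed repeat length k.
    """
    # Single sequential pass: record every start position of every k-mer.
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    hits = {}
    for kmer, pos in positions.items():
        best, run = [], [pos[0]]
        for p in pos[1:]:
            gap = p - (run[-1] + k)  # spacer length between two copies
            if min_gap <= gap <= max_gap:
                run.append(p)
            else:
                if len(run) > len(best):
                    best = run
                run = [p]
        if len(run) > len(best):
            best = run
        # Keep only k-mers with enough regularly spaced copies.
        if len(best) >= min_copies:
            hits[kmer] = best
    return hits


if __name__ == "__main__":
    repeat = "GTTTCAGA"
    toy = repeat + "ACGTACCTGA" + repeat + "TTGACCGGTA" + repeat + "CCATGGATCA"
    print(find_spaced_repeats(toy))
```

A real detector would also need to tolerate mismatches within repeats and extend the seed k-mer to the full repeat length; the sketch only shows how a direct scan can pick out regularly spaced copies without preprocessing the input.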

    Defining the Pseudomonas Genus: Where Do We Draw the Line with Azotobacter?

    The genus Pseudomonas has gone through many taxonomic revisions over the past 100 years, going from a very large and diverse group of bacteria to a smaller, more refined and ordered list having specific properties. The relationship of the Pseudomonas genus to Azotobacter vinelandii is examined using three genomic sequence-based methods. First, using 16S rRNA trees, it is shown that A. vinelandii groups within the Pseudomonas close to Pseudomonas aeruginosa. Genomes from other related organisms (Acinetobacter, Psychrobacter, and Cellvibrio) are outside the Pseudomonas cluster. Second, pan genome family trees based on conserved gene families also show A. vinelandii to be more closely related to Pseudomonas than other related organisms. Third, exhaustive BLAST comparisons demonstrate that the fraction of shared genes between A. vinelandii and Pseudomonas genomes is similar to that of Pseudomonas species with each other. The results of these different methods point to a high similarity between A. vinelandii and the Pseudomonas genus, suggesting that Azotobacter might actually be a Pseudomonas

    The Salmonella enterica Pan-genome

    Salmonella enterica is divided into four subspecies containing a large number of different serovars, several of which are important zoonotic pathogens and some show a high degree of host specificity or host preference. We compare 45 sequenced S. enterica genomes that are publicly available (22 complete and 23 draft genome sequences). Of these, 35 were found to be of sufficiently good quality to allow a detailed analysis, along with two Escherichia coli strains (K-12 substr. DH10B and the avian pathogenic E. coli (APEC O1) strain). All genomes were subjected to standardized gene finding, and the core and pan-genome of Salmonella were estimated to be around 2,800 and 10,000 gene families, respectively. The constructed pan-genomic dendrograms suggest that gene content is often, but not uniformly correlated to serotype. Any given Salmonella strain has a large stable core, whilst there is an abundance of accessory genes, including the Salmonella pathogenicity islands (SPIs), transposable elements, phages, and plasmid DNA. We visualize conservation in the genomes in relation to chromosomal location and DNA structural features and find that variation in gene content is localized in a selection of variable genomic regions or islands. These include the SPIs but also encompass phage insertion sites and transposable elements. The islands were typically well conserved in several, but not all, isolates—a difference which may have implications in, e.g., host specificity
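
The core and pan-genome mentioned above can be read as set operations over gene families: the core is the set of families present in every genome, and the pan-genome is the set present in at least one. The Python sketch below assumes the gene families have already been assigned per genome; the function name `core_and_pan` and the toy family labels are hypothetical and are not taken from the study.

```python
def core_and_pan(genomes):
    """Given {genome name: iterable of gene-family IDs}, return (core, pan).

    core: families found in every genome (set intersection).
    pan:  families found in at least one genome (set union).
    """
    families = [set(f) for f in genomes.values()]
    if not families:
        return set(), set()
    core = set.intersection(*families)
    pan = set.union(*families)
    return core, pan


if __name__ == "__main__":
    # Toy data with made-up family labels.
    toy = {
        "strain_A": {"fam1", "fam2", "fam3"},
        "strain_B": {"fam1", "fam2", "fam4"},
        "strain_C": {"fam1", "fam5"},
    }
    core, pan = core_and_pan(toy)
    print(len(core), len(pan))
```

In practice the hard part is the step this sketch takes for granted, namely clustering predicted genes into families consistently across genomes.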

    Large-scale comparative genomic ranking of taxonomically restricted genes (TRGs) in bacterial and archaeal genomes

    BACKGROUND: Lineage-specific, or taxonomically restricted genes (TRGs), especially those that are species and strain-specific, are of special interest because they are expected to play a role in defining exclusive ecological adaptations to particular niches. Despite this, they are relatively poorly studied and little understood, in large part because many are still orphans or only have homologues in very closely related isolates. This lack of homology confounds attempts to establish the likelihood that a hypothetical gene is expressed and, if so, to determine the putative function of the protein. METHODOLOGY/PRINCIPAL FINDINGS: We have developed "QIPP" ("Quality Index for Predicted Proteins"), an index that scores the "quality" of a protein based on non-homology-based criteria. QIPP can be used to assign a value between zero and one to any protein based on comparing its features to other proteins in a given genome. We have used QIPP to rank the predicted proteins in the proteomes of Bacteria and Archaea. This ranking reveals that there is a large amount of variation in QIPP scores, and identifies many high-scoring orphans as potentially "authentic" (expressed) orphans. There are significant differences in the distributions of QIPP scores between orphan and non-orphan genes for many genomes and a trend for less well-conserved genes to have lower QIPP scores. CONCLUSIONS: The implication of this work is that QIPP scores can be used to further annotate predicted proteins with information that is independent of homology. Such information can be used to prioritize candidates for further analysis. Data generated for this study can be found in the OrphanMine at http://www.genomics.ceh.ac.uk/orphan_mine

    Experimental Microbial Evolution of Extremophiles

    Experimental microbial evolution (EME) involves closely studying a microbial population after it has passed through a large number of generations under controlled conditions (Kussell 2013). Adaptive laboratory evolution (ALE) selects for fitness under experimentally imposed conditions (Bennett and Hughes 2009; Dragosits and Mattanovich 2013). However, experimental evolution studies focusing on the contributions of genetic drift and natural mutation rates to evolution are conducted under non-selective conditions to avoid changes imposed by selection (Hindré et al. 2012). To understand the application of experimental evolutionary methods to extremophiles, it is essential to consider the growth in this field over the last decade using model non-extremophilic microorganisms. This growth reflects both a greater appreciation of the power of experimental evolution for testing evolutionary hypotheses and, especially recently, the new power of genomic methods for analyzing changes in experimentally evolved lineages. Since many crucial processes are driven by microorganisms in nature, it is essential to understand and appreciate how microbial communities function, particularly with relevance to selection. However, many theories developed to understand microbial ecological patterns focus on the distribution and the structure of diversity within a microbial population composed of a single species (Prosser et al. 2007). Therefore an understanding of the concept of species is needed. A common definition of species using a genetic concept is a group of interbreeding individuals that is isolated from other such groups by barriers of recombination (Prosser et al. 2007). An alternative ecological species concept defines a species as a set of individuals that can be considered identical in all relevant ecological traits (Cohan 2001). This is particularly important because of the abundance and deep phylogenetic complexity of microbial communities.
Cohan postulated that “bacteria occupy discrete niches and that periodic selection will purge genetic variation within each niche without preventing divergence between the inhabitants of different niches”. The importance of gene exchange mechanisms, which are likely in bacteria and archaea and therefore in extremophiles, arises from the fact that their genomes are divided into two distinct parts, the core genome and the accessory genome (Cohan 2001). The core genome consists of genes that are crucial for the functioning of an organism, and the accessory genome consists of genes that allow adaptation to a changing ecosystem through gain and loss of function. Strains that belong to the same species can differ in the composition of accessory genes and therefore in their capability to adapt to changing ecosystems (Cohan 2001; Tettelin et al. 2005; Gill et al. 2005). Additional ecological diversity exists in plasmids, transposons and pathogenicity islands, as they can be easily shared in a favorable environment but still be absent in the same species found elsewhere (Wertz et al. 2003). This poses a major challenge for studying ALE and community microbial ecology, indicating a continued need to develop a fitting theory that connects the fluid nature of microbial communities to their ecology (Wertz et al. 2003; Coleman et al. 2006). Understanding the nature and contribution of the different processes that determine the frequencies of genes in any population is the biggest concern in population and evolutionary genetics (Prosser et al. 2007), and it is critical for an understanding of experimental evolution.

    Metagenomics of the Deep Mediterranean, a Warm Bathypelagic Habitat

    BACKGROUND: Metagenomics is emerging as a powerful method to study the function and physiology of the unexplored microbial biosphere, and is causing us to re-evaluate basic precepts of microbial ecology and evolution. Most marine metagenomic analyses have been nearly exclusively devoted to photic waters. METHODOLOGY/PRINCIPAL FINDINGS: We constructed a metagenomic fosmid library from 3,000 m-deep Mediterranean plankton, which is much warmer (approximately 14 degrees C) than waters of similar depth in open oceans (approximately 2 degrees C). We analyzed the library both by phylogenetic screening based on 16S rRNA gene amplification from clone pools and by sequencing both insert extremities of ca. 5,000 fosmids. Genome recruitment strategies showed that the majority of high scoring pairs corresponded to genomes from Rhizobiales within the Alphaproteobacteria, Cenarchaeum symbiosum, Planctomycetes, Acidobacteria, Chloroflexi and Gammaproteobacteria. We have found a community structure similar to that found in the aphotic zone of the Pacific. However, the similarities were significantly higher to the mesopelagic (500-700 m deep) in the Pacific than to the single 4000 m deep sample studied at this location. Metabolic genes were mostly related to catabolism, transport and degradation of complex organic molecules, in agreement with a prevalent heterotrophic lifestyle for deep-sea microbes. However, we observed a high percentage of genes encoding dehydrogenases and, among them, cox genes, suggesting that aerobic carbon monoxide oxidation may be important in the deep ocean as an additional energy source. 
CONCLUSIONS/SIGNIFICANCE: The comparison of metagenomic libraries from the deep Mediterranean and the Pacific ALOHA water column showed that bathypelagic Mediterranean communities more closely resemble mesopelagic communities in the Pacific, and suggested that, in the absence of light, temperature is a major stratifying factor in the oceanic water column, overriding pressure at least over 4000 m deep. Several chemolithotrophic metabolic pathways could supplement organic matter degradation in this most depleted habitat.

    Landscape of somatic mutations in 560 breast cancer whole-genome sequences.

    We analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function and another with deficient BRCA1 or BRCA2 function; the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer.