62 research outputs found

    Gene Network Visualization and Quantitative Synteny Analysis of more than 300 Marine T4-Like Phage Scaffolds from the GOS Metagenome

    Bacteriophages (phages) are the most abundant biological entities in the biosphere and are the dominant “organisms” in marine environments, exerting an enormous influence on marine microbial populations. Metagenomic projects, such as the Global Ocean Sampling expedition (GOS), have demonstrated the predominance of tailed phages (Caudovirales), particularly T4 superfamily cyanophages (Cyano-T4s), in the marine milieu. Whereas previous metagenomic analyses were limited to gene content information, here we present a comparative analysis of over 300 phage scaffolds assembled from the viral fraction of the GOS data. This assembly permits the examination of synteny (organization) of the genes on the scaffolds and their comparison with the genome sequences from cultured Cyano-T4s. We employ comparative genomics and a novel usage of network visualization software to show that the scaffold phylogenies are similar to those of the traditional marker genes they contain. Importantly, these uncultured metagenomic scaffolds quite closely match the organization of the “core genome” of the known Cyano-T4s. This indicates that the current view of genome architecture in the Cyano-T4s is not seriously biased by being based on a small number of cultured phages, and we can be confident that they accurately reflect the diverse population of such viruses in marine surface waters.

    Mobile Regulatory Cassettes Mediate Modular Shuffling in T4-Type Phage Genomes

    Coliphage phi1, which was isolated for phage therapy in the Republic of Georgia, is closely related to the T-like myovirus RB49. The ∼275 open reading frames encoded by each phage have an average level of amino acid identity of 95.8%. RB49 lacks 7 phi1 genes while 10 phi1 genes are missing from RB49. Most of these unique genes encode functions without known homologs. Many of the insertion, deletion, and replacement events that distinguish the two phages are in the hyperplastic regions (HPRs) of their genomes. The HPRs are rich in both nonessential genes and small regulatory cassettes (promoter-early stem-loops [PeSLs]) composed of strong σ70-like promoters and stem-loop structures, which are effective transcription terminators. Modular shuffling mediated by recombination between PeSLs has caused much of the sequence divergence between RB49 and phi1. We show that exchanges between nearby PeSLs can also create small circular DNAs that are apparently encapsidated by the virus. Such PeSL “mini-circles” may be important vectors for horizontal gene transfer.

    Arctic Ocean Microbial Community Structure before and after the 2007 Record Sea Ice Minimum

    Increasing global temperatures are having a profound impact in the Arctic, including the dramatic loss of multiyear sea ice in 2007 that has continued to the present. The majority of life in the Arctic is microbial and the consequences of climate-mediated changes on microbial marine food webs, which are responsible for biogeochemical cycling and support higher trophic levels, are unknown. We examined microbial communities over time by using high-throughput sequencing of microbial DNA collected between 2003 and 2010 from the subsurface chlorophyll maximum (SCM) layer of the Beaufort Sea (Canadian Arctic). We found that overall this layer has freshened and concentrations of nitrate, the limiting nutrient for photosynthetic production in Arctic seas, have decreased. We compared microbial communities from before and after the record September 2007 sea ice minimum and detected significant differences in communities from all three domains of life. In particular, there were significant changes in species composition of Eukarya, with ciliates becoming more common and heterotrophic marine stramenopiles (MASTs) accounting for a smaller proportion of sequences retrieved after 2007. Within the Archaea, Marine Group I Thaumarchaeota, which earlier represented up to 60% of the Archaea sequences in this layer, have declined to <10%. Bacterial communities overall were less diverse after 2007, with a significant decrease of the Bacteroidetes. These significant shifts suggest that the microbial food webs are sensitive to physical oceanographic changes such as those occurring in the Canadian Arctic over the past decade

    Denoising the Denoisers: an independent evaluation of microbiome sequence error-correction approaches

    High-depth sequencing of universal marker genes such as the 16S rRNA gene is a common strategy to profile microbial communities. Traditionally, sequence reads are clustered into operational taxonomic units (OTUs) at a defined identity threshold to avoid sequencing errors generating spurious taxonomic units. However, there have been numerous bioinformatic packages recently released that attempt to correct sequencing errors to determine real biological sequences at single nucleotide resolution by generating amplicon sequence variants (ASVs). As more researchers begin to use high resolution ASVs, there is a need for an in-depth and unbiased comparison of these novel “denoising” pipelines. In this study, we conduct a thorough comparison of three of the most widely-used denoising packages (DADA2, UNOISE3, and Deblur) as well as an open-reference 97% OTU clustering pipeline on mock, soil, and host-associated communities. We found from the mock community analyses that although they produced similar microbial compositions based on relative abundance, the approaches identified vastly different numbers of ASVs that significantly impact alpha diversity metrics. Our analysis on real datasets using recommended settings for each denoising pipeline also showed that the three packages were consistent in their per-sample compositions, resulting in only minor differences based on weighted UniFrac and Bray–Curtis dissimilarity. DADA2 tended to find more ASVs than the other two denoising pipelines when analyzing both the real soil data and two other host-associated datasets, suggesting that it could be better at finding rare organisms, but at the expense of possible false positives. The open-reference OTU clustering approach identified considerably more OTUs in comparison to the number of ASVs from the denoising pipelines in all datasets tested. 
    The three denoising approaches differed significantly in their run times, with UNOISE3 running more than 1,200 and 15 times faster than DADA2 and Deblur, respectively. Our findings indicate that, although all pipelines result in similar general community structure, the number of ASVs/OTUs and the resulting alpha-diversity metrics vary considerably and should be considered when attempting to distinguish rare organisms from possible background noise.

    The gp38 Adhesins of the T4 Superfamily: A Complex Modular Determinant of the Phage’s Host Specificity

    The tail fiber adhesins are the primary determinants of host range in the T4-type bacteriophages. Among the indispensable virion components, the sequences of the long tail fiber genes and their associated adhesins are among the most variable. The predominant form of the adhesin in the T4-type phages is not even the version of the gene encoded by T4, the archetype of the superfamily, but rather a small unrelated protein (gp38) encoded by closely related phages such as T2 and T6. This gp38 adhesin has a modular design: its N-terminal attachment domain binds at the tip of the tail fiber, whereas the C-terminal specificity domain determines its host receptor affinity. This specificity domain has a series of four hypervariable segments (HVSs) that are separated by a set of highly conserved glycine-rich motifs (GRMs) that apparently form the domain’s conserved structural core. The role of gp38’s various components was examined by a comparative analysis of a large series of gp38 adhesins from T-even superfamily phages with differing host specificities. A deletion analysis revealed that the individual HVSs and GRMs are essential to the T6 adhesin’s function and suggests that these different components all act in synergy to mediate adsorption. The evolutionary advantages of the modular design of the adhesin involving both conserved structural elements and multiple independent and easily interchanged specificity determinants are discussed

    Functional Annotation of the Ophiostoma novo-ulmi Genome: Insights into the Phytopathogenicity of the Fungal Agent of Dutch Elm Disease

    The ascomycete fungus Ophiostoma novo-ulmi is responsible for the pandemic of Dutch elm disease that has been ravaging Europe and North America for 50 years. We proceeded to annotate the genome of the O. novo-ulmi strain H327 that was sequenced in 2012. The 31.784-Mb nuclear genome (50.1% GC) is organized into 8 chromosomes containing a total of 8,640 protein-coding genes that we validated with RNA sequencing analysis. Approximately 53% of these genes have their closest match to Grosmannia clavigera kw1407, followed by 36% in other close Sordariomycetes, 5% in other Pezizomycotina, and surprisingly few (5%) orphans. A relatively small portion (~3.4%) of the genome is occupied by repeat sequences; however, the mechanism of repeat-induced point mutation appears active in this genome. Approximately 76% of the proteins could be assigned functions using Gene Ontology analysis; we identified 311 carbohydrate-active enzymes, 48 cytochrome P450s, and 1,731 proteins potentially involved in pathogen–host interaction, along with 7 clusters of fungal secondary metabolites. Complementary mating-type locus sequencing, mating tests, and culturing in the presence of elm terpenes were conducted. Our analysis identified a specific genetic arsenal impacting the sexual and vegetative growth, phytopathogenicity, and signaling/plant–defense–degradation relationship between O. novo-ulmi and its elm host and insect vectors.
    Introduction
    During the last centuries, increased movements of people and goods across countries and continents have favored the emergence and global spread of plant pathogens, insect pests, and invasive weeds which have substantially altered the landscape of several parts of the world. One well-documented example is Dutch elm disease (DED), the most destructive disease of elms. It has been estimated that over 1 billion mature elms were killed by two successive pandemics since the early 1900s (Paoletti et al. 2005).
The first pandemic, which prompted initial investigations by Dutch scientists shortly after the First World War (Holmes and Heybroek 1990), was caused by the ascomycete fungus Ophiostoma ulmi (Buisman) Nannf. As it spread relentlessly over Western Europe and, a few decade

    Evaluation of multiple protein docking structures using correctly predicted pairwise subunits

    Background: Many functionally important proteins in a cell form complexes with multiple chains. Therefore, computational prediction of multiple protein complexes is an important task in bioinformatics. In the development of multiple protein docking methods, it is important to establish a metric for evaluating prediction results in a reasonable and practical fashion. However, because only a few methods for multiple protein docking have been developed, no study has investigated how accurate structural models of multiple protein complexes should be to allow scientists to gain biological insights.
    Methods: We generated a series of predicted models (decoys) of various accuracies by our multiple protein docking pipeline, Multi-LZerD, for three multi-chain complexes with 3, 4, and 6 chains. We analyzed the decoys in terms of the number of correctly predicted pair conformations in the decoys.
    Results and conclusion: We found that pairs of chains with the correct mutual orientation exist even in the decoys with a large overall root mean square deviation (RMSD) to the native. Therefore, in addition to a global structure similarity measure, such as the global RMSD, the quality of models for multiple chain complexes can be better evaluated by using a local measurement, the number of chain pairs with correct mutual orientation. We termed the fraction of correctly predicted pairs (RMSD at the interface of less than 4.0 Å) fpair and propose to use it for evaluation of the accuracy of multiple protein docking.
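    The fpair measure described in this abstract can be illustrated with a minimal sketch. It assumes the interface RMSD of each chain pair in a decoy has already been computed elsewhere; the function name `fpair`, the dictionary layout, and the example values are hypothetical and are not taken from Multi-LZerD. Which chain pairs belong in the denominator is also an assumption here: every pair supplied in the dictionary is counted.

```python
from typing import Dict, Tuple


def fpair(interface_rmsd: Dict[Tuple[str, str], float], cutoff: float = 4.0) -> float:
    """Fraction of chain pairs whose interface RMSD (in Å) is below `cutoff`.

    `interface_rmsd` maps a chain pair, e.g. ("A", "B"), to the RMSD at that
    pair's interface relative to the native structure (hypothetical input
    format). The 4.0 Å default follows the threshold quoted in the abstract.
    """
    if not interface_rmsd:
        raise ValueError("at least one chain pair is required")
    # Count pairs with a correct mutual orientation (strictly below the cutoff).
    correct = sum(1 for rmsd in interface_rmsd.values() if rmsd < cutoff)
    return correct / len(interface_rmsd)


# Made-up values for a three-chain decoy: two of the three pairs are correct.
example_decoy = {("A", "B"): 2.1, ("A", "C"): 7.8, ("B", "C"): 3.9}
example_score = fpair(example_decoy)
```

    In this made-up example two of the three pairs fall below 4.0 Å, so the score is 2/3 even though the global RMSD of such a decoy could be large, which is the situation the abstract describes.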

    Whole-Genome Sequencing and Comparative Genomics of Three Helicobacter pylori Strains Isolated from the Stomach of a Patient with Adenocarcinoma

    Helicobacter pylori is a common pathogen associated with several severe digestive diseases. Although multiple virulence factors have been described, their role in H. pylori pathogenesis and disease progression is still unclear. Whole-genome sequencing could help to find genetic markers of virulent strains. In this work, we analyzed three complete genomes from isolates obtained at the same point in time from the stomach of a patient with adenocarcinoma, using multiple available bioinformatics tools. The genome analysis of the strains B508A-S1, B508A-T2A and B508A-T4 revealed that they were cagA, babA and sabB/hopO negative. The differences among the three genomes were mainly related to outer membrane proteins, methylases, restriction modification systems and flagellar biosynthesis proteins. The strain B508A-T2A was the only one presenting the genotype vacA s1, and had the most distinct genome, as it exhibited fewer shared genes, a higher number of unique genes, and more polymorphisms. With all the accumulated information, no significant differences were found among the isolates regarding virulence and origin. Nevertheless, some B508A-T2A genome characteristics could be linked to the pathogenicity of H. pylori. Keywords: Helicobacter pylori; genomic comparison; virulence factors; gastric adenocarcinoma