17 research outputs found

    VirSorter Curated Dataset genbank file package

    This zip package includes the annotated GenBank files (affiliation against PFAM and RefSeq) of all viral sequences predicted in the VirSorter Curated Dataset, organized into folders by the host of the virus (i.e. the taxonomy of the genomic data in which the viral sequence was identified).

    An Improved Whole-Cell Biosensor for the Discovery of Lignin-Transforming Enzymes in Functional Metagenomic Screens

    The discovery and utilization of biocatalysts that selectively valorize lignocellulose are critical to the profitability of next-generation biorefineries. Here, we report the development of a refactored, whole-cell, GFP-based biosensor for high-throughput identification of biocatalysts that transform lignin into specialty chemicals from environmental DNA of uncultivable archaea and bacteria. The biosensor comprises the transcriptional regulator and promoter of the emrRAB operon of <i>E. coli</i>, and the configuration of the biosensor was tuned with the aid of a mathematical model. The biosensor sensitively and selectively detects vanillin and syringaldehyde, and responds linearly over a wide detection range. We employed the biosensor to screen 42 520 fosmid clones comprising environmental DNA isolated from two coal beds and successfully identified 147 clones that transform hardwood kraft lignin to vanillin and syringaldehyde.

    <i>In Silico</i> Analysis of the Metabolic Potential and Niche Specialization of Candidate Phylum "<i>Latescibacteria</i>" (WS3)

    <div><p>The “<i>Latescibacteria</i>” (formerly WS3), a member of the Fibrobacteres–Chlorobi–Bacteroidetes (FCB) superphylum, represents a ubiquitous candidate phylum found in terrestrial, aquatic, and marine ecosystems. Recently, single-cell amplified genomes (SAGs) representing the “<i>Latescibacteria</i>” were obtained from the anoxic monimolimnion layers of Sakinaw Lake (British Columbia, Canada), and anoxic sediments of a coastal lagoon (Etoliko lagoon, Western Greece). Here, we present a detailed <i>in-silico</i> analysis of the four SAGs to gain insight into their metabolic potential and apparent ecological roles. Metabolic reconstruction suggests an anaerobic fermentative mode of metabolism, as well as the capability to degrade multiple polysaccharides and glycoproteins that represent integral components of green (Charophyta and Chlorophyta) and brown (Phaeophyceae) algal cell walls (pectin, alginate, ulvan, fucan, hydroxyproline-rich glycoproteins), storage molecules (starch and trehalose), and extracellular polymeric substances (EPSs). The analyzed SAGs also encode dedicated transporters for the uptake of produced sugars and amino acids/oligopeptides, as well as an extensive machinery for the catabolism of all transported sugars, including the production of a bacterial microcompartment (BMC) to sequester propionaldehyde, a toxic intermediate produced during fucose and rhamnose metabolism. Finally, genes for the formation of gas vesicles, flagella, type IV pili, and oxidative stress response were found, features that could aid in cellular association with algal detritus. Collectively, these results indicate that the analyzed “<i>Latescibacteria</i>” mediate the turnover of multiple complex organic polymers of algal origin that reach deeper anoxic/microoxic habitats in lakes and lagoons. The implications of such a process for our understanding of niche specialization in microbial communities mediating organic carbon turnover in stratified water bodies are discussed.</p></div>

    Import systems in “<i>Latescibacteria</i>” predicted from the SAGs.

    <p>Extracellular degradation of polymers, as detailed in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.t002" target="_blank">Table 2</a>, results in the production of monomers that could potentially be transported across the outer membrane (OM) of the “<i>Latescibacteria</i>” cell wall through non-specific outer membrane porins (OMP). In the periplasm, those monomers are then transported across the inner membrane (IM) via dedicated transporters including (1) Secondary transporters: glucosamine (GluA), galactosamine (GalA), and 5-dehydro-4-deoxy-glucosamine (5-dehydro-4-deoxy-GluA) are potentially imported using a single common transporter, ExuT. Fucose (Fuc), rhamnose (Rha), and xylose (Xyl) are imported via dedicated proton symporters, while glucose (Glu) and galactose (Gal) are imported via dedicated sodium symporters. (2) ATP-binding cassette (ABC) transporters: ribose (Rib) and arabinose (Ara) sugars, as well as oligopeptides and dipeptides, have dedicated ABC transporters with a specific periplasmic substrate-binding protein (SBP), two membrane permeases (P), and an ATPase. And (3) Phosphotransferase system (PTS) transporters: mannose (Man), fructose (Fru), galactosamine (GalN), and N-acetyl galactosamine (N-Ac-GalN) are imported via dedicated PTS transporters with a cytoplasmic enzyme I component (E-I) and membrane-associated enzyme II components (IIA, IIB, and IIC). Sugars are phosphorylated during this kind of transport. The SAGs also encode a dedicated signal transduction system and a tripartite ATP-independent transporter (TRAP) for sensing and importing, respectively, dicarboxylates, e.g. malate, and tricarboxylates, e.g. citrate, across the inner membrane. The signal transduction system is composed of the sensor histidine kinase DctB and the cytoplasmic response regulator DctD, while the TRAP transporter is composed of the periplasmic solute receptor (DctP), the membrane small permease component (DctQ), and the membrane large permease component (DctM). TonB-dependent import of vitamin B12 and iron complexes is also predicted from the SAGs. Several proteins with Plug domains could potentially act as the outer membrane receptor protein for vitamin B12 and iron complexes. Binding of the ligand to the receptor activates TonB-dependent import across the outer membrane via three proteins, TonB, ExbB, and ExbD, that couple proton motive force to ligand transport across the outer membrane. In the periplasm, vitamin B12 or iron complexes are then transported across the inner membrane via a dedicated ABC transporter.</p>

    Total number of PLs (white columns) and GHs (black columns) per Mbp of various pectinolytic and lignocellulolytic microorganisms’ genomes.

    <p>Note that, compared to other genomes, “<i>Latescibacteria</i>” SAGs are enriched in PLs as opposed to GHs. The inset shows the different PL families of SAGs S-E07 and S-B13 as a fraction of total PLs.</p>

    Metabolic reconstruction deduced from “Latescibacteria” SAGs.

    <p>Metabolism is shown for the monomers produced during extracellular degradation of polymers (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.t002" target="_blank">Table 2</a>) followed by their transport across the outer and inner membranes as shown in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.g003" target="_blank">Fig 3</a>. Three major routes are shown (depicted by red boxes) for the degradation of those monomers: the Embden-Meyerhof-Parnas (EMP) pathway, the pentose phosphate pathway (PPP), and the bacterial microcompartment (BMC) pathway. The BMC is depicted by an octahedral structure showing all reactions thought to occur inside the BMC. All possible substrates potentially supporting growth are shown in blue, predicted final products are shown in red, and reactions with substrate-level phosphorylations are shown by red arrows. Abbreviations (other than those mentioned in the <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.g003" target="_blank">Fig 3</a> legend): KDG, 2-dehydro-3-deoxy-D-gluconate; Pyr, pyruvate; Asp, aspartic acid; OAA, oxaloacetate; α-KG, α-ketoglutarate; Glu, glucose; Fru, fructose; Fru-1,6-PP, fructose-1,6-bisphosphate; DHAP, dihydroxyacetone phosphate; GAP, glyceraldehyde-3-phosphate; BPG, bisphosphoglycerate; G-3-P, 3-phosphoglycerate; G-2-P, 2-phosphoglycerate; PEP, phosphoenolpyruvate; Man, mannose; Gal, galactose; NAG, N-acetylglucosamine; NAGal, N-acetylgalactosamine; GluN, glucosamine; GalN, galactosamine; Rib, ribose; Ribu, ribulose; Xyl, xylose; Xylu, xylulose; Ara, arabinose; Rha, rhamnose; Fuc, fucose; L-Ald, lactaldehyde; 1,2-PD, 1,2-propanediol; P-ald, propionaldehyde; Prop-CoA, propionyl-CoA.</p>

    Updated taxonomic outline for candidate phylum “<i>Latescibacteria</i>” (A), and for the candidate order PBSIII_9 (B).

    <p>Neighbor-joining trees were constructed using Jukes-Cantor corrections in MEGA6-Beta2 [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.ref100" target="_blank">100</a>]. Bootstrap values (in percent) are based on 1000 replicates and are shown for branches with more than 50% bootstrap support. Numbers in parentheses represent the number of sequences in each WS3 candidate order.</p>
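The Jukes-Cantor correction mentioned in this legend converts the observed proportion of differing sites between two aligned sequences, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The following is a minimal sketch of that formula only; it does not reproduce MEGA6's implementation, and the function names `p_distance` and `jukes_cantor` and the example sequences are hypothetical illustrations.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if not 0.0 <= p < 0.75:
        # The correction is undefined once p reaches the saturation value 3/4.
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Hypothetical toy alignment: one mismatch in ten sites.
    p = p_distance("ACGTACGTAC", "ACGTACGTAA")
    print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as p, because it accounts for multiple substitutions at the same site that a raw count cannot see.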

    Schematic representation of algal cell walls.

    <p>The cell wall composition differs between various algal groups [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.ref043" target="_blank">43</a>]. Within the Charophyta (A), the wall is formed of an inner fibrillar layer made of cellulose microfibrils. The fibrillar layer is enmeshed in and surrounded by a middle amorphous matrix of pectin (homogalacturonan, HG, and rhamnogalacturonan I, RGI) that anchors the inner fibrillar cellulose layer to an outer lattice of homogalacturonan. Extracellular polymeric substances or mucilages are also present outside the outer lattice [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.ref038" target="_blank">38</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.ref043" target="_blank">43</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.ref101" target="_blank">101</a>]. Similarly, cell walls of Chlorophyta (B) contain skeletal polysaccharides enmeshed in a matrix. However, the skeletal polysaccharides in Chlorophyta cell walls form double fibrillar layers (inner layer and outer layer) with an amorphous matrix in between. The fibrillar layers vary in composition between cellulose, β-1,3-xylans or β-1,4-mannans or complex heteropolymers, and are rich in hydroxyproline-rich glycoproteins such as extensins and AGPs. The amorphous matrix polysaccharides are generally in the form of ulvans (e.g. in Ulva species). Brown algal cell walls (C) consist of a fibrillar framework of cellulose microfibrils present in layers parallel to the cell surface but with no clear orientation within each layer. Two such layers are depicted in the figure. All cellulose layers are enmeshed in acidic polysaccharides, e.g. alginates. The interfibrillar matrices are composed of alginates and fucans [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.ref041" target="_blank">41</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127499#pone.0127499.ref043" target="_blank">43</a>].</p>

    DataSheet1_Forecasting SARS-CoV-2 spike protein evolution from small data by deep learning and regression.FASTA

    The emergence of SARS-CoV-2 variants during the COVID-19 pandemic caused frequent global outbreaks that confounded public health efforts across many jurisdictions, highlighting the need for better understanding and prediction of viral evolution. Predictive models have been shown to support disease prevention efforts, such as with the seasonal influenza vaccine, but they require abundant data. For emerging viruses of concern, such models should ideally function with relatively sparse data typically encountered at the early stages of a viral outbreak. Conventional discrete approaches have proven difficult to develop due to the spurious and reversible nature of amino acid mutations and the overwhelming number of possible protein sequences adding computational complexity. We hypothesized that these challenges could be addressed by encoding discrete protein sequences into continuous numbers, effectively reducing the data size while enhancing the resolution of evolutionarily relevant differences. To this end, we developed a viral protein evolution prediction model (VPRE), which reduces amino acid sequences into continuous numbers by using an artificial neural network called a variational autoencoder (VAE) and models their most statistically likely evolutionary trajectories over time using Gaussian process (GP) regression. To demonstrate VPRE, we used a small number of early SARS-CoV-2 spike protein sequences. We show that the VAE can be trained on a synthetic dataset based on these data. To recapitulate evolution along a phylogenetic path, we used only 104 spike protein sequences and trained the GP regression with the numerical variables to project evolution up to 5 months into the future. Our predictions contained novel variants and the most frequent prediction mapped primarily to a sequence that differed by only a single amino acid from the most reported spike protein within the prediction timeframe. Novel variants in the spike receptor binding domain (RBD) were capable of binding human angiotensin-converting enzyme 2 (ACE2) in silico, with comparable or better binding than previously resolved RBD-ACE2 complexes. Together, these results indicate the utility and tractability of combining deep learning and regression to model viral protein evolution with relatively sparse datasets, toward developing more effective medical interventions.
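The regression step described in this abstract, fitting a Gaussian process to continuous variables over time and projecting them forward, can be illustrated with a small standard-library sketch. This is not the VPRE implementation: the VAE encoding is not shown, and the squared-exponential kernel, the zero-mean prior, the noise level, and the names `rbf_kernel`, `solve_linear` and `gp_posterior_mean` are all assumptions made for illustration. The input is one hypothetical latent coordinate observed at a few time points.

```python
import math


def rbf_kernel(t1, t2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two time points (assumed kernel)."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def gp_posterior_mean(train_t, train_z, query_t, length_scale=1.0, noise=1e-6):
    """Posterior mean of a zero-mean GP at query_t, given (train_t, train_z)."""
    k = [
        [rbf_kernel(a, b, length_scale) + (noise if i == j else 0.0)
         for j, b in enumerate(train_t)]
        for i, a in enumerate(train_t)
    ]
    alpha = solve_linear(k, train_z)  # alpha = (K + noise * I)^-1 z
    return [
        sum(rbf_kernel(q, t, length_scale) * a for t, a in zip(train_t, alpha))
        for q in query_t
    ]


if __name__ == "__main__":
    # Hypothetical latent coordinate sampled at months 0..3, projected to month 4.
    months = [0.0, 1.0, 2.0, 3.0]
    latent = [0.0, 0.5, 0.9, 1.2]
    print(gp_posterior_mean(months, latent, [4.0]))
```

With a zero-mean prior, predictions far from the observed time points decay back toward zero, which is one reason projections of this kind are usually limited to a short horizon. In a full pipeline, each projected latent vector would then be passed through a decoder to recover an amino acid sequence.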

    Table1_Forecasting SARS-CoV-2 spike protein evolution from small data by deep learning and regression.xlsx

    The emergence of SARS-CoV-2 variants during the COVID-19 pandemic caused frequent global outbreaks that confounded public health efforts across many jurisdictions, highlighting the need for better understanding and prediction of viral evolution. Predictive models have been shown to support disease prevention efforts, such as with the seasonal influenza vaccine, but they require abundant data. For emerging viruses of concern, such models should ideally function with relatively sparse data typically encountered at the early stages of a viral outbreak. Conventional discrete approaches have proven difficult to develop due to the spurious and reversible nature of amino acid mutations and the overwhelming number of possible protein sequences adding computational complexity. We hypothesized that these challenges could be addressed by encoding discrete protein sequences into continuous numbers, effectively reducing the data size while enhancing the resolution of evolutionarily relevant differences. To this end, we developed a viral protein evolution prediction model (VPRE), which reduces amino acid sequences into continuous numbers by using an artificial neural network called a variational autoencoder (VAE) and models their most statistically likely evolutionary trajectories over time using Gaussian process (GP) regression. To demonstrate VPRE, we used a small number of early SARS-CoV-2 spike protein sequences. We show that the VAE can be trained on a synthetic dataset based on these data. To recapitulate evolution along a phylogenetic path, we used only 104 spike protein sequences and trained the GP regression with the numerical variables to project evolution up to 5 months into the future. Our predictions contained novel variants and the most frequent prediction mapped primarily to a sequence that differed by only a single amino acid from the most reported spike protein within the prediction timeframe. Novel variants in the spike receptor binding domain (RBD) were capable of binding human angiotensin-converting enzyme 2 (ACE2) in silico, with comparable or better binding than previously resolved RBD-ACE2 complexes. Together, these results indicate the utility and tractability of combining deep learning and regression to model viral protein evolution with relatively sparse datasets, toward developing more effective medical interventions.