6,939 research outputs found

    Bacterial riboproteogenomics : the era of N-terminal proteoform existence revealed

    With the rapid increase in the number of sequenced prokaryotic genomes, relying on automated gene annotation became a necessity. Multiple lines of evidence, however, suggest that current bacterial genome annotations may contain inconsistencies and are incomplete, even for so-called well-annotated genomes. Here we discuss underexplored sources of protein diversity and new methodologies for high-throughput genome re-annotation. The expression of multiple molecular forms of proteins (proteoforms) from a single gene, particularly driven by alternative translation initiation, is gaining interest as a prominent contributor to bacterial protein diversity. Consequently, riboproteogenomic pipelines were proposed to comprehensively capture proteoform expression in prokaryotes by the complementary use of (positional) proteomics and the direct readout of translated genomic regions using ribosome profiling. To complement these discoveries, tailored strategies are required for the functional characterization of newly discovered bacterial proteoforms.

    Ribosome signatures aid bacterial translation initiation site identification

    Background: While methods for annotation of genes are increasingly reliable, the exact identification of translation initiation sites remains a challenging problem. Since the N-termini of proteins often contain regulatory and targeting information, developing a robust method for start site identification is crucial. Ribosome profiling reads show distinct patterns of read length distributions around translation initiation sites. These patterns are typically lost in standard ribosome profiling analysis pipelines, when reads from footprints are adjusted to determine the specific codon being translated. Results: Utilising these signatures in combination with nucleotide sequence information, we build a model capable of predicting translation initiation sites and demonstrate its high accuracy using N-terminal proteomics. Applying this to prokaryotic translatomes, we re-annotate translation initiation sites and provide evidence of N-terminal truncations and extensions of previously annotated coding sequences. These re-annotations are supported by the presence of structural and sequence-based features next to N-terminal peptide evidence. Finally, our model identifies 61 novel genes previously undiscovered in the Salmonella enterica genome. Conclusions: Signatures within ribosome profiling read length distributions can be used in combination with nucleotide sequence information to provide accurate genome-wide identification of translation initiation sites.

    Fecal contamination of drinking-water in low- and middle-income countries: a systematic review and meta-analysis

    Background: Access to safe drinking-water is a fundamental requirement for good health and is also a human right. Global access to safe drinking-water is monitored by WHO and UNICEF using as an indicator “use of an improved source,” which does not account for water quality measurements. Our objectives were to determine whether water from “improved” sources is less likely to contain fecal contamination than “unimproved” sources and to assess the extent to which contamination varies by source type and setting. Methods and findings: Studies in Chinese, English, French, Portuguese, and Spanish were identified from online databases, including PubMed and Web of Science, and grey literature. Studies in low- and middle-income countries published between 1990 and August 2013 that assessed drinking-water for the presence of Escherichia coli or thermotolerant coliforms (TTC) were included provided they associated results with a particular source type. In total 319 studies were included, reporting on 96,737 water samples. The odds of contamination within a given study were considerably lower for “improved” sources than “unimproved” sources (odds ratio [OR] = 0.15 [0.10–0.21], I² = 80.3% [72.9–85.6]). However, in 38% of 191 studies, over a quarter of samples from improved sources contained fecal contamination. Water sources in low-income countries (OR = 2.37 [1.52–3.71]; p<0.001) and rural areas (OR = 2.37 [1.47–3.81]; p<0.001) were more likely to be contaminated. Studies rarely reported stored water quality or sanitary risks and few achieved robust random selection. Safety may be overestimated due to infrequent water sampling and deterioration in quality prior to consumption. Conclusion: Access to an “improved source” provides a measure of sanitary protection but does not ensure water is free of fecal contamination, nor is it consistent between source types or settings. International estimates therefore greatly overstate use of safe drinking-water and do not fully reflect disparities in access. An enhanced monitoring strategy would combine indicators of sanitary protection with measures of water quality.
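
For readers unfamiliar with the odds ratios quoted in this abstract, the following is a minimal sketch of how a single-study odds ratio and an approximate 95% confidence interval can be computed from a 2x2 table. It is not the pooled meta-analytic estimate the authors report; the function name `odds_ratio`, the log-based (Woolf) interval, and all counts are assumptions chosen purely for illustration.

```python
import math


def odds_ratio(improved_pos, improved_neg, unimproved_pos, unimproved_neg, z=1.96):
    """Return (OR, lower, upper) for one 2x2 table.

    The interval uses the standard log-odds (Woolf) approximation.
    This is a single-study calculation, not a pooled meta-analysis.
    """
    # Odds of contamination in each group, then their ratio.
    or_value = (improved_pos * unimproved_neg) / (improved_neg * unimproved_pos)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(
        1 / improved_pos + 1 / improved_neg + 1 / unimproved_pos + 1 / unimproved_neg
    )
    lower = math.exp(math.log(or_value) - z * se)
    upper = math.exp(math.log(or_value) + z * se)
    return or_value, lower, upper


# Hypothetical counts (not taken from the review):
# 30 of 100 improved-source samples contaminated,
# 75 of 100 unimproved-source samples contaminated.
or_value, lower, upper = odds_ratio(30, 70, 75, 25)
print(f"OR = {or_value:.2f} [{lower:.2f}, {upper:.2f}]")
```

An odds ratio below 1 with an interval that excludes 1, as in this made-up table, corresponds to lower odds of contamination in the improved-source group.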

    CoBaltDB: Complete bacterial and archaeal orfeomes subcellular localization database and associated resources

    BACKGROUND: The functions of proteins are strongly related to their localization in cell compartments (for example the cytoplasm or membranes), but the experimental determination of the sub-cellular localization of proteomes is laborious and expensive. A fast and low-cost alternative approach is in silico prediction, based on features of the protein primary sequences. However, biologists are confronted with a very large number of computational tools that use different methods and address various localization features with diverse specificities and sensitivities. As a result, exploiting these computer resources to predict protein localization accurately involves querying all tools and comparing every prediction output; this is a painstaking task. Therefore, we developed a comprehensive database, called CoBaltDB, that gathers all prediction outputs concerning complete prokaryotic proteomes. DESCRIPTION: The current version of CoBaltDB integrates the results of 43 localization predictors for 784 complete bacterial and archaeal proteomes (2,548,292 proteins in total). CoBaltDB supplies a simple user-friendly interface for retrieving and exploring relevant information about predicted features (such as signal peptide cleavage sites and transmembrane segments). Data are organized into three work-sets ("specialized tools", "meta-tools" and "additional tools"). The database can be queried using the organism name, a locus tag or a list of locus tags and may be browsed using numerous graphical and text displays. CONCLUSIONS: With its new functionalities, CoBaltDB is a novel powerful platform that provides easy access to the results of multiple localization tools and support for predicting prokaryotic protein localizations with higher confidence than previously possible. CoBaltDB is available at http://www.umr6026.univ-rennes1.fr/english/home/research/basic/software/cobalten

    Microbial carbon use efficiency predicted from genome-scale metabolic models

    Respiration by soil bacteria and fungi is one of the largest fluxes of carbon (C) from the land surface. Although this flux is a direct product of microbial metabolism, controls over metabolism and their responses to global change are a major uncertainty in the global C cycle. Here, we explore an in silico approach to predict bacterial C-use efficiency (CUE) for over 200 species using genome-specific constraint-based metabolic modeling. We find that potential CUE averages 0.62 ± 0.17 with a range of 0.22 to 0.98 across taxa and phylogenetic structuring at the subphylum levels. Potential CUE is negatively correlated with genome size, while taxa with larger genomes are able to access a wider variety of C substrates. Incorporating the range of CUE values reported here into a next-generation model of soil biogeochemistry suggests that these differences in physiology across microbial taxa can feed back on soil-C cycling.

    Costless metabolic secretions as drivers of interspecies interactions in microbial ecosystems

    Metabolic exchange mediates interactions among microbes, helping explain diversity in microbial communities. As these interactions often involve a fitness cost, it is unclear how stable cooperation can emerge. Here we use genome-scale metabolic models to investigate whether the release of “costless” metabolites (i.e. those that cause no fitness cost to the producer) can be a prominent driver of intermicrobial interactions. By performing over 2 million pairwise growth simulations of 24 species in a combinatorial assortment of environments, we identify a large space of metabolites that can be secreted without cost, thus generating ample cross-feeding opportunities. In addition to providing an atlas of putative interactions, we show that anoxic conditions can promote mutualisms by providing more opportunities for exchange of costless metabolites, resulting in an overrepresentation of stable ecological network motifs. These results may help identify interaction patterns in natural communities and inform the design of synthetic microbial consortia. We thank Dr. Niels Klitgord for pioneering ideas that inspired launch of this work. We are also grateful to David Bernstein, Joshua E. Goldford, Meghan Thommes, Demetrius DiMucci, and all members of the Segre Lab for helpful discussions. A.R.P. is supported by a National Academies of Sciences, Engineering, and Medicine Ford Foundation Predoctoral Fellowship and a Howard Hughes Medical Institute Gilliam Fellowship. This work was supported by funding from the Defense Advanced Research Projects Agency (purchase request no. HR0011515303, contract no. HR0011-15-C-0091), the U.S. Department of Energy (grants DE-SC0004962 and DE-SC0012627), the NIH (grants 5R01DE024468, R01GM121950, and Sub_P30DK036836_P&F), the National Science Foundation (grants 1457695 and NSFOCE-BSF 1635070), MURI Grant W911NF-12-1-0390, the Human Frontiers Science Program (grant RGP0020/2016), and the Boston University Inter-disciplinary Biomedical Research Office.

    RNA-Seq Data-Mining Allows the Discovery of Two Long Non-Coding RNA Biomarkers of Viral Infection in Humans

    There is a growing interest in unraveling gene expression mechanisms leading to viral host invasion and infection progression. Current findings reveal that long non-coding RNAs (lncRNAs) are implicated in the regulation of the immune system by influencing gene expression through a wide range of mechanisms. By mining whole-transcriptome shotgun sequencing (RNA-seq) data using machine learning approaches, we detected two lncRNAs (ENSG00000254680 and ENSG00000273149) that are downregulated in a wide range of viral infections and different cell types, including blood mononuclear cells, umbilical vein endothelial cells, and dermal fibroblasts. The efficiency of these two lncRNAs was positively validated in different viral phenotypic scenarios. These two lncRNAs showed a strong downregulation in virus-infected patients when compared to healthy control transcriptomes, indicating that these biomarkers are promising targets for infection diagnosis. To the best of our knowledge, this is the very first study using host lncRNA biomarkers for the diagnosis of human viral infections. This study received support from the Instituto de Salud Carlos III: project GePEM (Instituto de Salud Carlos III(ISCIII)/PI16/01478/Cofinanciado FEDER), DIAVIR (Instituto de Salud Carlos III(ISCIII)/DTS19/00049/Cofinanciado FEDER; Proyecto de Desarrollo Tecnológico en Salud) and Resvi-Omics (Instituto de Salud Carlos III(ISCIII)/PI19/01039/Cofinanciado FEDER) and project BI-BACVIR (PRIS-3; Agencia de Conocimiento en Salud (ACIS), Servicio Gallego de Salud (SERGAS), Xunta de Galicia; Spain) given to A.S.; and project ReSVinext (Instituto de Salud Carlos III(ISCIII)/PI16/01569/Cofinanciado FEDER), and Enterogen (Instituto de Salud Carlos III(ISCIII)/PI19/01090/Cofinanciado FEDER) given to F.M.-TS

    Highly sensitive quantitative phase microscopy and deep learning aided with whole genome sequencing for rapid detection of infection and antimicrobial resistance

    Current state-of-the-art infection and antimicrobial resistance (AMR) diagnostics are based on culture-based methods with a detection time of 48–96 h. Therefore, it is essential to develop novel methods that can deliver real-time diagnoses. Here, we demonstrate that the complementary use of a label-free optical assay with whole-genome sequencing (WGS) can enable rapid diagnosis of infection and AMR. Our assay is based on microscopy methods exploiting label-free, highly sensitive quantitative phase microscopy (QPM) followed by deep convolutional neural network-based classification. The workflow was benchmarked on 21 clinical isolates from four WHO priority pathogens that were antibiotic susceptibility tested, and their AMR profile was determined by WGS. The proposed optical assay was in good agreement with the WGS characterization. Classification was accurate by Gram staining (100% recall for Gram-negative and 83.4% for Gram-positive), species (98.6%), and resistant/susceptible type (96.4%), as well as at the individual strain level (100% sensitivity in predicting 19 out of the 21 strains, with an overall accuracy of 95.45%). The results from this initial proof-of-concept study demonstrate the potential of the QPM assay as a rapid, first-stage tool for species and strain-level classification and for detecting the presence or absence of AMR, which WGS can follow up for confirmation. Overall, a combined workflow with QPM and WGS, complemented with deep learning data analyses, could in the future be transformative for detecting and identifying pathogens and for characterizing the AMR profile and antibiotic susceptibility.