Search CORE

30 research outputs found

MegaGTA: a sensitive and accurate metagenomic gene-targeted assembler using iterative de Bruijn graphs

Author: A Bankevich
AM Bolger
BH Bloom
C Miller
C Yuan
Chi-Ming Leung
Dinghua Li
DR Zerbino
Hing-Fung Ting
J Pell
M Rho
N Nagarajan
PE Hart
Q Wang
Q Wang
R Chikhi
R Luo
RC Edgar
Ruibang Luo
SR Eddy
Tak-Wah Lam
Y Peng
Y Zhang
Yukun Huang
Publication venue: 'Springer Science and Business Media LLC'

Large-scale 16S gene assembly using metagenomics shotgun sequences.

Author: Chen Ting
Wang Ying
Wang Zicheng
Zeng Feng
Zhou Jizhong
Publication venue: eScholarship, University of California
Publication date: 31/01/2017

Motivation: Combining a 16S rRNA (16S) gene database with metagenomic shotgun sequences promises unbiased identification of known and novel microbes. Results: To achieve this, we report reference-based ribosome assembly (RAMBL), a computational pipeline that integrates taxonomic tree search and Dirichlet process clustering to reconstruct full-length 16S gene sequences from metagenomic sequencing data with high accuracy. By benchmarking against synthetic and real shotgun sequences, we demonstrated that full-length 16S gene assemblies of RAMBL were a good proxy for known and putative microbes, including Candidate Phyla Radiation. We found that 30-40% of bacterial genera in the terrestrial and intestinal biomes have no closely related genome sequences. We also observed that RAMBL generated a more accurate determination of environmental microbial diversity and yielded better disease classification, suggesting that full-length 16S gene assemblies are a powerful alternative to marker gene sets and 16S short reads. RAMBL is the first to provide access to full-length 16S gene sequences in near-terabase-scale metagenomic shotgun sequences, which markedly improves metagenomic data analysis and interpretation. Availability and implementation: RAMBL is available at https://github.com/homopolymer/RAMBL for academic use. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

Crossref

eScholarship - University of California

Novel canine high-quality metagenome-assembled genomes, prophages and host-associated plasmids provided by long-read metagenomics together with Hi-C proximity ligation

Author: Cuscó Anna
Francino Olga
Fábregas Norma
Pérez Daniel
Viñes Joaquim
Publication venue: 'Microbiology Society'
Publication date: 01/01/2022

The human gut microbiome has been extensively studied, yet the canine gut microbiome is still largely unknown. The availability of high-quality genomes is essential in the fields of veterinary medicine and nutrition to unravel the biological role of key microbial members in the canine gut environment. Our aim was to evaluate nanopore long-read metagenomics and Hi-C (high-throughput chromosome conformation capture) proximity ligation to provide high-quality metagenome-assembled genomes (HQ MAGs) of the canine gut environment. By combining nanopore long-read metagenomics and Hi-C proximity ligation, we retrieved 27 HQ MAGs and 7 medium-quality MAGs of a faecal sample of a healthy dog. Canine MAGs (CanMAGs) improved genome contiguity of representatives from the animal and human MAG catalogues - short-read MAGs from public datasets - for the species they represented: they were more contiguous with complete ribosomal operons and at least 18 canonical tRNAs. Both canine-specific bacterial species and gut generalists inhabit the dog's gastrointestinal environment. Most of them belonged to , followed by and . We also assembled one and one MAG. CanMAGs harboured antimicrobial-resistance genes (ARGs) and prophages and were linked to plasmids. ARGs conferring resistance to tetracycline were most predominant within CanMAGs, followed by lincosamide and macrolide ones. At the functional level, carbohydrate transport and metabolism was the most variable within the CanMAGs, and mobilome function was abundant in some MAGs. Specifically, we assigned the mobilome functions and the associated mobile genetic elements to the bacterial host. The CanMAGs harboured 50 bacteriophages, providing novel bacterial-host information for eight viral clusters, and Hi-C proximity ligation data linked the six potential plasmids to their bacterial host. 
Long-read metagenomics and Hi-C proximity ligation are likely to become a comprehensive approach to HQ MAG discovery and assignment of extra-chromosomal elements to their bacterial host. This will provide essential information for studying the canine gut microbiome in veterinary medicine and animal nutrition.

PubMed Central

Diposit Digital de Documents de la UAB

Reconstruction of full-length 16S rRNA sequences for taxonomic assignment in metagenomics

Author: Blanquart Samuel
Dufresne Yoann
Pericard Pierre
Touzet Hélène
Publication venue: HAL CCSD
Publication date: 03/07/2017

Advances in the sequencing of uncultured environmental samples raise a growing need for accurate taxonomic assignment. Accurate identification of the organisms present within a community is essential to understanding even the most elementary ecosystems. However, current high-throughput sequencing technologies generate short reads that only partially cover full-length marker genes, which poses difficult bioinformatic challenges for taxonomic identification at high resolution. We designed MATAM, software dedicated to the fast and accurate targeted assembly of short reads sequenced from a genomic marker of interest. The method implements a stepwise process based on the construction and analysis of a read overlap graph. It is applied to the assembly of 16S rRNA markers and is validated on simulated, synthetic and genuine metagenomes. We show that MATAM outperforms other available methods in terms of low error rates and recovered genome fractions and is suitable for providing improved assemblies for precise taxonomic assignments.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Illuminating the dynamic rare biosphere of the Greenland Ice Sheet's Dark Zone

Author: Cameron Karen A
Cook Joseph M
Edwards Arwyn
Gokul Jarishma K
Hegarty Matt
Hubbard Alun
Irvine-Fynn Tristram D L
Mur Luis A J
Stibal Marek
Publication date: 01/12/2019

Greenland's Dark Zone is the largest contiguous region of bare terrestrial ice in the Northern Hemisphere and microbial processes play an important role in driving its darkening and thereby amplifying melt and runoff from the ice sheet. However, the dynamics of these microbiota have not been fully identified. Here we present joint 16S rRNA gene and 16S rRNA (cDNA) comparison of input (snow), storage (cryoconite), and output (supraglacial stream water) habitats across the Dark Zone over the melt season. We reveal that all three Dark Zone communities have a preponderance of rare taxa exhibiting high protein synthesis potential (PSP). Furthermore, taxa with high PSP represent highly connected ‘bottlenecks’ within community structure, consistent with their roles as metabolic hubs. Finally, low abundance-high PSP taxa affiliated with Methylobacterium within snow and stream water suggest a novel role for Methylobacterium in the carbon cycle of Greenlandic snowpacks, and importantly, the export of potentially active methylotrophs to the bed of the Greenland Ice Sheet. By comparing the dynamics of bulk and potentially active microbiota in the Dark Zone of the Greenland Ice Sheet we provide novel insights into the mechanisms and impacts of the microbial colonization of this critical region of our melting planet

Aberystwyth Research Portal

Enlighten

Towards complete representation of bacterial contents in metagenomic samples

Author: Feng Xiaowen
Li Heng
Publication date: 22/11/2022

Background: In the metagenome assembly of a microbiome community, we might expect abundant species to be easier to assemble because of their deeper coverage. However, this conjecture is rarely tested. We often do not know how many abundant species we are missing and have no approach to recover them. Results: Here we propose k-mer-based and 16S RNA-based methods to measure the completeness of a metagenome assembly. We show that even with PacBio High-Fidelity (HiFi) reads, abundant species are often not assembled, as high strain diversity may lead to fragmented contigs. We developed a novel algorithm to recover abundant metagenome-assembled genomes (MAGs) by identifying circular assembly subgraphs. Our algorithm is reference-free and complementary to standard metagenome binning. Evaluated on 14 real datasets, it rescued many abundant species that would be missed by existing methods. Conclusions: Our work stresses the importance of metagenome completeness, which has often been overlooked. Our algorithm generates more circular MAGs and moves a step closer to the complete representation of microbiome communities.

arXiv.org e-Print Archive

Support of extended context-free grammars in the Generalised LL parsing algorithm

Author: Горохов Артем Владимирович
Gorokhov Artem
Publication date: 01/01/2017

Gorokhov Artem Vladimirovich. Support of extended context-free grammars in the Generalised LL parsing algorithm. Associate professor Semyon Vyacheslavovich Grigorev, Candidate of Physical and Mathematical Sciences. Mathematics and mechanics, software engineering department. Parsing plays an important role in static program analysis: this step creates a structural representation of the code on which further analysis is performed. Parser generator tools automate parser development from a syntax specification. Language documentation often serves as that specification, usually in the form of an ambiguous grammar in Extended Backus-Naur Form (EBNF), which most parser generators cannot process without transformation. Automatic grammar transformation generally decreases parsing performance. Some approaches support EBNF grammars natively, but none of them handles ambiguous grammars. The Generalised LL (GLL) parsing algorithm, on the other hand, admits arbitrary context-free grammars and achieves good performance, but cannot handle EBNF grammars. The main contribution of this work is a modification of the GLL algorithm that can process grammars in a form closely related to EBNF: extended context-free grammars. We also show that the modification improves parsing performance compared with the grammar-transformation-based approach. Sources cited: 32.
Gorokhov, A. V. Support of extended context-free grammars in Generalised LL parsing algorithm: graduation thesis: defended 09.06.2017 / Gorokhov Artem Vladimirovich. – St. Petersburg, 2017. – 37 pp. – Bibliography: pp. 31–34.

Lirias

Saint Petersburg State University

Unlinked rRNA genes are widespread among bacteria and archaea

Author: Albertsen Mads
Brewer Tess E.
Edwards Arwyn
Fierer Noah
Kirkegaard Rasmus
Rocha Eduardo
Publication date: 01/02/2020

Ribosomes are essential to cellular life and the genes for their RNA components are the most conserved and transcribed genes in Bacteria and Archaea. Ribosomal RNA genes are typically organized into a single operon, an arrangement thought to facilitate gene regulation. In reality, some Bacteria and Archaea do not share this canonical rRNA arrangement - their 16S and 23S rRNA genes are separated across the genome and referred to as "unlinked". This rearrangement has previously been treated as an anomaly or a byproduct of genome degradation in intracellular bacteria. Here, we leverage complete genome and long-read metagenomic data to show that unlinked 16S and 23S rRNA genes are more common than previously thought. Unlinked rRNA genes occur in many phyla, most significantly within Deinococcus-Thermus, Chloroflexi, and Planctomycetes, and occur at different frequencies across natural environments. We found that up to 41% of rRNA genes in soil were unlinked, in contrast to the human gut, where all sequenced rRNA genes were linked. The frequency of unlinked rRNA genes may reflect meaningful life history traits, as they tend to be associated with a mix of slow-growing free-living species and intracellular species. We speculate that unlinked rRNA genes may confer selective advantages in some environments, though the specific nature of these advantages remains undetermined and worthy of further investigation. More generally, the prevalence of unlinked rRNA genes in poorly studied taxa serves as a reminder that paradigms derived from model organisms do not necessarily extend to the broader diversity of Bacteria and Archaea.

Crossref

Aberystwyth Research Portal

VBN

HAL-Pasteur

Genomic evidence for sulfur intermediates as new biogeochemical hubs in a model aquatic microbial ecosystem

Author: Couture Raoul-Marie
Cruaud Perrine
Culley Alexander
Lovejoy Connie
Vigneron Adrien
Vincent Warwick F.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/02/2021

Background: The sulfur cycle encompasses a series of complex aerobic and anaerobic transformations of S-containing molecules and plays a fundamental role in cellular and ecosystem-level processes, influencing biological carbon transfers and other biogeochemical cycles. Despite their importance, the microbial communities and metabolic pathways involved in these transformations remain poorly understood, especially for inorganic sulfur compounds of intermediate oxidation states (thiosulfate, tetrathionate, sulfite, polysulfides). Isolated and highly stratified, the extreme geochemical and environmental features of meromictic ice-capped Lake A, in the Canadian High Arctic, provided an ideal model ecosystem to resolve the distribution and metabolism of aquatic sulfur cycling microorganisms along redox and salinity gradients. Results: Applying complementary molecular approaches, we identified sharply contrasting microbial communities and metabolic potentials among the markedly distinct water layers of Lake A, with similarities to diverse fresh, brackish and saline water microbiomes. Sulfur cycling genes were abundant at all depths and covaried with bacterial abundance. Genes for oxidative processes occurred in samples from the oxic freshwater layers, reductive reactions in the anoxic and sulfidic bottom waters and genes for both transformations at the chemocline. Up to 154 different genomic bins with potential for sulfur transformation were recovered, revealing a panoply of taxonomically diverse microorganisms with complex metabolic pathways for biogeochemical sulfur reactions. Genes for the utilization of sulfur cycle intermediates were widespread throughout the water column, co-occurring with sulfate reduction or sulfide oxidation pathways. 
The genomic bin composition suggested that in addition to chemical oxidation, these intermediate sulfur compounds were likely produced by the predominant sulfur chemo- and photo-oxidisers at the chemocline and by diverse microbial degraders of organic sulfur molecules. Conclusions: The Lake A microbial ecosystem provided an ideal opportunity to identify new features of the biogeochemical sulfur cycle. Our detailed metagenomic analyses across the broad physico-chemical gradients of this permanently stratified lake extend the known diversity of microorganisms involved in sulfur transformations over a wide range of environmental conditions. The results indicate that sulfur cycle intermediates and organic sulfur molecules are major sources of electron donors and acceptors for aquatic and sedimentary microbial communities in association with the classical sulfur cycle.

CorpusUL