CAMISIM: Simulating metagenomes and microbial communities

Author: Belmann P
Bremges A
Dahms E
Darling AE
Demaere MZ
Dröge J
Fiedler J
Fritz A
Hofmann P
Lesker TR
Majda S
McHardy AC
Sczyrba A
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/04/2018

© 2019 The Author(s). Background: Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets is required. Results: We describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series, and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes, we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM. Conclusions: CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with standards of truth for method evaluation.

Helmholtz Zentrum für Infektionsforschung Repository

OPUS - University of Technology Sydney

Directory of Open Access Journals

Publications at Bielefeld University

Critical Assessment of Metagenome Interpretation: A benchmark of metagenomics software

Author: A Mikheenko
Aaron E Darling
Adrian Fritz
Alexander Sczyrba
Alexey Gurevich
Alice C McHardy
Andreas Bremges
B Liu
Bernhard Y Renard
Bertrand Denis
Burton K H Chia
C Lozupone
Charles Deltel
Chirag Jain
Christopher Quince
Claire Lemaitre
D Coil
D Koslicki
D Li
D Turaev
Daniel A Cuevas
David Koslicki
DD Kang
DE Wood
DH Huson
Dmitrij Turaev
Dominique Lavenier
Dongwan Don Kang
E Pruesse
Edward M Rubin
Eik Dahms
Fernando Meyer
Genivaldo Gueiros Z Silva
GG Silva
Guillaume Rizk
H Klingenberg
Hans-Peter Klenk
Heiner Klingenberg
HH Lin
Hsin-Hung Lin
I Gregor
Ivan Gregor
J Alneberg
J Dröge
JA Chapman
Jeff L Froula
Jeffrey J Cook
Jessika Fiedler
Johannes Dröge
Julia A Vorholt
K Mavromatis
KT Konstantinidis
Lars Hestbjerg Hansen
M Arumugam
M Balvočiūtė
M Strous
M Yassour
Marc Strous
Markus Göker
Matthew Z DeMaere
Michael Beckstette
Michael D Barton
Mihai Pop
ML Bendall
Monika Balvočiūtė
N Kashtan
N Sangwan
N Segata
Nicole Shapiro
Nikos C Kyrpides
Niranjan Nagarajan
NP Nguyen
O Koren
P Belmann
Paul Schulze-Lefert
Peter Belmann
Peter Hofmann
Peter Meinicke
Philip D Blood
Pierre Peterlongo
R Chikhi
R Ounit
Rayan Chikhi
Robert A Edwards
Robert Egan
RR Miller
Ruben Garrido-Oter
S Boisvert
S Chatterjee
S Gao
S Lindgreen
S Sunagawa
Stefan Janssen
Stephan Majda
Steven W Singer
Surya Saha
Søren J Sørensen
T Thomas
Tanja Woyke
Thomas Lingner
Thomas Rattei
Tue Sparholt Jørgensen
V Marx
VC Piro
Vitor C Piro
Y Bai
Yang Bai
Yu-Chieh Liao
Yu-Wei Wu
YW Wu
Zhong Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones, and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.

Roskilde Universitet

HAL Descartes

Warwick Research Archives Portal Repository

MPG.PuRe

Hal-Diderot

Repository for Publications and Research Data

Crossref

National Health Research Institutes

OPUS - University of Technology Sydney

INRIA a CCSD electronic archive server

Copenhagen University Research Information System

eScholarship - University of California