Species abundance information improves sequence taxonomy classification accuracy.

Authors: Bokulich Nicholas A; Caporaso J Gregory; Huttley Gavin A; Kaehler Benjamin D; Knight Rob; McDonald Daniel
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

Popular naive Bayes taxonomic classifiers for amplicon sequences assume that all species in the reference database are equally likely to be observed. We demonstrate that classification accuracy degrades linearly with the degree to which that assumption is violated, and in practice it is always violated. By incorporating environment-specific taxonomic abundance information, we demonstrate a significant increase in the species-level classification accuracy across common sample types. At the species level, overall average error rates decline from 25% to 14%, which is favourably comparable to the error rates that existing classifiers achieve at the genus level (16%). Our findings indicate that for most practical purposes, the assumption that reference species are equally likely to be observed is untenable. q2-clawback provides a straightforward alternative for samples from common environments.
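
As a rough illustration of the idea in the abstract above, the sketch below scores a query read under a toy k-mer naive Bayes model, once with uniform class priors and once with abundance-derived priors. It is not the q2-clawback or QIIME 2 implementation: the function names (kmers, train, classify), the toy sequences, and the prior values are all hypothetical and chosen only to show how the prior enters the score.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Return the overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(references, k=3):
    """Count k-mers per taxon. `references` maps taxon name -> sequence."""
    return {taxon: Counter(kmers(seq, k)) for taxon, seq in references.items()}


def classify(query, model, priors, k=3, alpha=1.0):
    """Return the taxon with the highest log posterior for `query`.

    `priors` maps taxon -> prior probability; `alpha` is Laplace smoothing.
    Ties are resolved in favour of the taxon listed first in `model`.
    """
    vocab_size = len({km for counts in model.values() for km in counts})
    best_taxon, best_score = None, -math.inf
    for taxon, counts in model.items():
        total = sum(counts.values())
        score = math.log(priors[taxon])  # the prior is the only abundance-aware term
        for km in kmers(query, k):
            score += math.log((counts[km] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon


# Toy data (hypothetical): two species that differ only outside the query region.
references = {"species_a": "ACGTACGTTT", "species_b": "ACGTACGTTA"}
model = train(references)
query = "ACGTACG"

uniform = {"species_a": 0.5, "species_b": 0.5}
weighted = {"species_a": 0.1, "species_b": 0.9}  # assumed environment-specific abundances

uniform_call = classify(query, model, uniform)
weighted_call = classify(query, model, weighted)
print(uniform_call, weighted_call)
```

In this toy case both references contain every k-mer of the query equally often, so the likelihoods tie and only the prior separates the two species. With uniform priors the call falls back to whichever taxon is listed first; with the assumed abundance weights it goes to the more abundant species.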

Repository for Publications and Research Data

eScholarship - University of California

Associations among Wine Grape Microbiome, Metabolome, and Fermentation Behavior Suggest Microbial Contribution to Regional Wine Characteristics.

Authors: Allen Greg; Bokulich Nicholas A; Collins Thomas S; Ebeler Susan E; Heymann Hildegarde; Masarweh Chad; Mills David A
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Regionally distinct wine characteristics (terroir) are an important aspect of wine production and consumer appreciation. Microbial activity is an integral part of wine production, and grape and wine microbiota present regionally defined patterns associated with vineyard and climatic conditions, but the degree to which these microbial patterns associate with the chemical composition of wine is unclear. Through a longitudinal survey of over 200 commercial wine fermentations, we demonstrate that both grape microbiota and wine metabolite profiles distinguish viticultural area designations and individual vineyards within Napa and Sonoma Counties, California. Associations among wine microbiota and fermentation characteristics suggest new links between microbiota, fermentation performance, and wine properties. The bacterial and fungal consortia of wine fermentations, composed from vineyard and winery sources, correlate with the chemical composition of the finished wines and predict metabolite abundances in finished wines using machine learning models. The use of postharvest microbiota as an early predictor of wine chemical composition is unprecedented and potentially poses a new paradigm for quality control of agricultural products. These findings add further evidence that microbial activity is associated with wine terroir. IMPORTANCE Wine production is a multi-billion-dollar global industry for which microbial control and wine chemical composition are crucial aspects of quality. Terroir is an important feature of consumer appreciation and wine culture, but the many factors that contribute to terroir are nebulous. We show that grape and wine microbiota exhibit regional patterns that correlate with wine chemical composition, suggesting that the grape microbiome may influence terroir. In addition to enriching our understanding of how growing region and wine properties interact, this may provide further economic incentive for agricultural and enological practices that maintain regional microbial biodiversity.

Repository for Publications and Research Data

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Explanatory fictions—for real?

Authors: A Bokulich; B Kment; C Hitchcock; D Harker; D Lewis; J Dunn; J Woodward; R Wasserman; Samuel Schindler
Publication venue: Springer Science and Business Media LLC

Crossref

mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

Authors: Arron Shiffer; Benjamin Wolfe; Corinne F. Maurice; J. Gregory Caporaso; Jai Ram Rideout; Josh D. Neufeld; Nicholas A. Bokulich; Peter J. Turnbaugh; Rachel J. Dutton; Rob Knight; William G. Mercurio
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

Repository for Publications and Research Data

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Species abundance information improves sequence taxonomy classification accuracy

Authors: Bokulich Nicholas A; Caporaso J Gregory; Huttley Gavin; Kaehler Benjamin; Knight Rob; McDonald Daniel
Publication venue: Springer Science and Business Media LLC
Publication date: 01/10/2019

Repository for Publications and Research Data

eScholarship - University of California

The Australian National University

Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences

Authors: Benjamin D. Kaehler; Michal Ziemski; Nicholas A. Bokulich; Treepop Wisanwanichthan
Publication venue: Frontiers Media SA
Publication date: 01/06/2021

Naive Bayes classifiers (NBC) have dominated the field of taxonomic classification of amplicon sequences for over a decade. Apart from having runtime requirements that allow them to be trained and used on modest laptops, they have persistently provided class-topping classification accuracy. In this work we compare NBC with random forest classifiers, neural network classifiers, and a perfect classifier that can only fail when different species have identical sequences, and find that in some practical scenarios there is little scope for improving on NBC for taxonomic classification of 16S rRNA gene sequences. Further improvements in taxonomy classification are unlikely to come from novel algorithms alone, and will need to leverage other technological innovations, such as ecological frequency information.
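
The "perfect classifier" in the abstract above can fail only where different species share an identical sequence. One way to read that as an accuracy ceiling is sketched below, assuming each distinct sequence is assigned to the species it most often belongs to in a labelled set. This is an illustrative interpretation and not the authors' code; accuracy_ceiling and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def accuracy_ceiling(labelled_sequences):
    """Best achievable accuracy when identical sequences cannot be told apart.

    `labelled_sequences` is an iterable of (sequence, species) pairs. For each
    distinct sequence, the best any classifier can do is predict the species
    that occurs most often with that sequence.
    """
    by_sequence = defaultdict(Counter)
    for sequence, species in labelled_sequences:
        by_sequence[sequence][species] += 1

    total = sum(sum(counts.values()) for counts in by_sequence.values())
    if total == 0:
        raise ValueError("labelled_sequences is empty")

    # One majority-species count per distinct sequence.
    correct = sum(max(counts.values()) for counts in by_sequence.values())
    return correct / total


# Toy data (hypothetical): sp1 and sp2 share an identical sequence.
toy = [("AAAA", "sp1"), ("AAAA", "sp2"), ("CCCC", "sp3"), ("GGGG", "sp4")]
ceiling = accuracy_ceiling(toy)
print(ceiling)
```

With the toy data, one of the two reads carrying the shared sequence must be misclassified, so three of four reads can be called correctly at best.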

Directory of Open Access Journals

Explanatory integration

Authors: A Bokulich; E McMullin; J Kim; J Woodward; M Elgin; M Friedman; M Strevens; P Kitcher; R Batterman; R Giere; R Healey; RW Batterman
Publication venue: Springer Science and Business Media LLC

Crossref