4 research outputs found

SMaSH: A Benchmarking Toolkit for Human Genome Variant Calling

Author: Bresler Ma'ayan
Curtis Kristal
Hartl Christopher
Jordan Michael I.
Liptrap Jesse
Newcomb Julie
Patterson David
Song Yun S.
Talwalkar Ameet
Terhorst Jonathan
Publication date: 05/01/2014

Motivation: Computational methods are essential to extract actionable information from raw sequencing data, and to thus fulfill the promise of next-generation sequencing technology. Unfortunately, computational tools developed to call variants from human sequencing data disagree on many of their predictions, and current methods to evaluate accuracy and computational performance are ad-hoc and incomplete. Agreement on benchmarking variant calling methods would stimulate development of genomic processing tools and facilitate communication among researchers. Results: We propose SMaSH, a benchmarking methodology for evaluating human genome variant calling algorithms. We generate synthetic datasets, organize and interpret a wide range of existing benchmarking data for real genomes, and propose a set of accuracy and computational performance metrics for evaluating variant calling methods on this benchmarking data. Moreover, we illustrate the utility of SMaSH to evaluate the performance of some leading single nucleotide polymorphism (SNP), indel, and structural variant calling algorithms. Availability: We provide free and open access online to the SMaSH toolkit, along with detailed documentation, at smash.cs.berkeley.edu

arXiv.org e-Print Archive

Crossref

PubMed Central

eScholarship - University of California

SMaSH: a benchmarking toolkit for human genome variant calling

Author: Bresler Ma'ayan
Curtis Kristal
Hartl Christopher
Jordan Michael I
Liptrap Jesse
Newcomb Julie
Patterson David
Song Yun S
Talwalkar Ameet
Terhorst Jonathan
Publication venue: eScholarship, University of California
Publication date: 01/10/2014

Motivation: Computational methods are essential to extract actionable information from raw sequencing data, and to thus fulfill the promise of next-generation sequencing technology. Unfortunately, computational tools developed to call variants from human sequencing data disagree on many of their predictions, and current methods to evaluate accuracy and computational performance are ad hoc and incomplete. Agreement on benchmarking variant calling methods would stimulate development of genomic processing tools and facilitate communication among researchers. Results: We propose SMaSH, a benchmarking methodology for evaluating germline variant calling algorithms. We generate synthetic datasets, organize and interpret a wide range of existing benchmarking data for real genomes and propose a set of accuracy and computational performance metrics for evaluating variant calling methods on these benchmarking data. Moreover, we illustrate the utility of SMaSH to evaluate the performance of some leading single-nucleotide polymorphism, indel and structural variant calling algorithms. Availability and implementation: We provide free and open access online to the SMaSH tool kit, along with detailed documentation, at smash.cs.berkeley.ed

eScholarship - University of California

SMaSH: a benchmarking toolkit for human genome variant calling

Author: Ameet Talwalkar
Christopher Hartl
David Patterson
Jesse Liptrap
Jonathan Terhorst
Julie Newcomb
Kristal Curtis
Ma'ayan Bresler
Michael I. Jordan
Yun S. Song
Publication venue: Oxford University Press (OUP)

Crossref