SMaSH: A Benchmarking Toolkit for Human Genome Variant Calling

Author: Bresler Ma'ayan
Curtis Kristal
Hartl Christopher
Jordan Michael I.
Liptrap Jesse
Newcomb Julie
Patterson David
Song Yun S.
Talwalkar Ameet
Terhorst Jonathan
Publication date: 05/01/2014

Motivation: Computational methods are essential to extract actionable information from raw sequencing data, and thus to fulfill the promise of next-generation sequencing technology. Unfortunately, computational tools developed to call variants from human sequencing data disagree on many of their predictions, and current methods to evaluate accuracy and computational performance are ad hoc and incomplete. Agreement on benchmarking variant calling methods would stimulate development of genomic processing tools and facilitate communication among researchers. Results: We propose SMaSH, a benchmarking methodology for evaluating human genome variant calling algorithms. We generate synthetic datasets, organize and interpret a wide range of existing benchmarking data for real genomes, and propose a set of accuracy and computational performance metrics for evaluating variant calling methods on this benchmarking data. Moreover, we illustrate the utility of SMaSH to evaluate the performance of some leading single nucleotide polymorphism (SNP), indel, and structural variant calling algorithms. Availability: We provide free and open access online to the SMaSH toolkit, along with detailed documentation, at smash.cs.berkeley.edu.
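The abstract does not say which accuracy metrics SMaSH uses. As a rough illustration of benchmarking a call set against a truth set, the sketch below assumes precision, recall and F1 over exact-match variants. The `Variant` record, the `accuracy_metrics` function and the example data are hypothetical and are not part of the SMaSH toolkit.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not SMaSH's data model)."""
    chrom: str
    pos: int   # 1-based position on the chromosome
    ref: str   # reference allele
    alt: str   # alternate allele


def accuracy_metrics(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare a call set with a truth set using exact matching.

    Assumption: a call counts as correct only if chromosome, position,
    reference allele and alternate allele all match a truth variant.
    """
    tp = len(called & truth)   # calls that are in the truth set
    fp = len(called - truth)   # calls that are absent from the truth set
    fn = len(truth - called)   # truth variants that were not called
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up example data, for illustration only.
truth = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 250, "C", "T"),
    Variant("chr2", 40, "G", "GA"),
}
called = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 40, "G", "GA"),
    Variant("chr2", 90, "T", "C"),
}
metrics = accuracy_metrics(called, truth)
print(metrics)
```

Exact matching is the simplest possible comparison rule. A real benchmark would also have to deal with equivalent representations of the same indel, which this sketch ignores.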

arXiv.org e-Print Archive

Crossref

PubMed Central

eScholarship - University of California

VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research

Author: Ahdesmaki Miika
Barrett J. Carl
Chapman Brad
Dougherty Brian
Dry Jonathan R.
Hofmann Oliver
Johnson Justin
Lai Zhongwu
Markovets Aleksandra
McEwen Robert
Publication venue: 'Oxford University Press (OUP)'
Publication date: 07/04/2016

Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research

Crossref

PubMed Central

Enlighten

University of Melbourne Institutional Repository

A Labelled Sequent Calculus for BBI: Proof Theory and Proof Search

Author: Gore Rajeev
Hou Zhe
Tiu Alwen
Publication date: 01/01/2015

We present a labelled sequent calculus for Boolean BI, a classical variant of O'Hearn and Pym's logic of Bunched Implication. The calculus is simple, sound, complete, and enjoys cut-elimination. We show that all the structural rules in our proof system, including those rules that manipulate labels, can be localised around applications of certain logical rules, thereby localising the handling of these rules in proof search. Based on this, we demonstrate a free variable calculus that deals with the structural rules lazily in a constraint system. A heuristic method to solve the constraints is proposed in the end, with some experimental results

arXiv.org e-Print Archive

Crossref

DR-NTU (Digital Repository of NTU)

SVIM: Structural Variant Identification using Mapped Long Reads

Author: Heller D.
Vingron M.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/09/2019

Motivation: Structural variants are defined as genomic variants larger than 50bp. They have been shown to affect more bases in any given genome than SNPs or small indels. Additionally, they have great impact on human phenotype and diversity and have been linked to numerous diseases. Due to their size and association with repeats, they are difficult to detect by shotgun sequencing, especially when based on short reads. Long read, single molecule sequencing technologies like those offered by Pacific Biosciences or Oxford Nanopore Technologies produce reads with a length of several thousand base pairs. Despite the higher error rate and sequencing cost, long read sequencing offers many advantages for the detection of structural variants. Yet, available software tools still do not fully exploit the possibilities. Results: We present SVIM, a tool for the sensitive detection and precise characterization of structural variants from long read data. SVIM consists of three components for the collection, clustering and combination of structural variant signatures from read alignments. It discriminates five different variant classes including similar types, such as tandem and interspersed duplications and novel element insertions. SVIM is unique in its capability of extracting both the genomic origin and destination of duplications. It compares favorably with existing tools in evaluations on simulated data and real datasets from PacBio and Nanopore sequencing machines. Availability and implementation: The source code and executables of SVIM are available on Github: github.com/eldariont/svim. SVIM has been implemented in Python 3 and published on bioconda and the Python Package Index. Supplementary information: Supplementary data are available at Bioinformatics online
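The abstract names three stages (collection, clustering and combination of signatures) but does not describe how they work. The sketch below illustrates only the general idea of grouping nearby signatures. It assumes a naive grouping by start position, chosen for brevity, and is not SVIM's actual method. The `Signature` class, `cluster_signatures` and the `max_gap` parameter are hypothetical. The 50 bp threshold comes from the abstract's definition of structural variants.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_SV_LENGTH = 50  # the abstract defines structural variants as larger than 50 bp


@dataclass(frozen=True)
class Signature:
    """Hypothetical signature of a structural variant seen in one read alignment."""
    chrom: str
    start: int
    length: int
    kind: str  # illustrative labels such as "DEL" or "INS"


def cluster_signatures(signatures: Iterable[Signature],
                       max_gap: int = 100) -> List[List[Signature]]:
    """Group signatures of the same kind whose start positions lie close together.

    Assumption: two signatures belong together if they share chromosome and
    kind and their starts differ by at most max_gap from the previous member.
    """
    candidates = sorted(
        (s for s in signatures if s.length > MIN_SV_LENGTH),
        key=lambda s: (s.chrom, s.kind, s.start),
    )
    clusters: List[List[Signature]] = []
    for sig in candidates:
        last = clusters[-1][-1] if clusters else None
        if (last is not None
                and last.chrom == sig.chrom
                and last.kind == sig.kind
                and sig.start - last.start <= max_gap):
            clusters[-1].append(sig)   # extend the current cluster
        else:
            clusters.append([sig])     # start a new cluster
    return clusters


# Made-up signatures, for illustration only.
example = [
    Signature("chr1", 1000, 300, "DEL"),
    Signature("chr1", 1040, 310, "DEL"),
    Signature("chr1", 5000, 80, "DEL"),
    Signature("chr1", 1010, 20, "INS"),  # too short, filtered out
]
clusters = cluster_signatures(example)
print([len(c) for c in clusters])
```

Each resulting cluster is a candidate variant supported by several reads. How to combine clusters into final calls is left open here because the abstract gives no detail.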

Crossref

MPG.PuRe

Undecidability of Multiplicative Subexponential Logic

Author: Chaudhuri Kaustuv
Publication venue: 'Open Publishing Association'
Publication date: 13/07/2014

Subexponential logic is a variant of linear logic with a family of exponential connectives--called subexponentials--that are indexed and arranged in a pre-order. Each subexponential has or lacks associated structural properties of weakening and contraction. We show that classical propositional multiplicative linear logic extended with one unrestricted and two incomparable linear subexponentials can encode the halting problem for two-register Minsky machines, and is hence undecidable. Comment: In Proceedings LINEARITY 2014, arXiv:1502.0441
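For readers unfamiliar with the machine model, the following is a minimal sketch of a two-register Minsky machine simulator. The instruction encoding (tuples tagged "inc" and "dec") and the `run` function are illustrative assumptions, not the paper's encoding into logic. Because halting is undecidable in general, the simulator stops after a bounded number of steps and reports that it does not know the answer.

```python
from typing import Dict, Optional, Tuple

# Assumed instruction formats, keyed by state number:
#   ("inc", r, next)               add 1 to register r, go to state next
#   ("dec", r, if_pos, if_zero)    if register r > 0, subtract 1 and go to
#                                  if_pos; otherwise go to if_zero
# A state with no instruction is treated as a halting state.
Program = Dict[int, tuple]


def run(program: Program, a: int = 0, b: int = 0,
        max_steps: int = 10_000) -> Optional[Tuple[int, int]]:
    """Run from state 0; return the final registers, or None if the
    step budget runs out before the machine halts."""
    regs = [a, b]
    state = 0
    for _ in range(max_steps):
        if state not in program:
            return regs[0], regs[1]
        instr = program[state]
        if instr[0] == "inc":
            _, r, nxt = instr
            regs[r] += 1
            state = nxt
        else:
            _, r, if_pos, if_zero = instr
            if regs[r] > 0:
                regs[r] -= 1
                state = if_pos
            else:
                state = if_zero
    # The budget is spent; still report success if a halting state was reached.
    return (regs[0], regs[1]) if state not in program else None


# Example: move the contents of register 1 into register 0 (addition).
add_program: Program = {
    0: ("dec", 1, 1, 2),  # register 1 empty? then halt in state 2
    1: ("inc", 0, 0),     # otherwise add 1 to register 0 and repeat
}
result = run(add_program, a=3, b=4)
looping = run({0: ("inc", 0, 0)}, max_steps=50)
print(result, looping)
```

A return value of None means only that the budget was exhausted. It does not show that the machine never halts.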

arXiv.org e-Print Archive

Crossref

Portail HAL Nantes Université

INRIA a CCSD electronic archive server

Directory of Open Access Journals

HAL: Hyper Article en Ligne

HAL-Polytechnique

On the strictness of the quantifier structure hierarchy in first-order logic

Author: He Yuguo
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 01/11/2014

We study a natural hierarchy in first-order logic, namely the quantifier structure hierarchy, which gives a systematic classification of first-order formulas based on structural quantifier resource. We define a variant of Ehrenfeucht-Fraisse games that characterizes quantifier classes and use it to prove that this hierarchy is strict over finite structures, using strategy compositions. Moreover, we prove that this hierarchy is strict even over ordered finite structures, which is interesting in the context of descriptive complexity. Comment: 38 pages, 8 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

Episciences.org

Directory of Open Access Journals

NLC-2 graph recognition and isomorphism

Author: B. Courcelle
C.P. Gabor
E. Dahlhaus
E. Wanke
F. Gurski
J.-L. Fouquet
M. Chein
M. Habib
R.M. McConnell
T. Gallai
W.H. Cunningham
Ö. Johansson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

NLC-width is a variant of clique-width with many applications in graph algorithmics. This paper is devoted to graphs of NLC-width two. After giving new structural properties of the class, we propose an O(n^2 m)-time algorithm, improving Johansson's algorithm \cite{Johansson00}. Moreover, our algorithm is simple to understand. The above properties and algorithm allow us to propose a robust O(n^2 m)-time isomorphism algorithm for NLC-2 graphs. As far as we know, it is the first polynomial-time algorithm. Comment: submitted to WG 2007; 12

arXiv.org e-Print Archive

Crossref

HAL Descartes

Hal-Diderot