
499 research outputs found

Combining calls from multiple somatic mutation-callers

Author: A Roth
CT Saunders
DE Larson
DH Wolpert
H Li
J Ding
J Friedman
J Sill
K Cibulskis
L Breiman
Laurent Jacob
M Lower
NF Hansen
R Tibshirani
Su Yeon Kim
T Hastie
Terence P Speed
The Cancer Genome Atlas Research Network
Publication venue: 'Springer Science and Business Media LLC'

SNPredict: A Machine Learning Approach for Detecting Low Frequency Variants in Cancer

Author: Mehra Vatsal
Publication venue: e-Publications@Marquette
Publication date: 01/07/2016

Cancer is a genetic disease caused by the accumulation of DNA variants such as single nucleotide changes or insertions/deletions. DNA variants can silence tumor suppressor genes or increase the activity of oncogenes. To develop successful therapies for cancer patients, these DNA variants need to be identified accurately. They can be identified by comparing the DNA sequence of tumor tissue with that of non-tumor tissue using Next Generation Sequencing (NGS) technology. Detecting variants in cancer is hard, however, because many of these variants occur only in a small subpopulation of the tumor tissue. It becomes a challenge to distinguish these low frequency variants from sequencing errors, which are common in today's NGS methods. Several algorithms have been developed and implemented as tools to identify such variants in cancer. However, it has previously been shown that there is low concordance among the results produced by these tools. Moreover, the number of false positives tends to increase significantly when these tools are faced with low frequency variants. This study presents SNPredict, a single nucleotide polymorphism (SNP) detection pipeline that uses machine learning techniques to combine the results of multiple variant callers into a consensus output with higher accuracy than any of the individual tools. By extracting features from the consensus output that describe traits associated with an individual variant call, it creates binary classifiers that predict a SNP’s true state and therefore help distinguish a sequencing error from a true variant.
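The consensus step described in this abstract can be illustrated with a small sketch. It is not SNPredict's code: the caller names, the variant tuple layout `(chrom, pos, ref, alt)`, and the functions `build_consensus` and `majority_vote` are all hypothetical, and a fixed majority vote stands in for the trained binary classifiers the abstract mentions.

```python
from collections import defaultdict


def build_consensus(calls_by_caller):
    """Merge per-caller SNV calls into one table keyed by variant.

    calls_by_caller maps a (hypothetical) caller name to a set of
    (chrom, pos, ref, alt) tuples. Each variant gets simple features
    describing how many callers reported it.
    """
    support = defaultdict(set)
    for caller, calls in calls_by_caller.items():
        for variant in calls:
            support[variant].add(caller)

    n_callers = len(calls_by_caller)
    table = {}
    for variant, callers in support.items():
        table[variant] = {
            "callers": sorted(callers),
            "n_support": len(callers),
            "frac_support": len(callers) / n_callers,
        }
    return table


def majority_vote(table, min_frac=0.5):
    """Baseline filter: keep variants reported by more than min_frac of callers.

    Assumption: this replaces the learned classifier purely for illustration.
    """
    return sorted(v for v, f in table.items() if f["frac_support"] > min_frac)


# Toy input with three made-up callers.
calls = {
    "caller_a": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "caller_b": {("chr1", 100, "A", "T")},
    "caller_c": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")},
}
consensus = build_consensus(calls)
kept = majority_vote(consensus)
```

In a learned pipeline, the per-variant feature dictionaries would be passed to a classifier instead of the fixed threshold, which is the part that lets the consensus outperform a simple vote.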

epublications@Marquette

VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research

Author: Ahdesmaki Miika
Barrett J. Carl
Chapman Brad
Dougherty Brian
Dry Jonathan R.
Hofmann Oliver
Johnson Justin
Lai Zhongwu
Markovets Aleksandra
McEwen Robert
Publication venue: 'Oxford University Press (OUP)'
Publication date: 07/04/2016

Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research

PubMed Central

Enlighten

University of Melbourne Institutional Repository

GenomeVIP: A cloud platform for genomic variant discovery and interpretation

Author: Chen Ken
DeNardo Erin
Ding Li
Fenyö David
Handsaker Robert E
Huang Kuan-lin
Koboldt Daniel C
Mashl R. Jay
Niu Beifang
Raphael Benjamin J
Scott Adam D
Wendl Michael C
Wyczalkowski Matthew A
Ye Kai
Yellapantula Venkata D
Yoon Christopher J
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

Digital Commons@Becker

An ensemble approach to accurately detect somatic mutations using SomaticSeq

Author: Afshar Pegah Tootoonchi
Asadi Narges Bani
Barr Sharon
Chhibber Aparna
Fan Yu
Fang Li Tai
Gerstein Mark B
Gibeling Greg
Koboldt Daniel C
Lam Hugo YK
Mohiyuddin Marghoob
Mu John C
Wang Wenyi
Wong Wing H
Publication venue: Digital Commons@Becker
Publication date: 01/01/2015

SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses. We validate our results with both synthetic and real data. We report that SomaticSeq is able to achieve better overall accuracy than any individual tool incorporated. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-015-0758-2) contains supplementary material, which is available to authorized users
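To make the "adaptively boosted decision tree" idea concrete, here is a minimal pure-Python sketch of AdaBoost over decision stumps (depth-1 trees). It is not SomaticSeq's implementation: the two features used in the toy data (number of supporting callers and variant allele fraction), the labels, and the function names are assumptions for illustration only.

```python
import math


def train_adaboost(X, y, n_rounds=10):
    """Train AdaBoost with decision stumps.

    X: list of feature lists; y: list of labels in {-1, +1}.
    Returns a list of (alpha, feature_index, threshold, polarity).
    """
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        best = None
        # Exhaustively search for the stump with the lowest weighted error.
        for j in range(len(X[0])):
            for thr in sorted({row[j] for row in X}):
                for pol in (1, -1):
                    preds = [pol if row[j] >= thr else -pol for row in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, preds)
        err, j, thr, pol, preds = best
        # Clamp the error so the log below stays finite on separable data.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Up-weight misclassified sites, down-weight correct ones, renormalize.
        weights = [w * math.exp(-alpha * t * p)
                   for w, t, p in zip(weights, y, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, row):
    """Weighted vote of all stumps; +1 = somatic, -1 = not somatic."""
    score = sum(alpha * (pol if row[j] >= thr else -pol)
                for alpha, j, thr, pol in model)
    return 1 if score >= 0 else -1


# Toy training set: [n_callers_supporting, variant_allele_fraction].
X_train = [[3, 0.30], [2, 0.12], [3, 0.08], [1, 0.02], [1, 0.25], [1, 0.01]]
y_train = [1, 1, 1, -1, -1, -1]
model = train_adaboost(X_train, y_train, n_rounds=5)
```

With real data the feature vector would be much wider (the abstract mentions over 70 features per candidate site), and a library implementation would normally replace this hand-rolled loop.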

Digital Commons@Becker

PubMed Central

Development of computational tools for variant calling in single-cell RNAseq

Author: Zielińska Kinga Anna
Publication date: 05/06/2023

Single-cell sequencing technologies have unsurprisingly become a favourable choice for studying key biological questions about cell heterogeneity, rare cell types or lineages. It is only cell-level resolution that allows for an accurate analysis of internal cell processes such as mutagenesis. Eventually, single-cell RNAseq could provide an explanation of mechanisms that lead to the ultimate transformation of healthy tissues into cancerous lesions. One of the main interests of my lab is Barrett’s oesophagus. It is a highly clonal disease and a likely cancer precursor. We decided to take advantage of the single-cell RNAseq technology in order to attempt to identify the tissue of origin of the disease which, despite years of research, still remains unknown. However, the range of methods for identification of mutations in single cells is very limited. In order to address that, we developed our own single-cell RNAseq variant caller. We validated it on a publicly available breast cancer dataset by achieving a reasonable intersection of our results with the output of commonly used bulk tools. Furthermore, we showed that our caller was capable of identifying expected data characteristics such as known breast cancer signatures and mutations in breast cancer genes. We then applied our method to the Barrett’s dataset to investigate connections of Barrett’s with surrounding tissues. Contrary to the previous transcriptomic analysis conducted on the same dataset and indicating a Barrett’s-oesophagus connection, our results revealed a more likely link of Barrett’s with the stomach

Oxford University Research Archive

Characterization and identification of hidden rare variants in the human genome

Author: Abbate Rosanna
Cifola Ingrid
D'Aurizio Romina
Gensini Gian Franco
Giusti Betti
Magi Alberto
Palombo Flavia
Pippucci Tommaso
Romeo Giovanni
Semeraro Roberto
Tattini Lorenzo
Publication venue: BIOMED CENTRAL
Publication date: 01/01/2015

Background: By examining the genotype calls generated by the 1000 Genomes Project we discovered that the human reference genome GRCh37 contains almost 20,000 loci in which the reference allele has never been observed in healthy individuals and around 70,000 loci in which it has been observed only in the heterozygous state. Results: We show that a large fraction of these rare reference allele (RRA) loci belong to coding, functional and regulatory elements of the genome and could be linked to rare Mendelian disorders as well as cancer. We also demonstrate that classical germline and somatic variant calling tools are not capable of recognizing the rare allele when it is present in these loci. To overcome such limitations, we developed a novel tool, named RAREVATOR, that is able to identify and call the rare allele in these genomic positions. Using a small cancer dataset, we compared our tool with two state-of-the-art callers and found that RAREVATOR identified more than 1,500 germline and 22 somatic RRA variants that were missed by the two methods and that belong to significantly mutated pathways. Conclusions: These results show that, to date, the investigation of around 100,000 loci of the human genome has been missed by re-sequencing experiments based on the GRCh37 assembly and that our tool can fill the gap left by other methods. Moreover, the investigation of the latest version of the human reference genome, GRCh38, showed that although the GRC corrected almost all insertions and a small part of SNVs and deletions, a large number of functionally relevant RRAs still remain unchanged. For this reason, future resequencing experiments based on GRCh38 will also benefit from RAREVATOR analysis results. RAREVATOR is freely available at http://sourceforge.net/projects/rarevator

Springer - Publisher Connector

Florence Research

PubMed Central

PUblication MAnagement

A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing.

Author: Alioto Tyler S
Anderson Charlotte L
Beck Timothy A
Beltran Sergi
Boutros Paul C
Bower Lawrence
Brors Benedikt
Buchhalter Ivo
Butler Adam P
Campbell Peter J
Campo Elías
Chotewutmontri Sasithorn
Dabad Marc
Denroche Robert E
Derdak Sophia
Diessl Nicolle
Drews Ruben
Eils Roland
Eldridge Matthew D
Feuerbach Lars
Fujimoto Akihiro
Gerhard Daniela S
Giner Francesc Castro
Ginsbach Philip
Grimmond Sean M
Gröbner Susanne
Gut Ivo G
Gut Marta
Harding Nicholas J
He Minghui
Heath Simon C
Heinold Michael
Heisler Lawrence E
Hinton Jonathan
Hovig Eivind
Hudson Thomas J
Hutter Barbara
Jones David
Jones David TW
Jäger Natalie
Kabbe Rolf
Kandoth Cyriac
Korshunov Andrey
Lee Semin
Lichter Peter
Lynch Andrew G
Létourneau Louis
López-Otín Carlos
Ma Singer
McPherson John D
Menzies Andrew
Nakagawa Hidewaki
Nakken Sigve
Paramasivam Nagarajan
Patch Ann-Marie
Pearson John V
Peto Myron
Pfister Stefan M
Previti Christopher
Puente Xose S
Quesada Víctor
Raine Keiran
Raineri Emanuele
Ribeca Paolo
Schlesner Matthias
Schmidt Sabine
Sertier Anne-Sophie
Seth Sahil
Shepherd Rebecca
Simpson Jared T
Spellman Paul
Stebbings Lucy
Tarpey Patrick S
Teague Jon W
Tonon Laurie
Torrents David
Valdés-Mas Rafael
Vodák Daniel
Waddell Nicola
Wheeler David A
Xi Liu
Yamaguchi Takafumi N
Zhang John
Publication venue: Nat Commun
Publication date: 01/01/2015

As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.
Acknowledgements: We thank the DKFZ Genomics and Proteomics Core Facility and the OICR Genome Technologies Platform for provision of sequencing services. Financial support was provided by the consortium projects READNA under grant agreement FP7 Health-F4-2008-201418, ESGI under grant agreement 262055, GEUVADIS under grant agreement 261123 of the European Commission Framework Programme 7, ICGC-CLL through the Spanish Ministry of Science and Innovation (MICINN), the Instituto de Salud Carlos III (ISCIII) and the Generalitat de Catalunya. Additional financial support was provided by the PedBrain Tumor Project contributing to the International Cancer Genome Consortium, funded by German Cancer Aid (109252) and by the German Federal Ministry of Education and Research (BMBF, grants #01KU1201A, MedSys #0315416C and NGFNplus #01GS0883); the Ontario Institute for Cancer Research to PCB and JDM through funding provided by the Government of Ontario, Ministry of Research and Innovation; Genome Canada; the Canada Foundation for Innovation and Prostate Cancer Canada with funding from the Movember Foundation (PCB). PCB was also supported by a Terry Fox Research Institute New Investigator Award, a CIHR New Investigator Award and a Genome Canada Large-Scale Applied Project Contract. The Synergie Lyon Cancer platform has received support from the French National Institute of Cancer (INCa) and from the ABS4NGS ANR project (ANR-11-BINF-0001-06). The ICGC RIKEN study was supported partially by RIKEN President’s Fund 2011, and the supercomputing resource for the RIKEN study was provided by the Human Genome Center, University of Tokyo. MDE, LB, AGL and CLA were supported by Cancer Research UK, the University of Cambridge and Hutchison-Whampoa Limited. SD is supported by the Torres Quevedo subprogram (MICINN) under grant agreement PTQ-12-05391. EH is supported by the Research Council of Norway under grant agreements 221580 and 218241 and by the Norwegian Cancer Society under grant agreement 71220-PR-2006-0433. Very special thanks go to Jennifer Jennings for administrating the activity of the ICGC Verification Working Group and Anna Borrell for administrative support.
This is the final version of the article. It first appeared from Nature Publishing Group via http://dx.doi.org/10.1038/ncomms1000

OPUS Augsburg

Repositorio Institucional de la Universidad de Oviedo

ScholarWorks@UNIST

Enlighten

UPCommons. Portal del coneixement obert de la UPC

Harvard University - DASH

PubMed Central

eScholarship - University of California

Apollo (Cambridge)

Diposit Digital de la Universitat de Barcelona

University of St. Andrews - Pure

St Andrews Research Repository

multiSNV: a probabilistic approach for improving detection of somatic point mutations from multiple related tumour samples.

Author: Josephidou Malvina
Lynch Andy G
Tavaré Simon
Publication venue: Nucleic Acids Res
Publication date: 26/02/2015

Somatic variant analysis of a tumour sample and its matched normal has been widely used in cancer research to distinguish germline polymorphisms from somatic mutations. However, due to the extensive intratumour heterogeneity of cancer, sequencing data from a single tumour sample may greatly underestimate the overall mutational landscape. In recent studies, multiple spatially or temporally separated tumour samples from the same patient were sequenced to identify the regional distribution of somatic mutations and study intratumour heterogeneity. There are a number of tools to perform somatic variant calling from matched tumour-normal next-generation sequencing (NGS) data; however none of these allow joint analysis of multiple same-patient samples. We discuss the benefits and challenges of multisample somatic variant calling and present multiSNV, a software package for calling single nucleotide variants (SNVs) using NGS data from multiple same-patient samples. Instead of performing multiple pairwise analyses of a single tumour sample and a matched normal, multiSNV jointly considers all available samples under a Bayesian framework to increase sensitivity of calling shared SNVs. By leveraging information from all available samples, multiSNV is able to detect rare mutations with variant allele frequencies down to 3% from whole-exome sequencing experiments.
Funding: Cancer Research UK grant C14303/A17197. Funding for open access charge: University of Cambridge.
This is the final published version. It first appeared at http://nar.oxfordjournals.org/content/early/2015/02/26/nar.gkv135.long
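A rough numerical illustration of why pooling evidence across same-patient samples can raise sensitivity for low-frequency variants follows. It is not multiSNV's Bayesian model: it is a simple binomial tail test, and the per-base error rate (1%), read depths, and alternate-read counts are invented for the example.

```python
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


ERROR_RATE = 0.01  # assumed sequencing error rate, for illustration only

# One sample: 3 alternate reads out of 100 (3% variant allele frequency).
p_single = binom_tail(3, 100, ERROR_RATE)

# Three same-patient samples, each 3/100, with read counts pooled: 9 of 300.
p_pooled = binom_tail(9, 300, ERROR_RATE)
```

Under these assumptions, seeing 3 of 100 alternate reads in one sample is fairly plausible as sequencing error alone, while seeing the same 3% signal in all three samples is much harder to explain by error. A joint model exploits that shared evidence in a more principled way than this pooled test.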

PubMed Central

Apollo (Cambridge)

University of St. Andrews - Pure

multiSNV : a probabilistic approach for improving detection of somatic point mutations from multiple related tumour samples

Author: Josephidou Malvina
Lynch Andy G.
Tavaré Simon
Publication venue: 'Oxford University Press (OUP)'
Publication date: 14/08/2017

Funding: Cancer Research UK grant C14303/A17197. Funding for open access charge: University of Cambridge.
Somatic variant analysis of a tumour sample and its matched normal has been widely used in cancer research to distinguish germline polymorphisms from somatic mutations. However, due to the extensive intratumour heterogeneity of cancer, sequencing data from a single tumour sample may greatly underestimate the overall mutational landscape. In recent studies, multiple spatially or temporally separated tumour samples from the same patient were sequenced to identify the regional distribution of somatic mutations and study intratumour heterogeneity. There are a number of tools to perform somatic variant calling from matched tumour-normal next-generation sequencing (NGS) data; however none of these allow joint analysis of multiple same-patient samples. We discuss the benefits and challenges of multisample somatic variant calling and present multiSNV, a software package for calling single nucleotide variants (SNVs) using NGS data from multiple same-patient samples. Instead of performing multiple pairwise analyses of a single tumour sample and a matched normal, multiSNV jointly considers all available samples under a Bayesian framework to increase sensitivity of calling shared SNVs. By leveraging information from all available samples, multiSNV is able to detect rare mutations with variant allele frequencies down to 3% from whole-exome sequencing experiments.
Publisher PDF. Peer reviewed.

St Andrews Research Repository