The impact of tumor profiling approaches and genomic data strategies for cancer precision medicine
Background: The diversity of clinical tumor profiling approaches (small panels to whole exomes with matched or unmatched germline analysis) may engender uncertainty about their benefits and liabilities, particularly in light of reported germline false positives in tumor-only profiling and use of global mutational and/or neoantigen data. The goal of this study was to determine the impact of genomic analysis strategies on error rates and data interpretation across contexts and ancestries.
Methods: We modeled common tumor profiling modalities, namely large (n = 300 genes), medium (n = 48 genes), and small (n = 15 genes) panels, using clinical whole exomes (WES) from 157 patients with lung or colon adenocarcinoma. We created a tumor-only analysis algorithm to assess germline false positive rates, the impact of patient ancestry on tumor-only results, and neoantigen detection.
Results: After optimizing a germline filtering strategy, the germline false positive rate with tumor-only large panel sequencing was 14 % (144/1012 variants). For patients whose tumor-only results underwent molecular pathologist review (n = 91), 50/54 (93 %) false positives were correctly interpreted as uncertain variants. Increased germline false positives were observed in tumor-only sequencing of non-European compared with European ancestry patients (p < 0.001; Fisher’s exact test) when basic germline filtering approaches were used; however, the ExAC database (60,706 germline exomes) mitigated this disparity (p = 0.53). Matched and unmatched large panel mutational load correlated with WES mutational load (r² = 0.99 and 0.93, respectively; p < 0.001). Neoantigen load also correlated (r² = 0.80; p < 0.001), though WES identified a broader spectrum of neoantigens. Small panels did not predict mutational or neoantigen load.
Conclusions: Large tumor-only targeted panels are sufficient for most somatic variant identification and mutational load prediction if paired with expanded germline analysis strategies and molecular pathologist review. Paired germline sequencing reduced overall false positive mutation calls, and WES provided the most neoantigens. Without patient-matched germline data, large germline databases are needed to minimize false positive mutation calling and mitigate ethnic disparities. Electronic supplementary material: The online version of this article (doi:10.1186/s13073-016-0333-9) contains supplementary material, which is available to authorized users.
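As a rough illustration of the tumor-only filtering idea in this abstract, the minimal sketch below drops variants whose population allele frequency exceeds a cutoff and then computes a germline false positive rate among the variants that remain. The `Variant` record, the `max_pop_af` cutoff of 0.001, and the toy variants are assumptions made for illustration; the abstract does not state the study's actual filtering rules. The last two lines simply recompute the percentages quoted in the Results (144/1012 and 50/54).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    name: str
    pop_af: float         # allele frequency in a germline database (toy values)
    truly_germline: bool  # ground truth, e.g. from a matched normal sample


def filter_likely_germline(variants: List[Variant],
                           max_pop_af: float = 0.001) -> List[Variant]:
    """Keep variants rare enough in the population to report as putative somatic."""
    return [v for v in variants if v.pop_af <= max_pop_af]


def germline_false_positive_rate(reported: List[Variant]) -> float:
    """Fraction of reported variants that are in fact germline."""
    if not reported:
        return 0.0
    return sum(v.truly_germline for v in reported) / len(reported)


# Hypothetical toy input, not data from the study.
toy_variants = [
    Variant("somatic_1", 0.0, False),
    Variant("somatic_2", 0.0, False),
    Variant("common_snp", 0.12, True),        # removed by the frequency filter
    Variant("rare_germline", 0.0002, True),   # slips through: a false positive
]

reported = filter_likely_germline(toy_variants)
toy_rate = germline_false_positive_rate(reported)

# Percentages quoted in the abstract, recomputed from the stated counts.
large_panel_fp_pct = round(144 / 1012 * 100)
reviewed_correct_pct = round(50 / 54 * 100)
```

The rare germline variant in the toy input shows why a frequency filter alone cannot remove every germline call, which is consistent with the abstract's point that larger germline databases and pathologist review help.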
BreaKmer: detection of structural variation in targeted massively parallel sequencing data using kmers
Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for ‘targeted’ resequencing efforts. This is critically important in the clinical setting, where targeted resequencing is frequently applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a ‘kmer’ strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings.
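To make the ‘kmer’ idea more concrete, here is a minimal sketch of one way k-mers can separate breakpoint-spanning sequence from reference sequence: it collects the k-mers that occur in a read but not in the reference. This is an illustration only and not BreaKmer's implementation; the function names, the choice of `k = 5`, and the toy sequences are all assumptions.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(read: str, reference: str, k: int = 5) -> set:
    """K-mers present in the read but absent from the reference."""
    return kmers(read, k) - kmers(reference, k)


# Toy sequences (hypothetical, for illustration only).
reference = "ACGTTAGCCGATAGGCTA"

# A read copied straight from the reference: no novel k-mers expected.
normal_read = reference[3:14]

# A read spanning a 4-base deletion ("CGAT" removed from the reference).
deletion_read = "ACGTTAGC" + "AGGCTA"

normal_novel = novel_kmers(normal_read, reference)
deletion_novel = novel_kmers(deletion_read, reference)
```

In this toy case the only novel k-mers are the ones that straddle the deletion junction, so reads carrying them are natural candidates for local assembly and realignment.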
Additional file 1: Table S1. of The impact of tumor profiling approaches and genomic data strategies for cancer precision medicine
Large panel genes. (DOCX 29 kb)
Additional file 8: Figure S1. of The impact of tumor profiling approaches and genomic data strategies for cancer precision medicine
Mutational load predictions with different panel tests for the colon adenocarcinoma subset. Comparison of mutational load predictions using WES or either matched (a) or unmatched (b) large panel tests (n = 300 genes) demonstrates both can reliably predict the mutational load. The linear regression line is shown in black with 95 % confidence bands shaded in grey. The identity line (dashed) is shown for comparison. With medium sized panels (n = 48 genes), this ability decreases in both the matched and unmatched setting and is not possible with small (n = 15) gene panels. Note that hypermutated tumors were excluded from the regression analysis. (PDF 809 kb)
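For readers unfamiliar with how such panel-versus-WES agreement is typically summarized, the sketch below computes the coefficient of determination of a simple linear regression, which equals the squared Pearson correlation. The function name and the mutation counts are synthetic placeholders chosen for illustration; they are not values from the study or its supplementary files.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation, i.e. r^2 of a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Synthetic per-tumor mutation counts (hypothetical, not from the paper).
wes_load = [50, 120, 200, 310, 400]
panel_load = [2, 5, 8, 12, 16]

panel_vs_wes_r2 = r_squared(wes_load, panel_load)
```

A value near 1 indicates that panel counts track WES counts almost linearly; excluding outliers such as hypermutated tumors, as the caption notes, can change this value substantially.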
Additional file 4: Table S4. of The impact of tumor profiling approaches and genomic data strategies for cancer precision medicine
All matched mutation calls (due to size, see Additional file 5: Table S5 for unmatched mutation calls). (XLSX 3746 kb)
Additional file 5: Table S5. of The impact of tumor profiling approaches and genomic data strategies for cancer precision medicine
All unmatched mutation calls (due to size, see Additional file 4: Table S4 for matched mutation calls). (XLSX 11186 kb)
Additional file 6: Table S6. of The impact of tumor profiling approaches and genomic data strategies for cancer precision medicine
Sample sequencing metrics. (DOCX 27 kb)
Additional file 9: Figure S2. of The impact of tumor profiling approaches and genomic data strategies for cancer precision medicine
Mutational load predictions with different panel tests for the lung adenocarcinoma subset. Comparison of mutational load predictions using WES or either matched (a) or unmatched (b) large panel tests (n = 300 genes) demonstrates both can reliably predict the mutational load. The linear regression line is shown in black with 95 % confidence bands shaded in grey. The identity line (dashed) is shown for comparison. With medium sized panels (n = 48 genes), this ability decreases in both the matched and unmatched setting and is not possible with small (n = 15) gene panels. (PDF 848 kb)