Additional file 2: Figure S2. of tarSVM: Improving the accuracy of variant calls derived from microfluidic PCR-based targeted next generation sequencing using a support vector machine

Abstract

Comparison of quality by depth (QD) from CAKUT and ExAC. Sites that were filtered out of ExAC (VQSR filtering) are shown in red. Sites that were marked as “PASS” in ExAC are shown in gold. Sites that were not Sanger validated are shown in green. Sites that were Sanger validated are shown in teal. Sites filtered by tarSVM are shown in blue. Sites that passed tarSVM are shown in purple. Quality by depth is correlated with mean allele balance and is used here as a proxy for it. There is a clear separation between variants that are filtered by tarSVM and variants that pass tarSVM. Most of the variants filtered by tarSVM have a much lower quality than the variants that pass. Variants that are Sanger validated are more strongly correlated with variants that pass tarSVM. Variants that are labeled “PASS” in ExAC have a higher variant quality than the microfluidic data. The filtered variants in ExAC have a flatter distribution than those filtered by tarSVM. Importantly, the variants that underwent Sanger sequencing were selected because they had the characteristics of true variants, which is why there is so much overlap between the distributions for Sanger validated and not Sanger validated variants. (PPTX 121 kb)
