Overview of the experimental workflow: Single HepG2 cells were isolated and whole genome amplification was performed using four commercially available kits.
The WGA products, as well as two unamplified bulk samples, were fragmented by sonication, and sequencing libraries were prepared. Exome enrichment was performed after pooling of the libraries. Finally, the enriched pools were sequenced and the resulting data were analyzed.
Allelic dropout of SNVs called as heterozygous in the Bulk_1 sample.
Dropout is shown both normalized to the total number of variants that overlap with the Bulk_1 sample (a) and as variant counts (b).
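A minimal sketch of how a dropout figure of this kind could be computed, assuming a site counts as dropped out when it is heterozygous in the bulk sample but homozygous in the single cell. The function name `allelic_dropout`, the genotype encoding and the example data are hypothetical and are not taken from the study.

```python
def allelic_dropout(bulk_genotypes, cell_genotypes):
    """Return (dropout_count, dropout_fraction) for one single-cell sample.

    Genotypes are dicts mapping a site key to a tuple of two alleles.
    Assumption: a site is dropped out when it is heterozygous in the bulk
    sample but homozygous in the single cell. The fraction is taken over
    bulk-heterozygous sites that were also called in the single cell.
    """
    het_sites = [s for s, (a, b) in bulk_genotypes.items() if a != b]
    overlapping = [s for s in het_sites if s in cell_genotypes]
    dropped = [s for s in overlapping
               if cell_genotypes[s][0] == cell_genotypes[s][1]]
    fraction = len(dropped) / len(overlapping) if overlapping else 0.0
    return len(dropped), fraction


# Hypothetical toy data: three heterozygous bulk sites, two called in the cell.
bulk = {
    ("chr1", 100): ("A", "G"),
    ("chr1", 200): ("C", "T"),
    ("chr2", 50): ("G", "G"),
    ("chr2", 75): ("A", "C"),
}
cell = {
    ("chr1", 100): ("A", "A"),
    ("chr1", 200): ("C", "T"),
    ("chr2", 50): ("G", "G"),
}
count, fraction = allelic_dropout(bulk, cell)
```

Here `count` corresponds to the raw variant count view and `fraction` to the normalized view.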
An overview of the implementation of alphabetr.
(A) From the population of interest, multiple samples of 10–300 T cells are sorted into 96-well plates. This design allows for a given clone to be sampled in multiple wells. (B) Multiplex RT-PCR is used to create cDNA libraries of CDR3α and CDR3β from each well, and (C) high-throughput sequencing is used to recover the unpaired CDR3α and CDR3β sequences of the clones sampled in each well. (D)(i) A random subset of the wells is chosen, (ii) association scores between every unique α and β found across the wells within this sample are calculated, and (iii) the set of unique αβ pairs that maximises the sum of association scores is identified using the Hungarian algorithm [39]. Step (iii) is illustrated for a particular set of CDR3α and CDR3β recovered from one well, as a matrix of association scores calculated across all wells in the subsample. (E) Steps D(i)-(iii) are repeated to generate a consensus list of pairs, filtering out candidates that appear rarely across replicates. (F) The frequencies of each remaining candidate αβ pair within the parent population are estimated using a maximum-likelihood approach, assuming only sharing (no dual TCR). Dual TCRα clones α1 α2 β1 are then distinguished from clones apparently sharing a TCRβ chain (α1 β1 and α2 β1), by examining the patterns of co-occurrences of the three chains, and the frequencies of these clones are re-calculated. (G) The output of the algorithm is a list of single and dual TCRα clones, each with their estimated frequency within the parent population. See text and Methods for more details.
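Steps D(ii) and D(iii) can be illustrated with a small sketch. Two things here are assumptions rather than the alphabetr method: the association score is a simple Jaccard overlap of the wells in which two chains occur, and the optimal one-to-one assignment is found by exhaustive search over permutations, which stands in for the Hungarian algorithm and is only practical for very small matrices. All function names and data are hypothetical.

```python
from itertools import permutations


def association_scores(wells, alphas, betas):
    """Score each (alpha, beta) by the Jaccard overlap of their wells.

    This scoring rule is an illustrative assumption, not alphabetr's score.
    Each well is a pair (set_of_alpha_chains, set_of_beta_chains).
    """
    scores = []
    for a in alphas:
        wells_a = {i for i, (wa, _) in enumerate(wells) if a in wa}
        row = []
        for b in betas:
            wells_b = {i for i, (_, wb) in enumerate(wells) if b in wb}
            union = wells_a | wells_b
            row.append(len(wells_a & wells_b) / len(union) if union else 0.0)
        scores.append(row)
    return scores


def best_pairing(scores):
    """Return the one-to-one assignment maximising the summed score.

    Exhaustive search over permutations: a stand-in for the Hungarian
    algorithm that is feasible only for small square matrices.
    """
    n = len(scores)
    best = max(permutations(range(n)),
               key=lambda p: sum(scores[i][p[i]] for i in range(n)))
    return list(enumerate(best))


# Hypothetical wells: each holds the alpha and beta chains recovered from it.
wells = [
    ({"a1", "a2"}, {"b1", "b2"}),
    ({"a1"}, {"b1"}),
    ({"a2", "a3"}, {"b2", "b3"}),
    ({"a3"}, {"b3"}),
]
alphas = ["a1", "a2", "a3"]
betas = ["b1", "b2", "b3"]
scores = association_scores(wells, alphas, betas)
pairs = [(alphas[i], betas[j]) for i, j in best_pairing(scores)]
```

For realistic matrix sizes a polynomial-time assignment solver would replace the exhaustive search.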
Read pair mapping orientations (fwd, fwd; rev, fwd; or rev, rev) as percentage of total number of read pairs for each sample in the 10M read pair subset.
The right panel shows an enlarged version of the region from 90% to 100%. All samples showed >90% of read pairs mapping in the expected directions.
Depth and accuracy of <i>αβ</i> pairings generated by alphabetr, for a range of overall sample sizes, sampling strategies and underlying distributions of clone sizes.
<p>Simulations were performed using <i>in silico</i> data sets of one or five plates using six different sampling strategies (see text) and different degrees of skewness in clonal frequencies, as indicated by the number of clones comprising 50% of the population when ranked by frequency. ‘Threshold’ refers to the stringency of pair association, <i>T</i> (see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005313#sec014" target="_blank">Methods</a>). <b>(A)</b> The proportion of the most abundant 50% of clones that were identified. <b>(B)</b> The proportion of the least abundant 50% of clones that were identified. <b>(C)</b> The overall depth was influenced strongly by the tail depth, indicating that data from one plate may be sufficient for recovering the most common clones. <b>(D)</b> The rate at which CDR3<i>α</i> and CDR3<i>β</i> sequences were incorrectly paired (false positive rate, FPR).</p>
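The four quantities in panels (A) to (D) can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: the "top" clones are taken to be the most abundant clones that together account for the first 50% of the population, the tail is everything else, and the FPR is taken as the fraction of reported pairs that are absent from the true population. The function name and example data are hypothetical.

```python
def depth_and_fpr(true_clones, predicted_pairs):
    """Return (top depth, tail depth, overall depth, false positive rate).

    true_clones: dict mapping an (alpha, beta) pair to its abundance.
    predicted_pairs: set of (alpha, beta) pairs reported by the method.
    Assumption: 'top' clones are the most abundant clones that together
    account for the first 50% of the population; the rest form the tail.
    """
    total = sum(true_clones.values())
    ranked = sorted(true_clones, key=true_clones.get, reverse=True)
    top, cumulative = set(), 0
    for clone in ranked:
        if cumulative >= 0.5 * total:
            break
        top.add(clone)
        cumulative += true_clones[clone]
    tail = set(ranked) - top

    def recovered(group):
        # Fraction of the clones in this group that were reported.
        return len(group & predicted_pairs) / len(group) if group else 0.0

    # Assumption: a false positive is a reported pair that is not a true clone.
    false_pairs = predicted_pairs - set(true_clones)
    fpr = len(false_pairs) / len(predicted_pairs) if predicted_pairs else 0.0
    return recovered(top), recovered(tail), recovered(set(ranked)), fpr


# Hypothetical population of four clones with abundances given as counts.
truth = {("a1", "b1"): 40, ("a2", "b2"): 30, ("a3", "b3"): 20, ("a4", "b4"): 10}
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a1", "b2")}
top_depth, tail_depth, overall_depth, fpr = depth_and_fpr(truth, predicted)
```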
Comparison of well occupancy patterns of the clones identified by alphabetr and in ref. [22].
<p>For each method, TCR<i>αβ</i> pairs identified for all tumour samples were combined to estimate the distribution of the number of wells in which the chains co-appeared. The differences between these distributions indicate the relative efficiency with which the two algorithms identify clones, as a function of their abundance.</p>
Comparison of whole genome amplification techniques for human single cell exome sequencing
<div><p>Background</p><p>Whole genome amplification (WGA) is currently a prerequisite for single cell whole genome or exome sequencing. Depending on the method used, the rate of artifact formation, allelic dropout and sequence coverage over the genome may differ significantly.</p><p>Results</p><p>The largest difference between the evaluated protocols was observed when analyzing the target coverage and read depth distribution. These differences also had an impact on the downstream variant calling. In conclusion, the products from the AMPLI1 and MALBAC kits were shown to be most similar to the bulk samples and are therefore recommended for WGA of single cells.</p><p>Discussion</p><p>In this study, four commercial kits for WGA (AMPLI1, MALBAC, Repli-G and PicoPlex) were used to amplify human single cells. The WGA products were exome sequenced together with non-amplified bulk samples from the same source. The resulting data were evaluated in terms of genomic coverage, allelic dropout and SNP calling.</p></div>
For each of the samples in the 10M read pair subset, mapping rates were homogeneous across samples, all above 90% (a), whereas exome coverage (b) was considerably lower for the amplified single cells (ranging from 7 to 68%) than for the bulk samples (~90% coverage).
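As a rough illustration of the coverage figure, the sketch below computes the percentage of target bases reaching a minimum read depth. The threshold of one read, the function name `exome_coverage` and the per-base depth values are assumptions for illustration only; the source does not state how coverage was defined.

```python
def exome_coverage(depths, min_depth=1):
    """Return the percentage of target bases with depth >= min_depth.

    depths: per-base read depths over the exome target regions.
    min_depth: hypothetical threshold for calling a base 'covered'.
    """
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Hypothetical per-base depths: even coverage in bulk, patchy in a single cell.
bulk_depths = [12, 8, 15, 9, 0, 11, 7, 10, 14, 13]
cell_depths = [30, 0, 0, 45, 0, 0, 2, 0, 0, 0]
bulk_cov = exome_coverage(bulk_depths)
cell_cov = exome_coverage(cell_depths)
```

The toy single-cell profile has similar total depth concentrated on few bases, which is the uneven pattern that lowers coverage after amplification.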
Comparison of single-cell approaches and alphabetr.
<p>Single-cell sequencing was simulated by sampling from the same populations used to evaluate alphabetr and including both the dropping of chains and in-frame sequencing errors. In these simulations, the parent population contains 2100 clones with 25 clones representing the top 50% of the clones ranked by abundance. The results were evaluated for (A) top depth, (B) tail depth, and (C) overall depth. The dashed lines show the mean performance of alphabetr applied to five plates using the high-mixed sampling strategy and a threshold of 0.6 (values taken from <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005313#pcbi.1005313.g003" target="_blank">Fig 3</a>). The single-cell sequencing results are averages of 200 simulations.</p>
Recovery of tumour-infiltrating lymphocyte TCR pairs using alphabetr and data from ref. [22].
<p>The data were processed by associating chains with their tumour sources through exact matching of the CDR3 nucleotide sequences from the mixed tumour samples to CDR3 libraries obtained from blood samples from each patient. The data were then simplified by selecting only those chains associated with one tumour. We then used alphabetr to identify TCR<i>αβ</i> pairs. The numbers of pairs unambiguously identified in ref. [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005313#pcbi.1005313.ref022" target="_blank">22</a>] were determined by directly matching nucleotide sequences to the CDR3 libraries, and only those pairs for which both chains could be directly associated with the corresponding tumour sample were included in the analysis.</p>