Summary of data used in these analyses.
<p>Number of reads reflects the number of aligning reads after removing duplicate read pairs and filtering for low-quality alignments (see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0023683#s4" target="_blank">Methods</a>). Gender was determined from the coverage of reads in specific representative regions of the X and Y chromosomes (see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0023683#s4" target="_blank">Methods</a>). The number of genotypes called covers the autosomes only, because only autosomal genotypes were used for downstream comparisons.</p>
Concordance between lanes.
<p>Distributions of genotype concordance rates from same- and different-sample comparisons are non-overlapping. The box plot in (A) shows the distributions of concordance rates when using all callable positions for all combinations of pairs of the three samples being analyzed. The x-axis denotes each pair being compared (A, B, and C refer to the sample IDs in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0023683#pone-0023683-t001" target="_blank">Table 1</a>), and the y-axis represents the distribution of concordance rates for all pair-wise combinations of lanes representing the specific pair of samples on the x-axis. It is likely that the detected differences from same-sample comparisons (B-B, C-C, and A-A) arise solely from sequencing and genotyping error. The box plot in (B) is similar to (A), except that here only variant (nonreference) positions are considered. The symmetrical heat map in (C) summarizes the data from panel (A); the blue boxes represent low concordance rates and correspond to different-sample comparisons, while the yellow boxes along the diagonal represent high concordance rates and correspond to same-sample comparisons. Note that samples B and C (gray boxes) are slightly more concordant with each other than the other different-sample pairs are, but they remain sufficiently distinct from same-sample comparisons. This is expected given the known partial consanguinity between these individuals.</p>
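A concordance rate of the kind described in this caption can be sketched as the fraction of positions called in both lanes where the two genotypes agree. The sketch below is illustrative only: the function name, the genotype strings, the toy calls, and the rule used for "variant only" (at least one lane carries a nonreference genotype) are assumptions, not the authors' pipeline.

```python
def concordance(calls_a, calls_b, variant_only=False, ref_genotype="0/0"):
    """Fraction of positions called in both lanes where the genotypes agree.

    calls_a, calls_b: dicts mapping (chrom, pos) -> genotype string.
    variant_only: if True, keep only positions where at least one lane
    has a nonreference genotype (an assumed definition for this sketch).
    """
    shared = calls_a.keys() & calls_b.keys()
    if variant_only:
        shared = {
            p for p in shared
            if calls_a[p] != ref_genotype or calls_b[p] != ref_genotype
        }
    if not shared:
        return float("nan")  # nothing to compare
    matches = sum(calls_a[p] == calls_b[p] for p in shared)
    return matches / len(shared)


# Toy genotype calls (hypothetical data, not from the study).
lane_b7 = {("chr1", 100): "0/0", ("chr1", 200): "0/1",
           ("chr1", 300): "1/1", ("chr2", 50): "0/0"}
lane_b8 = {("chr1", 100): "0/0", ("chr1", 200): "0/1",
           ("chr1", 300): "1/1", ("chr2", 75): "0/1"}
lane_c7 = {("chr1", 100): "0/0", ("chr1", 200): "1/1",
           ("chr1", 300): "0/1", ("chr2", 50): "0/0"}

same_sample = concordance(lane_b7, lane_b8)
different_sample = concordance(lane_b7, lane_c7)
different_sample_variants = concordance(lane_b7, lane_c7, variant_only=True)
```

Restricting to variant positions removes the many homozygous-reference sites that agree between any two people, which is why panel (B) separates the two kinds of comparison more sharply than panel (A).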
Overview of approach.
<p>Several lanes of HiSeq 2000 data are typically combined for a comprehensive genome analysis, giving a high depth of coverage (A) and the ability to accurately call genotypes in the majority of the genome. In (B), two individual lanes of HiSeq 2000 data are depicted, each with a lower average depth of coverage. By chance, some regions of the genome have enough data to be genotyped in both lanes (shaded gray).</p>
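The shaded regions in panel (B) can be thought of as the intersection of the positions that each lane covers deeply enough to genotype. A minimal sketch, assuming per-position depths are available as dictionaries; the `min_depth` value of 8 is a hypothetical threshold chosen for illustration, not one stated in the source.

```python
def callable_positions(depth_by_position, min_depth=8):
    """Positions whose read depth meets the (hypothetical) threshold."""
    return {pos for pos, depth in depth_by_position.items()
            if depth >= min_depth}


def shared_callable(lane_1_depth, lane_2_depth, min_depth=8):
    """Positions with enough data to be genotyped in both lanes."""
    return (callable_positions(lane_1_depth, min_depth)
            & callable_positions(lane_2_depth, min_depth))


# Toy depths for two single lanes (hypothetical data).
lane_1 = {("chr1", 100): 12, ("chr1", 200): 3, ("chr1", 300): 9}
lane_2 = {("chr1", 100): 10, ("chr1", 200): 15, ("chr1", 300): 2}

shared = shared_callable(lane_1, lane_2)
```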
Effect of data quantity on concordance rates.
<p>The total number of reads used in the analysis affects different-sample comparisons, but not same-sample comparisons. In (A), lane 7 of sample ID B was kept constant at 140 million reads (B7), and the amount of data for the other sample [either lane 8 of B (B8) or lane 7 of C (C7)] was varied between 40 million and 140 million reads (x-axis) in 20 million read increments. The y-axis represents the concordance rate between variant (nonreference) genotypes called in the two datasets. Note that for the same-sample comparison (red line), varying the number of reads used in the analysis does not substantially alter the concordance rate. However, this is not the case for the different-sample comparison (blue line), where the concordance rate diverges further from the same-sample rate as more reads are used. In (B), a similar trend is observed when the reads in both samples are incremented simultaneously. Solid lines represent a LOESS smoothed fit to the data points.</p>
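The titration in panel (A) amounts to drawing progressively larger subsets of one lane's reads while the other lane stays fixed. The sketch below builds the 40 to 140 million read schedule from the caption and downsamples a toy read list at a scale of one toy read per million reads. The helper names, the fixed random seed, and the use of simple random sampling are assumptions made for illustration; the source does not say how reads were subsampled.

```python
import random


def read_count_schedule(start, stop, step):
    """Read counts from start to stop (inclusive) in fixed increments."""
    return list(range(start, stop + 1, step))


def downsample(reads, n_reads, seed=0):
    """Random subset of n_reads reads, without replacement."""
    if n_reads > len(reads):
        raise ValueError("cannot sample more reads than are available")
    return random.Random(seed).sample(reads, n_reads)


# Schedule from the caption: 40M to 140M reads in 20M increments.
schedule = read_count_schedule(40_000_000, 140_000_000, 20_000_000)

# Toy stand-in for a lane: one list entry per million real reads.
toy_reads = [f"read_{i}" for i in range(140)]
subsets = {n: downsample(toy_reads, n // 1_000_000) for n in schedule}
```

Each subset would then be genotyped and compared against the fixed lane to give one point on the x-axis.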
Mutational Signatures of De-Differentiation in Functional Non-Coding Regions of Melanoma Genomes
<div><p>Much emphasis has been placed on the identification, functional characterization, and therapeutic potential of somatic variants in tumor genomes. However, the majority of somatic variants lie outside coding regions and their role in cancer progression remains to be determined. In order to establish a system to test the functional importance of non-coding somatic variants in cancer, we created a low-passage cell culture of a metastatic melanoma tumor sample. As a foundation for interpreting functional assays, we performed whole-genome sequencing and analysis of this cell culture, the metastatic tumor from which it was derived, and the patient-matched normal genomes. When comparing somatic mutations identified in the cell culture and tissue genomes, we observe concordance at the majority of single nucleotide variants, whereas copy number changes are more variable. To understand the functional impact of non-coding somatic variation, we leveraged functional data generated by the ENCODE Project Consortium. We analyzed regulatory regions derived from multiple different cell types and found that melanocyte-specific regions are among the most depleted for somatic mutation accumulation. Significant depletion in other cell types suggests the metastatic melanoma cells de-differentiated to a more basal regulatory state. Experimental identification of genome-wide regulatory sites in two different melanoma samples supports this observation. Together, these results show that mutation accumulation in metastatic melanoma is nonrandom across the genome and that a de-differentiated regulatory architecture is common among different samples. Our findings enable identification of the underlying genetic components of melanoma and define the differences between a tissue-derived tumor sample and the cell culture created from it. 
Such information helps establish a broader mechanistic understanding of the linkage between non-coding genomic variations and the cellular evolution of cancer.</p> </div>
The regulatory signature of metastatic melanoma.
<p>Genome-wide DNase-Seq identifies DNase I hypersensitive site (DHS) regulatory elements in the cell culture sample from our study and the colo-829 cell line. (A) Hierarchical clustering of all DHSs shows that the regulatory architecture of metastatic melanoma cells (red) adopts that of a more derived melanocyte (blue). (B) Focusing on exon-overlapping DHSs to identify the open chromatin landscape in gene regions shows that the metastatic melanoma cells are de-differentiated relative to melanocytes. Among the DHSs that occur in exonic regions and are specific to the metastatic melanoma samples (not present in any others), the important melanoma genes MITF, NEDD9, and DCC are identified. (C) Melanoma transcription at melanoma-specific (dark blue) TSS-distal DHSs is significantly more frequent (P<2.2e-16; Fisher's Exact Test) than at melanocyte-specific (light blue) TSS-distal DHSs. (D) Mutational bias in melanoma DHSs is asymmetric with respect to orientation relative to the transcribed strand. The 12 possible mutations are collapsed into 6 such that the key mutation (A>C, for example; blue) and its complement (T>G; yellow) are shown in different colors. An asterisk (*) marks P<0.05 for a binomial test, using a 50% expectation, on the counts for a pair of key and complement mutations.</p>
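The strand-asymmetry test in panel (D) pairs each key mutation with its complement and asks whether the two counts depart from a 50/50 split. A minimal sketch using only the standard library: the function names and the counts are hypothetical, and the exact two-sided form used here is one common choice for a binomial test at p = 0.5, since the caption does not say which variant the authors ran.

```python
from math import comb

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_mutation(mutation):
    """Complement of a substitution written as 'A>C' (gives 'T>G')."""
    ref, alt = mutation.split(">")
    return f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"


def binomial_two_sided_half(key_count, complement_count):
    """Exact two-sided binomial p-value under a 50% expectation.

    With p = 0.5 the distribution is symmetric, so the two-sided p-value
    is twice the smaller tail, capped at 1.
    """
    n = key_count + complement_count
    k = min(key_count, complement_count)
    tail = sum(comb(n, i) for i in range(k + 1))
    return min(1.0, 2 * tail / 2 ** n)


# Hypothetical counts for one key/complement pair.
pair = ("A>C", complement_mutation("A>C"))
p_skewed = binomial_two_sided_half(30, 10)
p_balanced = binomial_two_sided_half(20, 20)
```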
Melanoma tissue and cell culture similarities.
<p>(A) The experimental design of our study. Concordance between the somatic calls in the tissue (blue) and cell culture (yellow) for SSNVs (B), CNV amplifications (C), and CNV deletions (D). The two samples are highly concordant at the SNV level, but differ more at the CNV level.</p>
Somatic mutation accumulation is non-random across the genome.
<p>(A) Somatic (blue) and common (gray) variants have different levels of enrichment or depletion depending on which chromatin segmentation they occur in. Somatic mutation accumulation is highly anti-correlated with evolutionary constraint (B) and coding fraction (C).</p>
Non-coding Melanocyte DHSs are dis-enriched for accumulating melanoma somatic mutations.
<p>(A) Genic partitioning of melanocyte DHSs such that every DHS occurs in a single category shows that most categories are depleted for mutation accumulation (TSS P = Transcription Start Site Proximal [within 5 Kb]; TSS D = Transcription Start Site Distal [greater than 5 Kb]). Common SNPs are based on 1000 Genomes calls that have at least 5% minor allele frequency (MAF). (B) Intergenic TSS-distal cell-type-specific and ubiquitous DHSs show different levels of enrichment or depletion. (C) Enrichment or depletion at cell-type combinations of intergenic TSS-distal melanocyte and non-melanocyte DHSs. For these analyses, the set of regions representing any data point must have overlapped at least 10 somatic variants to be considered. The horizontal black line at zero represents no enrichment. The GSC method was used to measure enrichment. Error bars represent one standard deviation from the mean of the null distribution. (D) A hierarchical tree based on DHS Euclidean distance among 29 different cell states. Note the positioning of melanocytes ("Melano") relative to aortic smooth muscle cells ("AosmcSerumfree") and human embryonic stem cells ("H1hesc"), which are among the most depleted for somatic mutation accumulation (<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002871#pgen-1002871-g003" target="_blank">Figure 3B</a>).</p>
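The tree in panel (D) is built from Euclidean distances between the DHS profiles of different cell states. The sketch below computes those pairwise distances for three toy profiles and reports the closest pair, which is the first merge an agglomerative clustering would make. The cell-type labels, the binary presence/absence encoding, and the toy vectors are assumptions for illustration; the source does not state the profile encoding or the linkage method.

```python
import math
from itertools import combinations


def pairwise_euclidean(profiles):
    """Euclidean distance between every pair of DHS profiles.

    profiles: dict mapping a cell-type label to a vector with one entry
    per candidate region (here 1 = DHS present, 0 = absent).
    """
    return {
        (a, b): math.dist(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Toy presence/absence profiles (hypothetical data and labels).
profiles = {
    "cell_type_1": [1, 1, 0, 0, 1],
    "cell_type_2": [1, 1, 0, 1, 1],
    "cell_type_3": [0, 0, 1, 1, 0],
}

distances = pairwise_euclidean(profiles)
closest_pair = min(distances, key=distances.get)
```

Cell states that sit close together in such a tree share much of their open-chromatin landscape, which is the sense in which the caption compares the position of melanocytes with the other cell states.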