389 research outputs found
Allele-Specific Amplification in Cancer Revealed by SNP Array Analysis
Amplification, deletion, and loss of heterozygosity of genomic DNA are hallmarks of cancer. In recent years a variety of studies have emerged measuring total chromosomal copy number at increasingly high resolution. Similarly, loss-of-heterozygosity events have been finely mapped using high-throughput genotyping technologies. We have developed a probe-level allele-specific quantitation procedure that extracts both copy number and allelotype information from single nucleotide polymorphism (SNP) array data to arrive at allele-specific copy number across the genome. Our approach applies an expectation-maximization algorithm to a model derived from a novel classification of SNP array probes. This method is the first to our knowledge that is able to (a) determine the generalized genotype of aberrant samples at each SNP site (e.g., CCCCT at an amplified site), and (b) infer the copy number of each parental chromosome across the genome. With this method, we are able to determine not just where amplifications and deletions occur, but also the haplotype of the region being amplified or deleted. The merit of our model and general approach is demonstrated by very precise genotyping of normal samples, and our allele-specific copy number inferences are validated using PCR experiments. Applying our method to a collection of lung cancer samples, we are able to conclude that amplification is essentially monoallelic, as would be expected under the mechanisms currently believed responsible for gene amplification. This suggests that a specific parental chromosome may be targeted for amplification, whether because of germ line or somatic variation. An R software package containing the methods described in this paper is freely available at http://genome.dfci.harvard.edu/~tlaframb/PLASQ
SNP panel identification assay (SPIA): a genetic-based assay for the identification of cell lines
Translational research hinges on the ability to make observations in model systems and to implement those findings into clinical applications, such as the development of diagnostic tools or targeted therapeutics. Tumor cell lines are commonly used to model carcinogenesis. The same tumor cell line can be simultaneously studied in multiple research laboratories throughout the world, theoretically generating results that are directly comparable. One important assumption in this paradigm is that researchers are working with the same cells. However, recent work using high throughput genomic analyses questions the accuracy of this assumption. Observations by our group and others suggest that experiments reported in the scientific literature may contain pre-analytic errors due to inaccurate identities of the cell lines employed. To address this problem, we developed a simple approach that enables an accurate determination of cell line identity by genotyping 34 single nucleotide polymorphisms (SNPs). Here, we describe the empirical development of a SNP panel identification assay (SPIA) compatible with routine use in the laboratory setting to ensure the identity of tumor cell lines and human tumor samples throughout the course of long term research use
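As an illustration of how a small SNP panel can be used to check sample identity, here is a minimal sketch in Python. It assumes genotype calls are coded as strings such as "AA", "AB", and "BB", with `None` marking a failed call; the function name `genotype_distance` and the discordance measure are illustrative assumptions and are not taken from the SPIA publication.

```python
from typing import Optional, Sequence


def genotype_distance(
    sample_a: Sequence[Optional[str]],
    sample_b: Sequence[Optional[str]],
) -> float:
    """Fraction of discordant genotype calls among SNPs called in both samples.

    Hypothetical helper for illustration only; the published assay may
    define and threshold its distance differently.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("both samples must be genotyped on the same SNP panel")

    compared = 0
    mismatches = 0
    for call_a, call_b in zip(sample_a, sample_b):
        # Skip SNPs where either sample has no call.
        if call_a is None or call_b is None:
            continue
        compared += 1
        if call_a != call_b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no SNPs were called in both samples")
    return mismatches / compared
```

A distance near zero would suggest the two samples derive from the same cell line, while a large distance would suggest different origins. Any cutoff between the two would have to be calibrated empirically.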
Somatic rearrangements across cancer reveal classes of samples with distinct patterns of DNA breakage and rearrangement-induced hypermutability
Whole-genome sequencing using massively parallel sequencing technologies enables accurate detection of somatic rearrangements in cancer. Pinpointing large numbers of rearrangement breakpoints to base-pair resolution allows analysis of rearrangement microhomology and genomic location for every sample. Here we analyze 95 tumor genome sequences from breast, head and neck, colorectal, and prostate carcinomas, and from melanoma, multiple myeloma, and chronic lymphocytic leukemia. We discover three genomic factors that are significantly correlated with the distribution of rearrangements: replication time, transcription rate, and GC content. The correlation is complex, and different patterns are observed between tumor types, within tumor types, and even between different types of rearrangements. Mutations in the APC gene correlate with and, hence, potentially contribute to DNA breakage in late-replicating, low %GC, untranscribed regions of the genome. We show that somatic rearrangements display less microhomology than germline rearrangements, and that breakpoint loci are correlated with local hypermutability with a particular enrichment for C ↔ G transversions
Prioritizing causal disease genes using unbiased genomic features
Background: Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits. Results: To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy (DCM). Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM. Conclusion: Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0534-8) contains supplementary material, which is available to authorized users
PAK1 is a Breast Cancer Oncogene that Coordinately Activates MAPK and MET Signaling
Activating mutations in the RAS family or BRAF frequently occur in many types of human cancers but are rarely detected in breast tumors. However, activation of the RAS-RAF-MEK-ERK Mitogen-Activated Protein Kinase (MAPK) pathway is commonly observed in human breast cancers, suggesting that other genetic alterations lead to activation of this signaling pathway. To identify breast cancer oncogenes that activate the MAPK pathway, we screened a library of human kinases for their ability to induce anchorage-independent growth in a derivative of immortalized human mammary epithelial cells (HMLE). We identified PAK1 as a kinase that permitted HMLE cells to form anchorage-independent colonies. PAK1 is amplified in several human cancer types, including 33% of breast tumor samples and cancer cell lines. The kinase activity of PAK1 is necessary for PAK1-induced transformation. Moreover, we show that PAK1 simultaneously activates MAPK and MET signaling; the latter via inhibition of Merlin. Disruption of these activities inhibits PAK1-driven anchorage-independent growth. These observations establish PAK1 amplification as an alternative mechanism for MAPK activation in human breast cancer and credential PAK1 as a breast cancer oncogene that coordinately regulates multiple signaling pathways, the cooperation of which leads to malignant transformation
Inferring Loss-of-Heterozygosity from Unpaired Tumors Using High-Density Oligonucleotide SNP Arrays
Loss of heterozygosity (LOH) of chromosomal regions bearing tumor suppressors is a key event in the evolution of epithelial and mesenchymal tumors. Identification of these regions usually relies on genotyping tumor and counterpart normal DNA and noting regions where heterozygous alleles in the normal DNA become homozygous in the tumor. However, paired normal samples for tumors and cell lines are often not available. With the advent of oligonucleotide arrays that simultaneously assay thousands of single-nucleotide polymorphism (SNP) markers, genotyping can now be done at high enough resolution to allow identification of LOH events by the absence of heterozygous loci, without comparison to normal controls. Here we describe a hidden Markov model-based method to identify LOH from unpaired tumor samples, taking into account SNP intermarker distances, SNP-specific heterozygosity rates, and the haplotype structure of the human genome. When we applied the method to data genotyped on 100K arrays, we correctly identified 99% of SNP markers as either retention or loss. We also correctly identified 81% of the regions of LOH, including 98% of regions greater than 3 megabases. By integrating copy number analysis into the method, we were able to distinguish LOH from allelic imbalance. Application of this method to data from a set of prostate samples without paired normals identified known regions of prevalent LOH. We have developed a method for analyzing high-density oligonucleotide SNP array data to accurately identify regions of LOH and retention in tumors without the need for paired normal samples
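The idea of calling LOH from the absence of heterozygous loci can be sketched with a toy two-state hidden Markov model decoded by the Viterbi algorithm. This is a simplified illustration, not the authors' method: it uses SNP-specific heterozygosity rates but assumes a constant switching probability (`p_switch`) rather than one that depends on intermarker distance, ignores haplotype structure, and uses a hypothetical genotyping error rate (`err`). All names and default values below are assumptions.

```python
import math

STATES = ("RET", "LOH")  # retention of heterozygosity vs. loss


def viterbi_loh(het_calls, het_rates, p_switch=0.01, err=0.01):
    """Most likely RET/LOH state per SNP under a toy two-state HMM.

    het_calls: list of bool, True where the tumor genotype call is heterozygous.
    het_rates: per-SNP population heterozygosity rates, each strictly in (0, 1).
    p_switch:  assumed constant probability of changing state between SNPs.
    err:       assumed probability of a heterozygous call inside an LOH region.
    """
    if len(het_calls) != len(het_rates):
        raise ValueError("het_calls and het_rates must have the same length")
    if not het_calls:
        return []

    def log_emit(state, is_het, het_rate):
        # In retained regions hets occur at the SNP's heterozygosity rate;
        # in LOH regions a het call can only arise from error.
        p_het = het_rate if state == "RET" else err
        return math.log(p_het if is_het else 1.0 - p_het)

    def log_trans(prev, cur):
        return math.log(1.0 - p_switch if prev == cur else p_switch)

    # Initialise with a uniform prior over the two states.
    score = {
        s: math.log(0.5) + log_emit(s, het_calls[0], het_rates[0])
        for s in STATES
    }
    pointers = []
    for is_het, rate in zip(het_calls[1:], het_rates[1:]):
        new_score, ptr = {}, {}
        for cur in STATES:
            prev = max(STATES, key=lambda s: score[s] + log_trans(s, cur))
            ptr[cur] = prev
            new_score[cur] = (
                score[prev] + log_trans(prev, cur) + log_emit(cur, is_het, rate)
            )
        score = new_score
        pointers.append(ptr)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(pointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

In this sketch a long run of homozygous calls at SNPs with high heterozygosity rates is decoded as LOH, whereas a short run is more cheaply explained as chance homozygosity in a retained region.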