69 research outputs found

    Integrative genomic analyses reveal an androgen-driven somatic alteration landscape in early-onset prostate cancer

    Early-onset prostate cancer (EO-PCA) represents the earliest clinical manifestation of prostate cancer. To compare the genomic alteration landscapes of EO-PCA with "classical" (elderly-onset) PCA, we performed deep sequencing-based genomics analyses in 11 tumors diagnosed at young age, and pursued comparative assessments with seven elderly-onset PCA genomes. Remarkable age-related differences in structural rearrangement (SR) formation became evident, suggesting distinct disease pathomechanisms. Whereas EO-PCAs harbored a prevalence of balanced SRs, with a specific abundance of androgen-regulated ETS gene fusions including TMPRSS2:ERG, elderly-onset PCAs displayed primarily non-androgen-associated SRs. Data from a validation cohort of >10,000 patients showed age-dependent androgen receptor levels and a prevalence of SRs affecting androgen-regulated genes, further substantiating the activity of a characteristic "androgen-type" pathomechanism in EO-PCA.

    Genomics and drug profiling of fatal TCF3-HLF-positive acute lymphoblastic leukemia identifies recurrent mutation patterns and therapeutic options

    TCF3-HLF-positive acute lymphoblastic leukemia (ALL) is currently incurable. Using an integrated approach, we uncovered distinct mutation, gene expression and drug response profiles in TCF3-HLF-positive and treatment-responsive TCF3-PBX1-positive ALL. We identified recurrent intragenic deletions of PAX5 or VPREB1 in constellation with the fusion of TCF3 and HLF. Moreover, somatic mutations in the non-translocated allele of TCF3 and a reduction of PAX5 gene dosage in TCF3-HLF ALL suggest cooperation within a restricted genetic context. The enrichment for stem cell and myeloid features in the TCF3-HLF signature may reflect reprogramming by TCF3-HLF of a lymphoid-committed cell of origin toward a hybrid, drug-resistant hematopoietic state. Drug response profiling of matched patient-derived xenografts revealed a distinct profile for TCF3-HLF ALL with resistance to conventional chemotherapeutics but sensitivity to glucocorticoids, anthracyclines and agents in clinical development. Striking on-target sensitivity was achieved with the BCL2-specific inhibitor venetoclax (ABT-199). This integrated approach thus provides alternative treatment options for this deadly disease.

    SMARCA2-deficiency confers sensitivity to targeted inhibition of SMARCA4 in esophageal squamous cell carcinoma cell lines

    SMARCA4/BRG1 and SMARCA2/BRM, the two mutually exclusive catalytic subunits of the BAF complex, display a well-established synthetic lethal relationship in SMARCA4-deficient cancers. Using CRISPR-Cas9 screening, we identify SMARCA4 as a novel dependency in SMARCA2-deficient esophageal squamous cell carcinoma (ESCC) models, reciprocal to the known synthetic lethal interaction. Restoration of SMARCA2 expression alleviates the dependency on SMARCA4, while engineered loss of SMARCA2 renders ESCC models vulnerable to concomitant depletion of SMARCA4. Dependency on SMARCA4 is linked to its ATPase activity, but not to bromodomain function. We highlight the relevance of SMARCA4 as a drug target in esophageal cancer using an engineered ESCC cell model harboring a SMARCA4 allele amenable to targeted proteolysis and identify SMARCA4-dependent cell models with low or absent SMARCA2 expression from additional tumor types. These findings expand the concept of SMARCA2/SMARCA4 paralog dependency and suggest that pharmacological inhibition of SMARCA4 represents a novel therapeutic opportunity for SMARCA2-deficient cancers

    Systematic Inference of Copy-Number Genotypes from Personal Genome Sequencing Data Reveals Extensive Olfactory Receptor Gene Content Diversity

    Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate CNV-analysis approaches based on paired-end and breakpoint-junction analysis, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low coverage. The assessed copy-number genotypes were highly concordant with our qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95–99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes at the >800 CNV-enriched human olfactory receptor (OR) gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ∼15% and ∼20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies.
    CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
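
The idea of turning depth-of-coverage into an integer copy-number genotype can be illustrated with a minimal sketch. This is not the CopySeq algorithm; it assumes a plain Poisson read-count model, a flat prior over copy numbers, and a known diploid control region. The function names, the `eps` background term, and the `max_cn=9` cap are hypothetical choices made for illustration only.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability of observing k reads given mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def copy_number_genotype(region_reads, region_len,
                         diploid_reads, diploid_len,
                         max_cn=9, eps=0.01):
    """Pick the integer copy number that best explains the read depth.

    Assumption: reads per base pair scale linearly with copy number, and
    the control region is known to carry exactly two copies. `eps` stands
    in for a small background rate so that copy number 0 has a non-zero
    expected read count.
    """
    per_copy_rate = diploid_reads / diploid_len / 2.0  # reads per bp per copy
    loglik = []
    for cn in range(max_cn + 1):
        expected = max(cn, eps) * per_copy_rate * region_len
        loglik.append(poisson_logpmf(region_reads, expected))
    # Normalise to posterior probabilities under a flat prior.
    top = max(loglik)
    weights = [math.exp(l - top) for l in loglik]
    total = sum(weights)
    posteriors = [w / total for w in weights]
    best = max(range(max_cn + 1), key=lambda cn: posteriors[cn])
    return best, posteriors


if __name__ == "__main__":
    # Hypothetical numbers: 0.05 reads per bp per copy, 10 kb locus.
    cn, post = copy_number_genotype(1480, 10_000, 100_000, 1_000_000)
    print(cn, round(post[cn], 3))
```

Real data would need corrections that this sketch ignores, such as GC bias, mappability in segmental duplications, and overdispersion relative to the Poisson model.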

    Population genetic analysis of bi-allelic structural variants from low-coverage sequence data with an expectation-maximization algorithm

    Background: Population genetics and association studies usually rely on a set of known variable sites that are then genotyped in subsequent samples, because it is easier to genotype than to discover the variation. This is also true for structural variation detected from sequence data. However, the genotypes at known variable sites can only be inferred with uncertainty from low coverage data. Thus, statistical approaches that infer genotype likelihoods, test hypotheses, and estimate population parameters without requiring accurate genotypes are becoming popular. Unfortunately, the current implementations of these methods are intended to analyse only single nucleotide and short indel variation, and they usually assume that the two alleles in a heterozygous individual are sampled with equal probability. This is generally false for structural variants detected with paired ends or split reads. Therefore, the population genetics of structural variants cannot be studied, unless a painstaking and potentially biased genotyping is performed first.
    Results: We present svgem, an expectation-maximization implementation to estimate allele and genotype frequencies, calculate genotype posterior probabilities, and test for Hardy-Weinberg equilibrium and for population differences, from the numbers of times the alleles are observed in each individual. Although applicable to single nucleotide variation, it aims at bi-allelic structural variation of any type, observed by either split reads or paired ends, with arbitrarily high allele sampling bias. We test svgem with simulated and real data from the 1000 Genomes Project.
    Conclusions: svgem makes it possible to use low-coverage sequencing data to study the population distribution of structural variants without having to know their genotypes. Furthermore, this advance allows the combined analysis of structural and nucleotide variation within the same genotype-free statistical framework, thus preventing biases introduced by genotype imputation.
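
As a rough illustration of the kind of estimation described in this abstract, the following sketch runs expectation-maximization over genotype frequencies using per-individual allele counts. It is not the svgem implementation: it assumes a binomial read model, a fixed and known probability `het_alt_prob` of sampling the alternative allele in a heterozygote (the allele sampling bias), and a fixed error rate `err`. All names and default values are hypothetical.

```python
def genotype_likelihoods(ref, alt, het_alt_prob, err):
    """Likelihood of (ref, alt) read counts for genotypes 0, 1, 2 alt copies.

    The binomial coefficient is omitted because it is the same for all
    three genotypes and cancels in the posteriors.
    """
    alt_probs = (err, het_alt_prob, 1.0 - err)
    return [p ** alt * (1.0 - p) ** ref for p in alt_probs]


def em_genotype_frequencies(counts, het_alt_prob=0.5, err=0.01,
                            n_iter=200, tol=1e-10):
    """Estimate genotype frequencies without calling genotypes.

    counts: list of (ref_reads, alt_reads), one pair per individual.
    Returns (genotype_freqs, alt_allele_freq, per_individual_posteriors).
    """
    liks = [genotype_likelihoods(r, a, het_alt_prob, err) for r, a in counts]
    freqs = [1.0 / 3, 1.0 / 3, 1.0 / 3]
    posteriors = []
    for _ in range(n_iter):
        # E-step: genotype posteriors given the current frequencies.
        posteriors = []
        for lik in liks:
            weighted = [f * l for f, l in zip(freqs, lik)]
            total = sum(weighted)
            posteriors.append([w / total for w in weighted])
        # M-step: new frequencies are the mean posteriors.
        new_freqs = [sum(p[g] for p in posteriors) / len(posteriors)
                     for g in range(3)]
        change = max(abs(a - b) for a, b in zip(new_freqs, freqs))
        freqs = new_freqs
        if change < tol:
            break
    alt_allele_freq = freqs[1] / 2.0 + freqs[2]
    return freqs, alt_allele_freq, posteriors


if __name__ == "__main__":
    # Hypothetical data: heterozygotes in which the alternative allele is
    # seen in only about 20% of reads because of sampling bias.
    freqs, alt_freq, _ = em_genotype_frequencies([(8, 2)] * 10,
                                                 het_alt_prob=0.2)
    print([round(f, 3) for f in freqs], round(alt_freq, 3))
```

With `het_alt_prob=0.5` the same counts would pull the estimate toward the homozygous-reference genotype, which is the bias the abstract warns about when equal allele sampling is assumed.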

    An integrated map of structural variation in 2,504 human genomes

    Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association. © 2015 Macmillan Publishers Limited. All rights reserved

    Accurate detection of complex structural variations using single-molecule sequencing

    Structural variations are the greatest source of genetic variation, but they remain poorly understood because of technological limitations. Single-molecule long-read sequencing has the potential to dramatically advance the field, although high error rates are a challenge with existing methods. Addressing this need, we introduce open-source methods for long-read alignment (NGMLR; https://github.com/philres/ngmlr) and structural variant identification (Sniffles; https://github.com/fritzsedlazeck/Sniffles) that provide unprecedented sensitivity and precision for variant detection, even in repeat-rich regions and for complex nested events that can have substantial effects on human health. In several long-read datasets, including healthy and cancerous human genomes, we discovered thousands of novel variants and categorized systematic errors in short-read approaches. NGMLR and Sniffles can automatically filter false events and operate on low-coverage data, thereby reducing the high costs that have hindered the application of long reads in clinical and research settings.