9 research outputs found

    On the evolution of cancer genomes : Signatures of selection reveal cancer genes across multiple tumor types

    Tumors are composed of fast-growing cells that become malignant under selection for the biological functions needed for cancer development. In this thesis, I intend to uncover the basic evolutionary principles underlying cancer etiology. The first part constitutes a longitudinal analysis of a single CLL case, in which tumor heterogeneity and clonal evolution were revealed by sequencing. The second explores the signatures of positive selection of somatic mutations, allowing the identification of driver genes. The last part is an attempt to uncover the essential functions of the cancer cell using signals of purifying selection. Altogether, we have identified a landscape of cancer-related genes that can be used for improving current cancer treatments.
    The tumor is composed of cells that grow indiscriminately under the lens of natural selection. In this thesis we have attempted to reconstruct the basic principles of cancer evolution and how they describe the acquisition of mutations that initiate tumor malignancy. The first work is a genomic analysis of a patient with leukemia. The second explores intratumoral heterogeneity to identify cancer driver genes. The last work focuses on unmasking the signals of negative selection. The results of these three works constitute a source of new genes that can be explored as therapeutic targets in cancer.

    Jitterbug: somatic and germline transposon insertion detection at single-nucleotide resolution

    Background: Transposable elements are major players in genome evolution. Transposon insertion polymorphisms can translate into phenotypic differences in plants and animals and are linked to different diseases including human cancer, making their characterization highly relevant to the study of genome evolution and genetic diseases. Results: Here we present Jitterbug, a novel tool that identifies transposable element insertion sites at single-nucleotide resolution based on the paired-end mapping and clipped-read signatures produced by NGS alignments. Jitterbug can be easily integrated into existing NGS analysis pipelines, using the standard BAM format produced by frequently applied alignment tools (e.g. bwa, bowtie2), with no need to realign reads to a set of consensus transposon sequences. Jitterbug is highly sensitive and able to recall transposon insertions with a very high specificity, as demonstrated by benchmarks in the human and Arabidopsis genomes, and validation using long PacBio reads. In addition, Jitterbug estimates the zygosity of transposon insertions with high accuracy and can also identify somatic insertions. Conclusions: We demonstrate that Jitterbug can identify mosaic somatic transposon movement using sequenced tumor-normal sample pairs and allows for estimating the cancer cell fraction of clones containing a somatic TE insertion. We suggest that the independent methods we use to evaluate performance are a step towards creating a gold standard dataset for benchmarking structural variant prediction tools.

    Signatures of positive selection reveal a universal role of chromatin modifiers as cancer driver genes

    Tumors are composed of an evolving population of cells subjected to tissue-specific selection, which fuels tumor heterogeneity and ultimately complicates cancer driver gene identification. Here, we integrate cancer cell fraction, population recurrence, and functional impact of somatic mutations as signatures of selection into a Bayesian model for driver prediction. We demonstrate that our model, cDriver, outperforms competing methods when analyzing solid tumors, hematological malignancies, and pan-cancer datasets. Applying cDriver to exome sequencing data of 21 cancer types from 6,870 individuals revealed 98 unreported tumor type-driver gene connections. These novel connections are highly enriched for chromatin-modifying proteins, hinting at a universal role of chromatin regulation in cancer etiology. Although infrequently mutated as single genes, we show that chromatin modifiers are altered in a large fraction of cancer patients. In summary, we demonstrate that integration of evolutionary signatures is key for identifying mutational driver genes, thereby facilitating the discovery of novel therapeutic targets for cancer treatment.
    We acknowledge support of the Spanish Ministry of Economy and Competitiveness, 'Centro de Excelencia Severo Ochoa 2013-2017'. We acknowledge the support of the CERCA Programme/Generalitat de Catalunya. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 635290. Luis Zapata has been supported by the International PhD scholarship program of La Caixa at CRG.

    Negative selection in tumor genome evolution acts on essential cellular functions and the immunopeptidome

    Background: Natural selection shapes cancer genomes. Previous studies used signatures of positive selection to identify genes driving malignant transformation. However, the contribution of negative selection against somatic mutations that affect essential tumor functions or specific domains remains a controversial topic. Results: Here, we analyze 7546 individual exomes from 26 tumor types from TCGA data to explore the portion of the cancer exome under negative selection. Although we find most of the genes neutrally evolving in a pan-cancer framework, we identify essential cancer genes and immune-exposed protein regions under significant negative selection. Moreover, our simulations suggest that the amount of negative selection is underestimated. We therefore choose an empirical approach to identify genes, functions, and protein regions under negative selection. We find that expression and mutation status of negatively selected genes is indicative of patient survival. Processes that are most strongly conserved are those that play fundamental cellular roles such as protein synthesis, glucose metabolism, and molecular transport. Intriguingly, we observe strong signals of selection in the immunopeptidome and proteins controlling peptide exposition, highlighting the importance of immune surveillance evasion. Additionally, tumor type-specific immune activity correlates with the strength of negative selection on human epitopes. Conclusions: In summary, our results show that negative selection is a hallmark of cell essentiality and immune response in cancer. The functional domains identified could be exploited therapeutically, ultimately allowing for the development of novel cancer treatments.
    The research leading to these results received funding from the Spanish Ministry of Economy, Industry and Competitiveness (Plan Nacional BIO2012-39754, BFU2012-31329 and BFU2015-68723-P and to the EMBL partnership), “Centro de Excelencia Severo Ochoa 2013–2017,” SEV-2012–0208, the European Union Seventh Framework Programme (FP7/2007–2013) under grant agreement No. HEALTH-F4-2011–278568 (PRIMES), the European Fund for Regional Development (EFRD), European Union’s Horizon 2020 research and innovation programme under grant agreement No. 635290 (PanCanRisk), CERCA Programme / Generalitat de Catalunya, the HHMI International Early Career Scientist Program (55007424), Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement de la Generalitat’s AGAUR program (2014 SGR 0974), and the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013, ERC grant agreement 335980_EinME). LZ has been supported by the International PhD scholarship program of La Caixa at CRG and MS by the German Research Foundation (SCHA 1933/1-1).

    Analysis of a long-term outbreak of XDR Pseudomonas aeruginosa: a molecular epidemiological study

    OBJECTIVES: Here we report on a long-term outbreak from 2009 to 2012 with an XDR Pseudomonas aeruginosa on two wards at a university hospital in southern Germany. METHODS: Whole-genome sequencing was performed on the outbreak isolates and a core genome was constructed for molecular epidemiological analysis. We applied a time-place-sequence algorithm to improve estimation of transmission probabilities. RESULTS: By using conventional infection control methods we identified 49 P. aeruginosa strains, including eight environmental isolates that belonged to ST308 (by MLST) and carried the metallo-β-lactamase IMP-8. Phylogenetic analysis on the basis of a non-recombinant core genome that contained 22 outbreak-specific SNPs revealed a pattern of four dominant clades with a strong phylogeographic structure and allowed us to determine the potential temporal origin of the outbreak to July 2008, 1 year before the index case was diagnosed. Superspreaders at the root of clades exhibited a high number of probable and predicted transmissions, indicating their exceptional position in the outbreak. CONCLUSIONS: Our results suggest that the initial expansion of dominant sublineages was driven by a few superspreaders, while environmental contamination seemed to sustain the outbreak for a long period despite regular environmental control measures.
    This work was supported by a grant from ‘la Caixa’ to L.Z.

    Allele balance bias identifies systematic genotyping errors and false disease associations

    In recent years, next-generation sequencing (NGS) has become a cornerstone of clinical genetics and diagnostics. Many clinical applications require high precision, especially if rare events such as somatic mutations in cancer or genetic variants causing rare diseases need to be identified. Although random sequencing errors can be modeled statistically and deep sequencing minimizes their impact, systematic errors remain a problem even at high depth of coverage. Understanding their source is crucial to increase precision of clinical NGS applications. In this work, we studied the relation between recurrent biases in allele balance (AB), systematic errors, and false positive variant calls across a large cohort of human samples analyzed by whole exome sequencing (WES). We have modeled the AB distribution for biallelic genotypes in 987 WES samples in order to identify positions recurrently deviating significantly from the expectation, a phenomenon we termed allele balance bias (ABB). Furthermore, we have developed a genotype callability score based on ABB for all positions of the human exome, which detects false positive variant calls that passed state-of-the-art filters. Finally, we demonstrate the use of ABB for detection of false associations proposed by rare variant association studies. Availability: https://github.com/Francesc-Muyas/ABB
    Spanish Ministry of Economy and Competitiveness, Grant/Award Number: ‘Centro de Excelencia Severo Ochoa 2017–2021’; CERCA Programme/Generalitat de Catalunya; The “la Caixa” Foundation; CRG Emergent Translational Research Award; European Union's H2020 Research and Innovation Programme, Grant/Award Number: 635290 (PanCanRisk); MINECO Severo Ochoa Fellowship, Grant/Award Number: SVP‐2013‐0680066; PERIS Program, Grant/Award Number: SLT002/16/00310

    Chromosome-level assembly of Arabidopsis thaliana Ler reveals the extent of translocation and inversion polymorphisms

    Resequencing or reference-based assemblies reveal large parts of the small-scale sequence variation. However, they typically fail to separate such local variation into colinear and rearranged variation, because they usually do not recover the complement of large-scale rearrangements, including transpositions and inversions. Despite the availability of hundreds of genomes of diverse Arabidopsis thaliana accessions, there is so far only one full-length assembled genome: the reference sequence. We have assembled 117 Mb of the A. thaliana Landsberg erecta (Ler) genome into five chromosome-equivalent sequences using a combination of short Illumina reads, long PacBio reads, and linkage information. Whole-genome comparison against the reference sequence revealed 564 transpositions and 47 inversions comprising ∼3.6 Mb, in addition to 4.1 Mb of nonreference sequence, mostly originating from duplications. Although rearranged regions are not different in local divergence from colinear regions, they are drastically depleted for meiotic recombination in heterozygotes. Using a 1.2-Mb inversion as an example, we show that such rearrangement-mediated reduction of meiotic recombination can lead to genetically isolated haplotypes in the worldwide population of A. thaliana. Moreover, we found 105 single-copy genes, which were only present in the reference sequence or the Ler assembly, and 334 single-copy orthologs, which showed an additional copy in only one of the genomes. To our knowledge, this work gives the first insights into the degree and type of variation that will be revealed once complete assemblies replace resequencing or other reference-dependent methods.
    Knowledge of the exact distribution of meiotic crossovers (COs) and gene conversions (GCs) is essential for understanding many aspects of population genetics and evolution, from haplotype structure and long-distance genetic linkage to the generation of new allelic variants of genes. To this end, we resequenced the four products of 13 meiotic tetrads along with 10 doubled haploids derived from Arabidopsis thaliana hybrids. GC detection through short reads has previously been confounded by genomic rearrangements. Rigid filtering for misaligned reads allowed GC identification at high accuracy and revealed an ∼80-kb transposition, which undergoes copy-number changes mediated by meiotic recombination. Non-crossover associated GCs were extremely rare, most likely due to their short average length of ∼25-50 bp, which is significantly shorter than the length of CO-associated GCs. Overall, recombination preferentially targeted non-methylated nucleosome-free regions at gene promoters, which showed significant enrichment of two sequence motifs.
    This work was supported by Spanish Ministry of Economy and Competitiveness Centro de Excelencia Severo Ochoa 2013-2017 Grant SEV-2012-0208. L.Z. was supported by the International PhD scholarship program of La Caixa at CR

    Identification of gene mutations and fusion genes in patients with Sézary syndrome.

    Sézary syndrome is a leukemic form of cutaneous T-cell lymphoma with an aggressive clinical course. The genetic etiology of the disease is poorly understood, with chromosomal abnormalities and mutations in some genes being involved in the disease. The goal of our study was to understand the genetic basis of the disease by looking for driver gene mutations and fusion genes in 15 erythrodermic patients with circulating Sézary cells, 14 of them fulfilling the diagnostic criteria of Sézary syndrome. We have discovered genes that could be involved in the pathogenesis of Sézary syndrome. Some of the genes that are affected by somatic point mutations include ITPR1, ITPR2, DSC1, RIPK2, IL6, and RAG2, with some of them mutated in more than one patient. We observed several somatic copy number variations shared between patients, including deletions and duplications of large segments of chromosome 17. Genes with potential function in the T-cell receptor signaling pathway and tumorigenesis were disrupted in Sézary syndrome patients, for example, CBLB, RASA2, BCL7C, RAMP3, TBRG4, and DAD1. Furthermore, we discovered several fusion events of interest involving RASA2, NFKB2, BCR, FASN, ZEB1, TYK2, and SGMS1. Our work has implications for the development of potential therapeutic approaches for this aggressive disease.
    This project was funded by “Retos de la Sociedad 2013: Europa Redes y Gestores” Programme from the Spanish Ministry of Economy and Competitiveness no. SAF2013-49108-R (to XE) and RD12/0036/0044 Red Temática de Investigación Cooperativa en Cancer, Fondo Europeo de Desarrollo Regional (to BE, FG, and RP), the Generalitat de Catalunya AGAUR 2014 SGR-1138 (to XE) and 2014 SGR-585 (to BE, A Puiggros, and FG), the European Commission 7th Framework Program (FP7/2007-2013) under grant agreement 282510 (A BLUEPRINT of haematopoietic Epigenomes to XE) and 262055 (European Sequencing and Genotyping Infrastructure to XE), Instituto de Salud Carlos III FEDER (PT13/0010/0005), and the “Xarxa de Bancs de tumors sponsored by Pla Director d’Oncologia de Catalunya.” We would also like to thank “Xarxa de Limfomes Cutanis de Catalunya.” A Prasad is a Marie Curie Postdoctoral fellow supported by the European Commission 7th framework program (FP7/2007-2013) under grant agreement no. 625356. We acknowledge the support of the Spanish Ministry of Economy and Competitiveness, Centro de Excelencia Severo Ochoa 2013-2017, SEV-2012-0208.