69 research outputs found
Translocation and deletion breakpoints in cancer genomes are associated with potential non-B DNA-forming sequences
Gross chromosomal rearrangements (including translocations, deletions, insertions and duplications) are a hallmark of cancer genomes and often create oncogenic fusion genes. An obligate step in the generation of such gross rearrangements is the formation of DNA double-strand breaks (DSBs). Since the genomic distribution of rearrangement breakpoints is non-random, intrinsic cellular factors may predispose certain genomic regions to breakage. Notably, certain DNA sequences with the potential to fold into secondary structures [potential non-B DNA structures (PONDS); e.g. triplexes, quadruplexes, hairpin/cruciforms, Z-DNA and single-stranded looped-out structures with implications in DNA replication and transcription] can stimulate the formation of DNA DSBs. Here, we tested the postulate that these DNA sequences might be found at, or in close proximity to, rearrangement breakpoints. By analyzing the distribution of PONDS-forming sequences within ±500 bases of 19 947 translocation and 46 365 sequence-characterized deletion breakpoints in cancer genomes, we find significant association between PONDS-forming repeats and cancer breakpoints. Specifically, (AT)n, (GAA)n and (GAAA)n constitute the most frequent repeats at translocation breakpoints, whereas A-tracts occur preferentially at deletion breakpoints. Translocation breakpoints near PONDS-forming repeats also recur in different individuals and patient tumor samples. Hence, PONDS-forming sequences represent an intrinsic risk factor for genomic rearrangements in cancer genomes
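The window-based analysis described in this abstract can be illustrated with a minimal sketch. It counts tracts of a given repeat unit, such as (AT)n, within a fixed flank around a breakpoint. The function name `count_repeat_tracts`, the 0-based breakpoint coordinate, and the `min_units` threshold are assumptions made for illustration; the abstract does not state the repeat-length criteria the authors applied.

```python
import re


def count_repeat_tracts(sequence, breakpoint, motif, min_units=3, flank=500):
    """Count tracts of `motif` repeated at least `min_units` times within
    `flank` bases on either side of a 0-based breakpoint position.

    The `min_units` threshold is an illustrative assumption, not a value
    taken from the abstract.
    """
    start = max(0, breakpoint - flank)
    end = min(len(sequence), breakpoint + flank)
    window = sequence[start:end].upper()
    # Non-capturing group so findall returns one entry per whole tract.
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif.upper()), min_units))
    return len(pattern.findall(window))


if __name__ == "__main__":
    # Toy sequence: an (AT)5 tract near the breakpoint, a (GAA)4 tract far away.
    toy = "C" * 600 + "AT" * 5 + "C" * 600 + "GAA" * 4 + "C" * 100
    print(count_repeat_tracts(toy, 605, "AT"))
    print(count_repeat_tracts(toy, 605, "GAA"))
```

Assessing significance, as the abstract reports, would additionally require comparing such counts against control positions, which this sketch does not attempt.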
Using Supercomputing Resources in Genomic Research
TACC resources have proven critical for mining cancer genomic data, genomic variants associated with human disease, and polymorphic human traits, addressing biological questions not otherwise approachable by conventional experiments. We have developed computational scripts that we use in a parallel environment to harness the capabilities of TACC HPCs, and which we have made publicly available on GitHub. In selected peer-reviewed publications acknowledging TACC support, we have reported the association of DNA sequences able to form alternative DNA structures (or non-B DNA) with sites of chromosomal breaks leading to gross chromosomal translocations in cancer genomes, with sites of gene duplication predisposing to Parkinson’s disease, and most recently with regions of increased polymorphism in the human population. We found an exquisite correlation between the expression of selected genes and the mutational burden in cancer patients. While solving the crystal structure of a poorly characterized exonuclease, named EXO5, TACC resources enabled the assignment of a role for EXO5 in the cellular response to DNA damage, a vital pathway used by tumors to survive and grow, along with key genes whose high expression is linked to poor survival in cancer patients. Most recently, during the discovery of a nuclear role for GRB2, an adaptor protein previously thought to act only in the cytoplasm, TACC resources enabled us to test hypotheses derived from laboratory data. We were gratified to confirm the laboratory prediction that high expression of GRB2, together with its binding partner the MRE11 nuclease, carries accurate prognostic power for poor patient survival in breast cancer patients proficient in DNA homology-directed repair.
These composite findings, significantly facilitated by TACC resources, have been critical in furthering our understanding of biological processes relevant to human disease, and in providing knowledge for the development of more precise therapeutic tools aimed at improving human health.
Distinct sequence features underlie microdeletions and gross deletions in the human genome
Microdeletions and gross deletions are important causes (~20%) of human inherited disease and their genomic locations are strongly influenced by the local DNA sequence environment. This notwithstanding, no study has systematically examined their underlying generative mechanisms. Here, we obtained 42,098 pathogenic microdeletions and gross deletions from the Human Gene Mutation Database (HGMD) that together form a continuum of germline deletions ranging in size from 1 bp to 28,394,429 bp. We analyzed the DNA sequence within 1 kb of the breakpoint junctions and found that the frequencies of non-B DNA-forming repeats, GC-content, and the presence of seven of 78 specific sequence motifs in the vicinity of pathogenic deletions correlated with deletion length for deletions of length ≤30 bp. Further, we found that the presence of DR, GQ and STR repeats is important for the formation of longer deletions (>30 bp) but not for the formation of shorter deletions (≤30 bp) whilst significantly (Chi-square test P-value […] 30 bp). We provide evidence to support a functional distinction between microdeletions and gross deletions. Finally, we propose that a deletion length cut-off of 25-30 bp may serve as an objective means to functionally distinguish microdeletions from gross deletions.
A functional link between lariat debranching enzyme and the intron-binding complex is defective in non-photosensitive trichothiodystrophy.
The pre-mRNA life cycle requires intron processing; yet, how intron-processing defects influence splicing and gene expression is unclear. Here, we find that TTDN1/MPLKIP, which is encoded by a gene implicated in non-photosensitive trichothiodystrophy (NP-TTD), functionally links intron lariat processing to spliceosomal function. The conserved TTDN1 C-terminal region directly binds lariat debranching enzyme DBR1, whereas its N-terminal intrinsically disordered region (IDR) binds the intron-binding complex (IBC). TTDN1 loss, or a mutated IDR, causes significant intron lariat accumulation, as well as splicing and gene expression defects, mirroring phenotypes observed in NP-TTD patient cells. A Ttdn1-deficient mouse model recapitulates intron-processing defects and certain neurodevelopmental phenotypes seen in NP-TTD. Fusing DBR1 to the TTDN1 IDR is sufficient to recruit DBR1 to the IBC and circumvents the functional requirement for TTDN1. Collectively, our findings link RNA lariat processing with splicing outcomes by revealing the molecular function of TTDN1
Heritable pattern of oxidized DNA base repair coincides with pre-targeting of repair complexes to open chromatin
Human genome stability requires efficient repair of oxidized bases, which is initiated via damage recognition and excision by NEIL1 and other base excision repair (BER) pathway DNA glycosylases (DGs). However, the biological mechanisms underlying detection of damaged bases among the million-fold excess of undamaged bases remain enigmatic. Indeed, mutation rates vary greatly within individual genomes, and lesion recognition by purified DGs in the chromatin context is inefficient. Employing super-resolution microscopy and co-immunoprecipitation assays, we find that acetylated NEIL1 (AcNEIL1), but not its non-acetylated form, is predominantly localized in the nucleus in association with epigenetic marks of uncondensed chromatin. Furthermore, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) revealed non-random AcNEIL1 binding near transcription start sites of weakly transcribed genes and along highly transcribed chromatin domains. Bioinformatic analyses revealed a striking correspondence between AcNEIL1 occupancy along the genome and mutation rates, with AcNEIL1-occupied sites exhibiting fewer mutations compared to AcNEIL1-free domains, both in cancer genomes and in population variation. Intriguingly, from the evolutionarily conserved unstructured domain that targets NEIL1 to open chromatin, its damage surveillance of highly oxidation-susceptible sites to preserve essential gene function and to limit instability and cancer likely originated ∼500 million years ago during the buildup of free atmospheric oxygen
The somatic autosomal mutation matrix in cancer genomes
DNA damage in somatic cells originates from both environmental and endogenous sources, giving rise to mutations through multiple mechanisms. When these mutations affect the function of critical genes, cancer may ensue. Although identifying genomic subsets of mutated genes may inform therapeutic options, a systematic survey of tumor mutational spectra is required to improve our understanding of the underlying mechanisms of mutagenesis involved in cancer etiology. Recent studies have presented genome-wide sets of somatic mutations as a 96-element vector, a procedure that only captures the immediate neighbors of the mutated nucleotide. Herein, we present a 32 × 12 mutation matrix that captures the nucleotide pattern two nucleotides upstream and downstream of the mutation. A somatic autosomal mutation matrix (SAMM) was constructed from tumor-specific mutations derived from each of 909 individual cancer genomes harboring a total of 10,681,843 single-base substitutions. In addition, mechanistic template mutation matrices (MTMMs) representing oxidative DNA damage, ultraviolet-induced DNA damage, 5mCpG deamination, and APOBEC-mediated cytosine mutation, are presented. MTMMs were mapped to the individual tumor SAMMs to determine the maximum contribution of each mutational mechanism to the overall mutation pattern. A Manhattan distance across all SAMM elements between any two tumor genomes was used to determine their relative distance. Employing this metric, 89.5 % of all tumor genomes were found to have a nearest neighbor from the same tissue of origin. When a distance-dependent 6-nearest neighbor classifier was used, 86.9 % of all SAMMs were assigned to the correct tissue of origin. Thus, although tumors from different tissues may have similar mutation patterns, their SAMMs often display signatures that are characteristic of specific tissues
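The distance and classification steps described in this abstract can be sketched as follows. The sketch treats each SAMM as a flattened list of element values, computes the Manhattan distance between two such lists, and assigns a tissue by a distance-weighted vote among the k nearest labelled genomes. The inverse-distance weighting, the function names, and the toy three-element vectors are assumptions for illustration; the abstract does not specify how the "distance-dependent" vote was weighted or how the 32 × 12 elements are laid out.

```python
from collections import defaultdict


def manhattan(a, b):
    """Sum of absolute element-wise differences between two flattened matrices."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(query, labelled, k=6):
    """Assign a tissue label by an inverse-distance-weighted vote among the
    k nearest labelled matrices.

    `labelled` is a list of (flattened_matrix, tissue) pairs. The 1/distance
    weighting is an assumed scheme, not one confirmed by the abstract.
    """
    neighbours = sorted(
        (manhattan(query, vec), tissue) for vec, tissue in labelled
    )[:k]
    votes = defaultdict(float)
    for dist, tissue in neighbours:
        votes[tissue] += 1.0 / (dist + 1e-9)  # small constant avoids division by zero
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Toy three-element "matrices"; a real SAMM would have 32 x 12 = 384 elements.
    reference = [
        ([1.0, 0.0, 0.0], "lung"),
        ([0.9, 0.1, 0.0], "lung"),
        ([0.0, 0.0, 1.0], "skin"),
        ([0.0, 0.1, 0.9], "skin"),
    ]
    print(classify([0.8, 0.2, 0.0], reference, k=3))
```

Comparing genomes with very different mutation counts would likely require normalizing each matrix first; whether and how the authors did so is not stated in the abstract.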
Guanine Holes Are Prominent Targets for Mutation in Cancer and Inherited Disease
Albino Bacolla, Guliang Wang, Aklank Jain, Karen M. Vasquez, Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
Albino Bacolla, Nuri A. Temiz, Ming Yi, Joseph Ivanic, Regina Z. Cer, Duncan E. Donohue, Uma S. Mudunuri, Natalia Volfovsky, Brian T. Luke, Robert M. Stephens, Jack R. Collins, Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
Edward V. Ball, David N. Cooper, Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G•C bp in the context of all 64 5′-NGNN-3′ motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.
This work was supported by grants from the NIH (CA097175 and CA093729) to KMV, NCI/NIH contract HHSN261200800001E to AB and the Frederick National Laboratory for Cancer Research, and CBIIT/caBIG ISRCE yellow task #09-260 to the Frederick National Laboratory for Cancer Research. DNC and EVB received financial support from BIOBASE GmbH through a license agreement (for HGMD) with Cardiff University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
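The 64 5′-NGNN-3′ motifs mentioned in this abstract follow from fixing G at the second position and varying the other three positions over four bases (4 × 4 × 4 = 64). A minimal sketch of enumerating them and tallying mutated guanines by motif is given below. The function name `count_mutated_motifs`, the 0-based positions, and the single-strand treatment are assumptions for illustration; handling mutations at C (a guanine on the opposite strand) would require reverse-complementing the context, which is omitted here.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

# All 5'-NGNN-3' motifs: G fixed at the second position, N free elsewhere.
NGNN_MOTIFS = [n1 + "G" + n3 + n4 for n1, n3, n4 in product(BASES, repeat=3)]


def count_mutated_motifs(sequence, mutated_positions):
    """Tally the NGNN context of each mutated guanine.

    `mutated_positions` holds 0-based indices into `sequence` (an assumed
    convention). Positions that are not G, or that lack one upstream and two
    downstream bases, are skipped.
    """
    counts = Counter()
    seq = sequence.upper()
    for pos in mutated_positions:
        if pos < 1 or pos + 3 > len(seq):
            continue  # incomplete context at a sequence end
        motif = seq[pos - 1:pos + 3]
        if motif[1] == "G" and set(motif) <= set(BASES):
            counts[motif] += 1
    return counts


if __name__ == "__main__":
    print(len(NGNN_MOTIFS))
    print(count_mutated_motifs("ACGTAGGCA", [2, 5, 6]))
```

Counts of this kind are what the abstract reports correlating with base-stacking free energies and guanine ionization potentials; those energy values are not reproduced here.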
The Role of Methylation in the Intrinsic Dynamics of B- and Z-DNA
Methylation of cytosine at the 5-carbon position (5mC) is observed in both prokaryotes and eukaryotes. In humans, DNA methylation at CpG sites plays an important role in gene regulation and has been implicated in development, gene silencing, and cancer. In addition, the CpG dinucleotide is a known hot spot for pathologic mutations genome-wide. CpG tracts may adopt left-handed Z-DNA conformations, which have also been implicated in gene regulation and genomic instability. Methylation facilitates this B-Z transition but the underlying mechanism remains unclear. Herein, four structural models of the dinucleotide d(GC)5 repeat sequence in B-, methylated B-, Z-, and methylated Z-DNA forms were constructed and an aggregate 100 nanoseconds of molecular dynamics simulations in explicit solvent under physiological conditions was performed for each model. Both unmethylated and methylated B-DNA were found to be more flexible than Z-DNA. However, methylation significantly destabilized the BII, relative to the BI, state through the Gp5mC steps. In addition, methylation decreased the free energy difference between B- and Z-DNA. Comparisons of α/γ backbone torsional angles showed that torsional states changed marginally upon methylation for both B-DNA and Z-DNA. Methylation-induced conformational changes and lower energy differences may contribute to the transition to Z-DNA by methylated, over unmethylated, B-DNA and may be a contributing factor to biological function.
New Perspectives on DNA and RNA Triplexes As Effectors of Biological Activity.
Since the first description of the canonical B-form DNA double helix, it has been suggested that alternative DNA, DNA-RNA, and RNA structures exist and act as functional genomic elements. Indeed, over the past few years it has become clear that, in addition to serving as a repository for genetic information, genomic DNA elicits biological responses by adopting conformations that differ from the canonical right-handed double helix, and by interacting with RNA molecules to form complex secondary structures. This review focuses on recent advances on three-stranded (triplex) nucleic acids, with an emphasis on DNA-RNA and RNA-RNA interactions. Emerging work reveals that triplex interactions between noncoding RNAs and duplex DNA serve as platforms for delivering site-specific epigenetic marks critical for the regulation of gene expression. Additionally, an increasing body of genetic and structural studies demonstrates that triplex RNA-RNA interactions are essential for performing catalytic and regulatory functions in cellular nucleoprotein complexes, including spliceosomes and telomerases, and for enabling protein recoding during programmed ribosomal frameshifting. Thus, evidence is mounting that DNA and RNA triplex interactions are implemented to perform a range of diverse biological activities in the cell, some of which will be discussed in this review