71 research outputs found
Identification of Trace Element-Containing Proteins in Genomic Databases
The development of bioinformatics tools has given researchers the ability to identify full sets of trace element–containing proteins in organisms for which complete genomic sequences are available. Recently, independent bioinformatics methods were used to identify all, or almost all, genes encoding selenocysteine-containing proteins in the human, mouse, and Drosophila genomes, characterizing the entire selenoproteomes of these organisms. It should also be possible to search for entire sets of other trace element–associated proteins, such as metal-containing proteins, although methods for their identification are still in development
Analysis of cancer genomes reveals basic features of human aging and its role in cancer development
Somatic mutations have long been implicated in aging and disease, but their impact on fitness and function is difficult to assess. Here by analysing human cancer genomes we identify mutational patterns associated with aging. Our analyses suggest that age-associated mutation load and burden double approximately every 8 years, similar to the all-cause mortality doubling time. This analysis further reveals variance in the rate of aging among different human tissues, for example, slightly accelerated aging of the reproductive system. Age-adjusted mutation load and burden correlate with the corresponding cancer incidence and precede it on average by 15 years, pointing to pre-clinical cancer development times. Behaviour of mutation load also exhibits gender differences and late-life reversals, explaining some gender-specific and late-life patterns in cancer incidence rates. Overall, this study characterizes some features of human aging and offers a mechanism for age being a risk factor for the onset of cancer
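As a rough illustration of the doubling-time claim in this abstract, the sketch below assumes a simple exponential model in which load doubles every 8 years. The function names and the exponential form are assumptions made here for illustration; they are not taken from the study.

```python
import math

def load_ratio(age_gap_years, doubling_time_years=8.0):
    """Fold change in load across an age gap, assuming exponential growth."""
    return 2.0 ** (age_gap_years / doubling_time_years)

def annual_growth_rate(doubling_time_years=8.0):
    """Continuous per-year growth rate implied by a given doubling time."""
    return math.log(2.0) / doubling_time_years
```

Under this assumption, a 16-year age gap corresponds to a fourfold difference in load, and the implied continuous rate is ln(2)/8, roughly 0.087 per year.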
Pooled Association Tests for Rare Variants in Exon-Resequencing Studies
Deep sequencing will soon generate comprehensive sequence information in large disease samples. Although the power to detect association with an individual rare variant is limited, pooling variants by gene or pathway into a composite test provides an alternative strategy for identifying susceptibility genes. We describe a statistical method for detecting association of multiple rare variants in protein-coding genes with a quantitative or dichotomous trait. The approach is based on the regression of phenotypic values on individuals' genotype scores subject to a variable allele-frequency threshold, incorporating computational predictions of the functional effects of missense variants. Statistical significance is assessed by permutation testing with variable thresholds. We used a rigorous population-genetics simulation framework to evaluate the power of the method, and we applied the method to empirical sequencing data from three disease studies
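The variable-threshold idea described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it omits the functional-effect weights for missense variants, uses a one-sided standardized score chosen here for brevity, and all function and parameter names are hypothetical. Rows of `genotypes` are individuals, columns are variants, and entries are minor-allele counts.

```python
import math
import random

def vt_statistic(genotypes, phenotypes, mafs):
    """Maximum, over allele-frequency thresholds, of a pooled score."""
    mean_ph = sum(phenotypes) / len(phenotypes)
    centered = [p - mean_ph for p in phenotypes]
    best = float("-inf")
    for threshold in sorted(set(mafs)):
        # Pool every variant whose allele frequency is at or below the threshold.
        keep = [j for j, f in enumerate(mafs) if f <= threshold]
        burden = [sum(row[j] for j in keep) for row in genotypes]
        denom = math.sqrt(sum(b * b for b in burden))
        if denom == 0:
            continue  # no carriers at this threshold
        z = sum(b * c for b, c in zip(burden, centered)) / denom
        best = max(best, z)
    return best

def vt_permutation_pvalue(genotypes, phenotypes, mafs, n_perm=1000, seed=0):
    """Permutation p-value: shuffle phenotypes and recompute the maximum."""
    rng = random.Random(seed)
    observed = vt_statistic(genotypes, phenotypes, mafs)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if vt_statistic(genotypes, shuffled, mafs) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Because the maximum over thresholds is recomputed for every permutation, the p-value accounts for having searched over thresholds rather than fixing one in advance.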
Selenoprotein gene nomenclature
The human genome contains 25 genes coding for selenocysteine-containing proteins (selenoproteins). These proteins are involved in a variety of functions, most notably redox homeostasis. Selenoprotein enzymes with known functions are designated according to these functions: TXNRD1, TXNRD2, and TXNRD3 (thioredoxin reductases), GPX1, GPX2, GPX3, GPX4 and GPX6 (glutathione peroxidases), DIO1, DIO2, and DIO3 (iodothyronine deiodinases), MSRB1 (methionine-R-sulfoxide reductase 1) and SEPHS2 (selenophosphate synthetase 2). Selenoproteins without known functions have traditionally been denoted by SEL or SEP symbols. However, these symbols are sometimes ambiguous and conflict with the approved nomenclature for several other genes. Therefore, there is a need to implement a rational and coherent nomenclature system for selenoprotein-encoding genes. Our solution is to use the root symbol SELENO followed by a letter. This nomenclature applies to SELENOF (selenoprotein F, the 15 kDa selenoprotein, SEP15), SELENOH (selenoprotein H, SELH, C11orf31), SELENOI (selenoprotein I, SELI, EPT1), SELENOK (selenoprotein K, SELK), SELENOM (selenoprotein M, SELM), SELENON (selenoprotein N, SEPN1, SELN), SELENOO (selenoprotein O, SELO), SELENOP (selenoprotein P, SeP, SEPP1, SELP), SELENOS (selenoprotein S, SELS, SEPS1, VIMP), SELENOT (selenoprotein T, SELT), SELENOV (selenoprotein V, SELV) and SELENOW (selenoprotein W, SELW, SEPW1). This system, approved by the HUGO Gene Nomenclature Committee, also resolves conflicting, missing and ambiguous designations for selenoprotein genes and is applicable to selenoproteins across vertebrates
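The alias-to-symbol relationships listed in this abstract can be captured in a small lookup table. The sketch below copies the symbols and aliases from the abstract; the function name `to_approved_symbol` and the exact-match lookup are assumptions made for illustration.

```python
# SELENO root symbols and the legacy aliases named in the abstract.
SELENO_ALIASES = {
    "SELENOF": ["SEP15"],
    "SELENOH": ["SELH", "C11orf31"],
    "SELENOI": ["SELI", "EPT1"],
    "SELENOK": ["SELK"],
    "SELENOM": ["SELM"],
    "SELENON": ["SEPN1", "SELN"],
    "SELENOO": ["SELO"],
    "SELENOP": ["SeP", "SEPP1", "SELP"],
    "SELENOS": ["SELS", "SEPS1", "VIMP"],
    "SELENOT": ["SELT"],
    "SELENOV": ["SELV"],
    "SELENOW": ["SELW", "SEPW1"],
}

# Reverse index: legacy alias -> approved symbol.
ALIAS_TO_SYMBOL = {
    alias: symbol
    for symbol, aliases in SELENO_ALIASES.items()
    for alias in aliases
}

def to_approved_symbol(name):
    """Return the SELENO symbol for a legacy alias, or None if not listed."""
    if name in SELENO_ALIASES:
        return name
    return ALIAS_TO_SYMBOL.get(name)
```

As the abstract notes, some legacy SEL and SEP symbols conflict with the approved names of other genes, so a lookup from alias to symbol should be treated as best-effort and checked against the gene's context.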
ARID1B is a specific vulnerability in ARID1A-mutant cancers
Recent studies have revealed that ARID1A is frequently mutated across a wide variety of human cancers and also has bona fide tumor suppressor properties. Consequently, identification of vulnerabilities conferred by ARID1A mutation would have major relevance for human cancer. Here, using a broad screening approach, we identify ARID1B, a related but mutually exclusive homolog of ARID1A in the SWI/SNF chromatin remodeling complex, as the number one gene preferentially required for the survival of ARID1A-mutant cancer cell lines. We show that loss of ARID1B in ARID1A-deficient backgrounds destabilizes SWI/SNF and impairs proliferation. Intriguingly, we also find that ARID1A and ARID1B are frequently co-mutated in cancer, but that ARID1A-deficient cancers retain at least one ARID1B allele. These results suggest that loss of ARID1A and ARID1B alleles cooperatively promotes cancer formation but also results in a unique functional dependence. The results further identify ARID1B as a potential therapeutic target for ARID1A-mutant cancers
Opposing effects of cancer-type-specific SPOP mutants on BET protein degradation and sensitivity to BET inhibitors
It is generally assumed that recurrent mutations within a given cancer driver gene elicit similar drug responses. Cancer genome studies have identified recurrent but divergent missense mutations affecting the substrate-recognition domain of the ubiquitin ligase adaptor SPOP in endometrial and prostate cancers. The therapeutic implications of these mutations remain incompletely understood. Here we analyzed changes in the ubiquitin landscape induced by endometrial cancer-associated SPOP mutations and identified BRD2, BRD3 and BRD4 proteins (BETs) as SPOP-CUL3 substrates that are preferentially degraded by endometrial cancer-associated SPOP mutants. The resulting reduction of BET protein levels sensitized cancer cells to BET inhibitors. Conversely, prostate cancer-specific SPOP mutations resulted in impaired degradation of BETs, promoting their resistance to pharmacologic inhibition. These results uncover an oncogenomics paradox, whereby mutations mapping to the same domain evoke opposing drug susceptibilities. Specifically, we provide a molecular rationale for the use of BET inhibitors to treat patients with endometrial but not prostate cancer who harbor SPOP mutations
Mutational heterogeneity in cancer and the search for new cancer genes
Major international projects are now underway aimed at creating a comprehensive catalog of all genes responsible for the initiation and progression of cancer. These studies involve sequencing of matched tumor–normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here, we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false positive findings that overshadow true driver events. Here, we show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumor-normal pairs and discover extraordinary variation in (i) mutation frequency and spectrum within cancer types, which shed light on mutational processes and disease etiology, and (ii) mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and allow true cancer genes to rise to attention
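The effect of mutational heterogeneity on significance calls can be illustrated with a toy calculation. The sketch below is not MutSigCV: it only compares a Poisson tail probability under a single genome-wide background rate with the same test under a gene-specific rate. The Poisson model, the function names, and the example numbers are all assumptions chosen for illustration.

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def gene_pvalue(observed, rate_per_base, coding_length, n_samples):
    """Tail probability of the observed mutation count given a background rate."""
    expected = rate_per_base * coding_length * n_samples
    return poisson_tail(observed, expected)

# Hypothetical long gene: 30 mutations across 100 samples, 100,000 coding bases.
p_uniform = gene_pvalue(30, 1e-6, 100_000, 100)   # genome-wide average rate
p_specific = gene_pvalue(30, 3e-6, 100_000, 100)  # higher local background rate
```

With these made-up numbers the gene looks strongly significant under the uniform rate but unremarkable once its own higher background rate is used, which is the kind of false positive the abstract attributes to ignoring heterogeneity.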
Identification and Classification of Conserved RNA Secondary Structures in the Human Genome
The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebrafish, and pufferfish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3′UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript autoregulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization
New Mammalian Selenocysteine-containing Proteins Identified with an Algorithm That Searches for Selenocysteine Insertion Sequence Elements
Mammalian selenium-containing proteins identified thus far contain selenium in the form of a selenocysteine residue encoded by UGA. These proteins lack common amino acid sequence motifs, but the 3′-untranslated regions of selenoprotein genes contain a common stem-loop structure, the selenocysteine insertion sequence (SECIS) element, that is necessary for decoding UGA as selenocysteine rather than a stop signal. We describe here a computer program, SECISearch, that identifies mammalian selenoprotein genes by recognizing SECIS elements on the basis of their primary and secondary structures and free energy requirements. When SECISearch was applied to search human dbEST, two new mammalian selenoproteins, designated SelT and SelR, were identified. We determined their cDNA sequences and expressed them in a monkey cell line as fusion proteins with a green fluorescent protein. Incorporation of selenium into the new proteins was confirmed by metabolic labeling with ⁷⁵Se, and expression of SelT was additionally documented in immunoblot assays. SelT and SelR did not have homology to previously characterized proteins, but their putative homologs were detected in various organisms. SelR homologs were present in every organism characterized by complete genome sequencing. The data suggest applicability of SECISearch for identification of new selenoprotein genes in nucleotide databases
- …