71 research outputs found

    GeneTide—Terra Incognita Discovery Endeavor: a new transcriptome focused member of the GeneCards/GeneNote suite of databases

    GeneCards® is an automatically mined database of human genes that strives to create, along with its auxiliary databases—GeneLoc, GeneNote and GeneAnnot—the most inclusive resource of gene-centered information of the human genome. GeneTide, the Gene Terra Incognita Discovery Endeavor (http://genecards.weizmann.ac.il/genetide/), the newest addition to this family, is a transcriptome-focused database which aims to enhance GeneCards with additional expressed sequence tag (EST)-based genes. This is achieved by comprehensively mapping >85% of the ∼5.6 million human ESTs currently available at dbEST to known genes by means of data mining and integration of genomic resources including UniGene, DoTS, AceView and in-house resources. GeneTide thus creates comprehensive links between ESTs and GeneCards genes. Furthermore, groups of unassociated transcripts serve as a basis for defining novel EST-based GeneCards Candidates (EGCs). These EGCs, nearly 25 000 of which were defined in version 0.3 of GeneTide, are further annotated with various parameters, including splicing evidence and expression data extracted from the GeneNote database, to determine their validity as possible de novo genes

    Expoldb: expression linked polymorphism database with inbuilt tools for analysis of expression and simple repeats

    BACKGROUND: Quantitative variation in gene expression has been proposed to underlie phenotypic variation among human individuals. A facilitating step towards understanding the basis for gene expression variability is associating genome-wide transcription patterns with potential cis modifiers of gene expression. DESCRIPTION: EXPOLDB, a novel database, addresses this need by providing information on the variability of gene expression levels across individuals, as well as the presence and features of potentially polymorphic (TG/CA)n repeats. EXPOLDB thus enables associating transcription levels with the presence and length of (TG/CA)n repeats. One of the unique features of this database is the display of expression data for 5 pairs of monozygotic twins, which allows identification of genes whose variability in expression is influenced by non-genetic factors including environment. In addition to queries by gene name, EXPOLDB allows for queries by pathway name. Users can also upload their own list of HUGO (Human Genome Organisation) Gene Nomenclature Committee (HGNC) symbols for interrogating expression patterns. The online application 'SimRep' can be used to find simple repeats in a given nucleotide sequence. To help illustrate primary applications, case examples of housekeeping genes and the RUNX gene family, as well as one example of glycolytic pathway genes, are provided. CONCLUSION: The uniqueness of EXPOLDB is in facilitating the association of genome-wide transcription variations with the presence and type of polymorphic repeats while offering the feature for identifying genes whose expression variability is influenced by non-genetic factors including environment. In addition, the database allows comprehensive querying including functional information on biochemical pathways of the human genes. EXPOLDB can be accessed a
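
    The abstract above mentions finding simple repeats such as (TG/CA)n in a nucleotide sequence. As a rough illustration only, the sketch below locates uninterrupted TG or CA runs with a regular expression. It is not the SimRep implementation, whose algorithm the abstract does not describe; the function name `find_tg_ca_repeats` and the `min_units` threshold are assumptions made for this example.

```python
import re


def find_tg_ca_repeats(seq, min_units=6):
    """Return (start, end, unit, n_units) for each uninterrupted TG or CA run.

    min_units is a hypothetical cutoff chosen for illustration; it is not
    taken from EXPOLDB or SimRep.
    """
    seq = seq.upper()
    # Match at least min_units consecutive copies of TG, or of CA.
    pattern = re.compile(r"(?:TG){%d,}|(?:CA){%d,}" % (min_units, min_units))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(0)[:2]              # "TG" or "CA"
        n_units = len(m.group(0)) // 2     # number of dinucleotide copies
        hits.append((m.start(), m.end(), unit, n_units))
    return hits


# Toy sequence: 8 copies of TG and 6 copies of CA separated by spacers.
example = "AAGC" + "TG" * 8 + "CCGT" + "CA" * 6 + "GG"
repeats = find_tg_ca_repeats(example)
```

    Coordinates are 0-based with an exclusive end, following Python slicing, so `example[start:end]` recovers each repeat run.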

    GIFtS: annotation landscape analysis with GeneCards

    BACKGROUND: Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent bioinformatics tool. GeneCards® is a gene-centric compendium of rich annotative information for over 50,000 human gene entries, building upon 68 data sources, including Gene Ontology (GO), pathways, interactions, phenotypes, publications and many more. RESULTS: We present the GeneCards Inferred Functionality Score (GIFtS) which allows a quantitative assessment of a gene's annotation status, by exploiting the unique wealth and diversity of GeneCards information. The GIFtS tool, linked from the GeneCards home page, facilitates browsing the human genome by searching for the annotation level of a specified gene, retrieving a list of genes within a specified range of GIFtS values, obtaining random genes with a specific GIFtS value, and experimenting with the GIFtS weighting algorithm for a variety of annotation categories. The bimodal shape of the GIFtS distribution suggests a division of the human gene repertoire into two main groups: the high-GIFtS peak consists almost entirely of protein-coding genes; the low-GIFtS peak consists of genes from all of the categories. Cluster analysis of GIFtS annotation vectors provides the classification of gene groups by detailed positioning in the annotation arena. GIFtS also provides measures which enable the evaluation of the databases that serve as GeneCards sources. An inverse correlation is found (for GIFtS>25) between the number of genes annotated by each source, and the average GIFtS value of genes associated with that source. Three typical source prototypes are revealed by their GIFtS distribution: genome-wide sources, sources comprising mainly highly annotated genes, and sources comprising mainly poorly annotated genes. The degree of accumulated knowledge for a given gene measured by GIFtS was correlated (for GIFtS>30) with the number of publications for a gene, and with the seniority of this entry in the HGNC database. CONCLUSION: GIFtS can be a valuable tool for computational procedures which analyze lists of large sets of genes resulting from wet-lab or computational research. GIFtS may also assist the scientific community with identification of groups of uncharacterized genes for diverse applications, such as delineation of novel functions and charting unexplored areas of the human genome

    Hypomethylation and aberrant expression of the glioma pathogenesis-related 1 gene in Wilms tumors

    Wilms tumors (WTs) have a complex etiology, displaying genetic and epigenetic changes, including loss of imprinting (LOI) and tumor suppressor gene silencing. To identify new regions of epigenetic perturbation in WTs, we screened kidney and tumor DNA using CpG island (CGI) tags associated with cancer-specific DNA methylation changes. One such tag corresponded to a paralog of the glioma pathogenesis-related 1/related to testis-specific, vespid, and pathogenesis proteins 1 (GLIPR1/RTVP-1) gene, previously reported to be a tumor-suppressor gene silenced by hypermethylation in prostate cancer. Here we report methylation analysis of the GLIPR1/RTVP-1 gene in WTs and normal fetal and pediatric kidneys. Hypomethylation of the GLIPR1/RTVP-1 5′-region in WTs relative to normal tissue is observed in 21/24 (87.5%) of WTs analyzed. Quantitative analysis of GLIPR1/RTVP-1 expression in 24 WTs showed elevated transcript levels in 16/24 WTs (67%), with 12 WTs displaying in excess of 20-fold overexpression relative to fetal kidney (FK) control samples. Immunohistochemical analysis of FK and WT corroborates the RNA expression data and reveals high GLIPR1/RTVP-1 in WT blastemal cells together with variable levels in stromal and epithelial components. Hypomethylation is also evident in the WT precursor lesions and nephrogenic rests (NRs), supporting a role for GLIPR1/RTVP-1 deregulation early in Wilms tumorigenesis. Our data show that, in addition to gene dosage changes arising from LOI and hypermethylation-induced gene silencing, gene activation resulting from hypomethylation is also prevalent in WTs. Copyright © 2007 Neoplasia Press, Inc. All rights reserved

    Gradual transition from mosaic to global DNA methylation patterns during deuterostome evolution

    BACKGROUND: DNA methylation by the Dnmt family occurs in vertebrates and invertebrates, including ascidians, and is thought to play important roles in gene regulation and genome stability, especially in vertebrates. However, the global methylation patterns of vertebrates and invertebrates are distinctive. Whereas almost all CpG sites are methylated in vertebrates, with the exception of those in CpG islands, the ascidian genome contains approximately equal amounts of methylated and unmethylated regions. Curiously, methylation status can be reliably estimated from the local frequency of CpG dinucleotides in the ascidian genome. Methylated and unmethylated regions tend to have few and many CpG sites, respectively, consistent with our knowledge of the methylation status of CpG islands and other regions in mammals. However, DNA methylation patterns and levels in vertebrates and invertebrates have not been analyzed in the same way. RESULTS: Using a new computational methodology based on the decomposition of the bimodal distributions of methylated and unmethylated regions, we estimated the extent of the global methylation patterns in a wide range of animals. We then examined the epigenetic changes in silico along the phylogenetic tree. We observed a gradual transition from fractional to global patterns of methylation in deuterostomes, rather than a clear demarcation between vertebrates and invertebrates. When we applied this methodology to six piscine genomes, some of them showed features similar to those of invertebrates. CONCLUSIONS: The mammalian global DNA methylation pattern was probably not acquired at an early stage of vertebrate evolution, but gradually expanded from that of a more ancient organism
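
    The abstract above relies on the local frequency of CpG dinucleotides as a proxy for methylation status. One common way to quantify that frequency is the observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G), computed per window. The sketch below shows that generic statistic only; it is not the authors' decomposition method, and the function names, window size, and step are assumptions chosen for illustration.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio for one sequence window.

    Returns 0.0 when the window has no C or no G, since the expected
    count is then zero and the ratio is undefined.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * n / (c * g)


def windowed_cpg(seq, window=200, step=100):
    """Return (start, ratio) pairs over windows; sizes are illustrative.

    A sequence shorter than `window` yields a single, shorter window.
    """
    last_start = max(len(seq) - window, 0)
    return [
        (i, cpg_obs_exp(seq[i:i + window]))
        for i in range(0, last_start + 1, step)
    ]


# Toy input: a CpG-dense stretch followed by a CpG-free stretch.
profile = windowed_cpg("CG" * 4 + "AT" * 4, window=8, step=8)
```

    In a real genome, a histogram of these per-window ratios is the kind of distribution whose bimodality the abstract refers to: CpG-poor windows would correspond to regions depleted by methylation, CpG-rich windows to unmethylated regions.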

    Novel retrotransposed imprinted locus identified at human 6p25

    Differentially methylated regions (DMRs) are stable epigenetic features within or in proximity to imprinted genes. We used this feature to identify candidate human imprinted loci by quantitative DNA methylation analysis. We discovered a unique DMR at the 5′-end of FAM50B at 6p25.2. We determined that sense transcripts originating from the FAM50B locus are expressed from the paternal allele in all human tissues investigated except for ovary, in which expression is biallelic. Furthermore, an antisense transcript, FAM50B-AS, was identified to be monoallelically expressed from the paternal allele in a variety of tissues. Comparative phylogenetic analysis showed that FAM50B orthologs are absent in chicken and platypus, but are present and biallelically expressed in opossum and mouse. These findings indicate that FAM50B originated in Therians after divergence from Prototherians via retrotransposition of a gene on the X chromosome. Moreover, our data are consistent with acquisition of imprinting during Eutherian evolution after divergence of Glires from the Euarchonta mammals. FAM50B expression is deregulated in testicular germ cell tumors, and loss of imprinting occurs frequently in testicular seminomas, suggesting an important role for FAM50B in spermatogenesis and tumorigenesis. These results also underscore the importance of accounting for parental origin in understanding the mechanism of 6p25-related diseases

    Mammalian Small Nucleolar RNAs Are Mobile Genetic Elements

    Small nucleolar RNAs (snoRNAs) of the H/ACA box and C/D box categories guide the pseudouridylation and the 2′-O-ribose methylation of ribosomal RNAs by forming short duplexes with their target. Similarly, small Cajal body–specific RNAs (scaRNAs) guide modifications of spliceosomal RNAs. The vast majority of vertebrate sno/scaRNAs are located in introns of genes transcribed by RNA polymerase II and processed by exonucleolytic trimming after splicing. A bioinformatic search for orthologues of human sno/scaRNAs in sequenced mammalian genomes reveals the presence of species- or lineage-specific sno/scaRNA retroposons (sno/scaRTs) characterized by an A-rich tail and an ∼14-bp target site duplication that corresponds to their insertion site, as determined by interspecific genomic alignments. Three classes of snoRTs are defined based on the extent of intron and exon sequences from the snoRNA parental host gene they contain. SnoRTs frequently insert in gene introns in the sense orientation at genomic hot spots shared with other genetic mobile elements. Previously characterized human snoRNAs are encoded in retroposons whose parental copies can be identified by phylogenic analysis, showing that snoRTs can be faithfully processed. These results identify snoRNAs as a new family of mobile genetic elements. The insertion of new snoRNA copies might constitute a safeguard mechanism by which the biological activity of snoRNAs is maintained in spite of the risk of mutations in the parental copy. I furthermore propose that retroposition followed by genetic drift is a mechanism that increased snoRNA diversity during vertebrate evolution to eventually acquire new RNA-modification functions

    Smchd1-Dependent and -Independent Pathways Determine Developmental Dynamics of CpG Island Methylation on the Inactive X Chromosome

    X chromosome inactivation involves multiple levels of chromatin modification, established progressively and in a stepwise manner during early development. The chromosomal protein Smchd1 was recently shown to play an important role in DNA methylation of CpG islands (CGIs), a late step in the X inactivation pathway that is required for long-term maintenance of gene silencing. Here we show that inactive X chromosome (Xi) CGI methylation can occur via either Smchd1-dependent or -independent pathways. Smchd1-dependent CGI methylation, the primary pathway, is acquired gradually over an extended period, whereas Smchd1-independent CGI methylation occurs rapidly after the onset of X inactivation. The de novo methyltransferase Dnmt3b is required for methylation of both classes of CGI, whereas Dnmt3a and Dnmt3L are dispensable. Xi CGIs methylated by these distinct pathways differ with respect to their sequence characteristics and immediate chromosomal environment. We discuss the implications of these results for understanding CGI methylation during development

    Monoallelic Expression of Multiple Genes in the CNS

    The inheritance pattern of a number of major genetic disorders suggests the possible involvement of genes that are expressed from one allele and silent on the other, but such genes are difficult to detect. Since DNA methylation in regulatory regions is often a mark of gene silencing, we modified existing microarray-based assays to detect both methylated and unmethylated DNA sequences in the same sample, a variation we term the MAUD assay. We probed a 65 Mb region of mouse Chr 7 for gene-associated sequences that show two distinct DNA methylation patterns in the mouse CNS. Selected genes were then tested for allele-specific expression in clonal neural stem cell lines derived from reciprocal F1 (C57BL/6×JF1) hybrid mice. In addition, using a separate approach, we directly analyzed allele-specific expression of a group of genes interspersed within clusters of OlfR genes, since the latter are subject to allelic exclusion. Altogether, of the 500 known genes in the chromosomal region surveyed, five show monoallelic expression, four identified by the MAUD assay (Agc1, p (pink-eyed dilution), P4ha3 and Thrsp), and one by its proximity to OlfR genes (Trim12). Thrsp (thyroid hormone responsive SPOT14 homolog) is expressed in hippocampus, but the human protein homolog, S14, has also been implicated in aggressive breast cancer. Monoallelic expression of the five genes is not coordinated at a chromosome-wide level, but rather regulated at individual loci. Taken together, our results suggest that at least 1% of previously untested genes are subject to allelic exclusion, and demonstrate a dual approach to expedite their identification

    Epigenetic mechanisms in mammals
