67 research outputs found
Protein annotation as term categorization in the gene ontology using word proximity networks
We addressed BioCreAtIvE Task 2, the problem of annotation of a protein with a node in the Gene Ontology (GO). We approached the task as a problem of categorizing terms derived from the document neighborhood of the given protein in the given document into nodes in the GO based on the lexical overlaps with terms on GO nodes and terms identified as related to those nodes. The system incorporates NLP components such as a morphological normalizer, a named entity recognizer, a statistical term frequency analyzer, and an unsupervised method for expanding words associated with GO ids based on a probability measure that captures word proximity (Rocha, 2002). The categorization methodology uses our novel Gene Ontology Categorizer (GOC) methodology (Joslyn et al. 2004) to select GO nodes as cluster heads for the terms in the input set based on the structure of the GO. Pre-processing: Swiss-Prot and TrEMBL IDs were provided as input identifiers for the protein, so we needed to establish a set of names by which that protein could be referenced in the text. We made use of both the gene name and protein names that are in Swiss-Prot itself, when available, and a collection of synonyms constructed by Procter & Gamble Company. The fallback case was to us
Repression of Germline Genes in Caenorhabditis elegans Somatic Tissues by H3K9 Dimethylation of Their Promoters
Repression of germline-promoting genes in somatic cells is critical for somatic development and function. To study how germline genes are repressed in somatic tissues, we analyzed key histone modifications in three Caenorhabditis elegans synMuv B mutants, lin-15B, lin-35, and lin-37, all of which display ectopic expression of germline genes in the soma. LIN-35 and LIN-37 are members of the conserved DREAM complex. LIN-15B has been proposed to work with the DREAM complex but has not been shown biochemically to be a member of the complex. We found that, in wild-type worms, synMuv B target genes and germline genes are enriched for the repressive histone modification dimethylation of histone H3 on lysine 9 (H3K9me2) at their promoters. Genes with H3K9me2 promoter localization are evenly distributed across the autosomes, not biased toward autosomal arms, as are the broad H3K9me2 domains. Both synMuv B targets and germline genes display a dramatic reduction of H3K9me2 promoter localization in lin-15B mutants, but much weaker reduction in lin-35 and lin-37 mutants. This difference between lin-15B and DREAM complex mutants likely represents a difference in molecular function for these synMuv B proteins. In support of the pivotal role of H3K9me2 in regulation of germline genes by LIN-15B, global loss of H3K9me2 but not H3K9me3 results in phenotypes similar to synMuv B mutants, high-temperature larval arrest, and ectopic expression of germline genes in the soma. We propose that LIN-15B-driven enrichment of H3K9me2 at promoters of germline genes contributes to repression of those genes in somatic tissues
Trans-generational epigenetic regulation of C. elegans primordial germ cells
Background: The processes through which the germline maintains its continuity across generations have long been the focus of biological research. Recent studies have suggested that germline continuity can involve epigenetic regulation, including regulation of histone modifications. However, it is not clear how histone modifications generated in one generation can influence the transcription program and development of germ cells of the next. Results: We show that the histone H3K36 methyltransferase maternal effect sterile (MES)-4 is an epigenetic modifier that prevents aberrant transcription activity in Caenorhabditis elegans primordial germ cells (PGCs). In mes-4 mutant PGCs, RNA Pol II activation is abnormally regulated and the PGCs degenerate. Genetic and genomewide analyses of MES-4-mediated H3K36 methylation suggest that MES-4 activity can operate independently of ongoing transcription, and may be predominantly responsible for maintenance methylation of H3K36 in germline-expressed loci. Conclusions: Our data suggest a model in which MES-4 helps to maintain an 'epigenetic memory' of transcription that occurred in germ cells of previous generations, and that MES-4 and its epigenetic product are essential for normal germ cell development.
The Histone H3K36 Methyltransferase MES-4 Acts Epigenetically to Transmit the Memory of Germline Gene Expression to Progeny
Methylation of histone H3K36 in higher eukaryotes is mediated by multiple methyltransferases. Set2-related H3K36 methyltransferases are targeted to genes by association with RNA Polymerase II and are involved in preventing aberrant transcription initiation within the body of genes. The targeting and roles of the NSD family of mammalian H3K36 methyltransferases, known to be involved in human developmental disorders and oncogenesis, are not known. We used genome-wide chromatin immunoprecipitation (ChIP) to investigate the targeting and roles of the Caenorhabditis elegans NSD homolog MES-4, which is maternally provided to progeny and is required for the survival of nascent germ cells. ChIP analysis in early C. elegans embryos revealed that, consistent with immunostaining results, MES-4 binding sites are concentrated on the autosomes and the leftmost ~2% (300 kb) of the X chromosome. MES-4 overlies the coding regions of approximately 5,000 genes, with a modest elevation in the 5′ regions of gene bodies. Although MES-4 is generally found over Pol II-bound genes, analysis of gene sets with different temporal-spatial patterns of expression revealed that Pol II association with genes is neither necessary nor sufficient to recruit MES-4. In early embryos, MES-4 associates with genes that were previously expressed in the maternal germ line, an interaction that does not require continued association of Pol II with those loci. Conversely, Pol II association with genes newly expressed in embryos does not lead to recruitment of MES-4 to those genes. These and other findings suggest that MES-4, and perhaps the related mammalian NSD proteins, provide an epigenetic function for H3K36 methylation that is novel and likely to be unrelated to ongoing transcription. We propose that MES-4 transmits the memory of gene expression in the parental germ line to offspring and that this memory role is critical for the PGCs to execute a proper germline program
Singular value decomposition and density estimation for filtering and analysis of gene expression
We present three algorithms for gene expression analysis. Algorithm 1, known as the serial correlation test, is used for filtering out noisy gene expression profiles. Algorithms 2 and 3 project the gene expression profiles into 2-dimensional expression subspaces identified by Singular Value Decomposition. Density estimates are used to determine expression profiles that have a high correlation with the subspace and low levels of noise. High density regions in the projection, clusters of co-expressed genes, are identified. We illustrate the algorithms by application to the yeast cell-cycle data by Cho et al. and comparison of the results
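The abstract does not give the test statistic behind the serial correlation filter, so the following is only a minimal sketch of the general idea, assuming a lag-1 autocorrelation score and an arbitrary cutoff. The function names, the threshold of 0.3, and the toy profiles are all hypothetical and are not taken from the paper.

```python
from statistics import fmean


def lag1_autocorrelation(profile):
    """Return the lag-1 serial correlation of a time-ordered expression profile."""
    mean = fmean(profile)
    deviations = [x - mean for x in profile]
    denom = sum(d * d for d in deviations)
    if denom == 0:
        return 0.0  # constant profile: treat it as carrying no signal
    # Sum of products of each deviation with its successor in time.
    numer = sum(a * b for a, b in zip(deviations, deviations[1:]))
    return numer / denom


def filter_noisy_profiles(profiles, threshold=0.3):
    """Keep only profiles whose lag-1 autocorrelation exceeds the (assumed) threshold."""
    return {
        gene: values
        for gene, values in profiles.items()
        if lag1_autocorrelation(values) > threshold
    }


# Toy data (hypothetical): one smoothly varying profile and one that flips sign
# at every time point, which behaves like noise under this score.
profiles = {
    "smooth_gene": [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0],
    "noisy_gene": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
kept = filter_noisy_profiles(profiles)
```

A profile that changes smoothly over time has neighbouring measurements that deviate from the mean in the same direction, giving a positive score, whereas uncorrelated noise scores near zero or below. A profile that passes such a filter would then be a candidate for the SVD projection step the abstract describes.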
H4K20me1 Contributes to Downregulation of X-Linked Genes for C. elegans Dosage Compensation
The Caenorhabditis elegans dosage compensation complex (DCC) equalizes X-chromosome gene dosage between XO males and XX hermaphrodites by two-fold repression of X-linked gene expression in hermaphrodites. The DCC localizes to the X chromosomes in hermaphrodites but not in males, and some subunits form a complex homologous to condensin. The mechanism by which the DCC downregulates gene expression remains unclear. Here we show that the DCC controls the methylation state of lysine 20 of histone H4, leading to higher H4K20me1 and lower H4K20me3 levels on the X chromosomes of XX hermaphrodites relative to autosomes. We identify the PR-SET7 ortholog SET-1 and the Suv4-20 ortholog SET-4 as the major histone methyltransferases for monomethylation and di/trimethylation of H4K20, respectively, and provide evidence that X-chromosome enrichment of H4K20me1 involves inhibition of SET-4 activity on the X. RNAi knockdown of set-1 results in synthetic lethality with dosage compensation mutants and upregulation of X-linked gene expression, supporting a model whereby H4K20me1 functions with the condensin-like C. elegans DCC to repress transcription of X-linked genes. H4K20me1 is important for mitotic chromosome condensation in mammals, suggesting that increased H4K20me1 on the X may restrict access of the transcription machinery to X-linked genes via chromatin compaction
An inverse relationship to germline transcription defines centromeric chromatin in C. elegans
Centromeres are chromosomal loci that direct segregation of the genome during cell division. The histone H3 variant CENP-A (also known as CenH3) defines centromeres in monocentric organisms, which confine centromere activity to a discrete chromosomal region, and holocentric organisms, which distribute centromere activity along the chromosome length [1-3]. Because the highly repetitive DNA found at most centromeres is neither necessary nor sufficient for centromere function, stable inheritance of CENP-A nucleosomal chromatin is postulated to epigenetically propagate centromere identity [4]. Here, we show that in the holocentric nematode Caenorhabditis elegans pre-existing CENP-A nucleosomes are not necessary to guide recruitment of new CENP-A nucleosomes. This is indicated by lack of CENP-A transmission by sperm during fertilization and by removal and subsequent reloading of CENP-A during oogenic meiotic prophase. Genome-wide mapping of CENP-A location in embryos and quantification of CENP-A molecules in nuclei revealed that CENP-A is incorporated at low density in domains that cumulatively encompass half the genome. Embryonic CENP-A domains are established in a pattern inverse to regions that are transcribed in the germline and early embryo, and ectopic transcription of genes in a mutant germline altered the pattern of CENP-A incorporation in embryos. Furthermore, regions transcribed in the germline but not embryos fail to incorporate CENP-A throughout embryogenesis. We propose that germline transcription defines genomic regions that exclude CENP-A incorporation in progeny, and that zygotic transcription during early embryogenesis remodels and reinforces this basal pattern. These findings link centromere identity to transcription and shed light on the evolutionary plasticity of centromeres
An assessment of histone-modification antibody quality
We have tested the specificity and utility of more than 200 antibodies raised against 57 different histone modifications in Drosophila melanogaster, Caenorhabditis elegans and human cells. Although most antibodies performed well, more than 25% failed specificity tests by dot blot or western blot. Among specific antibodies, more than 20% failed in chromatin immunoprecipitation experiments. We advise rigorous testing of histone-modification antibodies before use, and we provide a website for posting new test results (http://compbio.med.harvard.edu/antibodies/)