38 research outputs found

    Tissue enrichment analysis for C. elegans genomics

    Background: Over the last ten years, there has been explosive development in methods for measuring gene expression. These methods can identify thousands of genes altered between conditions, but understanding these datasets and forming hypotheses based on them remains challenging. One way to analyze these datasets is to associate ontologies (hierarchical, descriptive vocabularies with controlled relations between terms) with genes and to look for enrichment of specific terms. Although Gene Ontology (GO) is available for Caenorhabditis elegans, it does not include anatomical information. Results: We have developed a tool for identifying enrichment of C. elegans tissues among gene sets and built a web interface where users can access this tool. Since a common drawback of ontology enrichment analyses is their verbosity, we developed a very simple filtering algorithm to reduce the ontology size by an order of magnitude. We adjusted these filters and validated our tool using a set of 30 gold standards from Expression Cluster data in WormBase. We show that our tool can discriminate between embryonic and larval tissues and can identify tissues down to the single-cell level. We used our tool to identify multiple neuronal tissues that are down-regulated due to pathogen infection in C. elegans. Conclusions: Our Tissue Enrichment Analysis (TEA) can be found within WormBase, and can be downloaded using Python’s standard pip installer. It tests a slimmed-down C. elegans tissue ontology for enrichment of specific terms and provides users with a text and graphic representation of the results.

    Two new functions in the WormBase Enrichment Suite

    Genome-wide experiments routinely generate large amounts of data that can be hard to interpret biologically. A common approach to interpreting these results is to employ enrichment analyses of controlled languages, known as ontologies, that describe various biological parameters such as gene molecular or biological function. In C. elegans, three distinct ontologies, the Gene Ontology (GO), Anatomy Ontology (AO), and the Worm Phenotype Ontology (WPO), are used to annotate gene function, expression and phenotype, respectively (Ashburner et al. 2000; Lee and Sternberg, 2003; Schindelman et al. 2011). Previously, we developed software to test datasets for enrichment of anatomical terms, called the Tissue Enrichment Analysis (TEA) tool (Angeles-Albores and Sternberg, 2016). Using the same hypergeometric statistical method, we extend enrichment testing to include WPO and GO, offering a unified approach to enrichment testing in C. elegans. The WormBase Enrichment Suite can be accessed via a user-friendly interface at http://www.wormbase.org/tools/enrichment/tea/tea.cgi. To validate the tools, we analyzed a previously published extracellular vesicle (EV)-releasing neuron (EVN) signature gene set derived from dissociated ciliated EV neurons (Wang et al. 2015) using the WormBase Enrichment Suite based on the WS262 WormBase release. TEA correctly identified the CEM, hook sensillum and IL2 neuron as enriched tissues. The top phenotype associated with the EVN signature was chemosensory behavior. Gene Ontology enrichment analysis showed that cell projection and cell body were the most enriched cellular components in this gene set, with the biological processes neuropeptide signaling pathway and vesicle localization ranked lower. The tutorial script used to generate the figure above can be viewed at: https://github.com/dangeles/TissueEnrichmentAnalysis/blob/master/tutorial/Tutorial.ipynb The addition of Gene Enrichment Analysis (GEA) and Phenotype Enrichment Analysis (PEA) to WormBase marks an important step towards a unified set of analyses that can help researchers to understand genomic datasets. These enrichment analyses will allow the community to fully benefit from the data curation ongoing at WormBase.

    WormBase 2017: Molting into a new stage


    Interactome analysis of Caenorhabditis elegans synapses by TurboID-based proximity labeling

    Proximity labeling provides a powerful in vivo tool to characterize the proteome of subcellular structures and the interactome of specific proteins. The nematode Caenorhabditis elegans is one of the most intensely studied organisms in biology, offering many advantages for biochemistry. Using the highly active biotin ligase TurboID, we optimize here a proximity labeling protocol for C. elegans. An advantage of TurboID is that biotin's high affinity for streptavidin means biotin-labeled proteins can be affinity-purified under harsh denaturing conditions. By combining extensive sonication with aggressive denaturation using SDS and urea, we achieved near-complete solubilization of worm proteins. We then used this protocol to characterize the proteomes of the worm gut, muscle, skin, and nervous system. Neurons are among the smallest C. elegans cells. To probe the method's sensitivity, we expressed TurboID exclusively in the two AFD neurons and showed that the protocol could identify known and previously unknown proteins expressed selectively in AFD. The active zones of synapses are composed of a protein matrix that is difficult to solubilize and purify. To test if our protocol could solubilize active zone proteins, we knocked TurboID into the endogenous elks-1 gene, which encodes a presynaptic active zone protein. We identified many known ELKS-1-interacting active zone proteins, as well as previously uncharacterized synaptic proteins. Versatile vectors and the inherent advantages of using C. elegans, including fast growth and the ability to rapidly make and functionally test knock-ins, make proximity labeling a valuable addition to the armory of this model organism

    Predicting gene essentiality in Caenorhabditis elegans by feature engineering and machine-learning

    Defining genes that are essential for life has major implications for understanding critical biological processes and mechanisms. Although essential genes have been identified and characterised experimentally using functional genomic tools, it is challenging to predict with confidence such genes from molecular and phenomic data sets using computational methods. Using extensive data sets available for the model organism Caenorhabditis elegans, we constructed here a machine-learning (ML)-based workflow for the prediction of essential genes on a genome-wide scale. We identified strong predictors for such genes and showed that trained ML models consistently achieve highly accurate classifications. Complementary analyses revealed an association between essential genes and chromosomal location. Our findings reveal that essential genes in C. elegans tend to be located in or near the centre of autosomal chromosomes; are positively correlated with low single nucleotide polymorphism (SNP) densities and epigenetic markers in promoter regions; are involved in protein and nucleotide processing; are transcribed in most cells; are enriched in reproductive tissues or are targets for small RNAs bound to the Argonaute CSR-1. Based on these results, we hypothesise an interplay between epigenetic markers and small RNA pathways in the germline, with transcription-based memory; this hypothesis warrants testing. From a technical perspective, further work is needed to evaluate whether the present ML-based approach will be applicable to other metazoans (including Drosophila melanogaster) for which comprehensive data sets (i.e. genomic, transcriptomic, proteomic, variomic, epigenetic and phenomic) are available.

    Reconstructing a metazoan genetic pathway with transcriptome-wide epistasis measurements

    RNA-sequencing (RNA-seq) is commonly used to identify genetic modules that respond to perturbations. In single cells, transcriptomes have been used as phenotypes, but this concept has not been applied to whole-organism RNA-seq. Also, quantifying and interpreting epistatic effects using expression profiles remains a challenge. We developed a single coefficient to quantify transcriptome-wide epistasis that reflects the underlying interactions and which can be interpreted intuitively. To demonstrate our approach, we sequenced four single and two double mutants of Caenorhabditis elegans. From these mutants, we reconstructed the known hypoxia pathway. In addition, we uncovered a class of 56 genes with HIF-1–dependent expression that have opposite changes in expression in mutants of two genes that cooperate to negatively regulate HIF-1 abundance; however, the double mutant of these genes exhibits suppression epistasis. This class violates the classical model of HIF-1 regulation but can be explained by postulating a role of hydroxylated HIF-1 in transcriptional control

    Identification of putative reader proteins of 5-methylcytosine and its derivatives in Caenorhabditis elegans RNA [version 1; peer review: 1 approved, 2 approved with reservations]

    Background: Methylation of carbon-5 of cytosines (m5C) is a conserved post-transcriptional nucleotide modification of RNA with widespread distribution across organisms. It can be further modified to yield 5-hydroxymethylcytidine (hm5C), 5-formylcytidine (f5C), 2´-O-methyl-5-hydroxymethylcytidine (hm5Cm) and 2´-O-methyl-5-formylcytidine (f5Cm). How m5C, and especially its derivatives, contribute to biology mechanistically is poorly understood. We recently showed that m5C is required for Caenorhabditis elegans development and fertility under heat stress. m5C has been shown to participate in mRNA transport and maintain mRNA stability through its recognition by the reader proteins ALYREF and YBX1, respectively. Hence, identifying readers for RNA modifications can enhance our understanding of the biological roles of these modifications. Methods: To contribute to the understanding of how m5C and its oxidative derivatives mediate their functions, we developed RNA baits bearing modified cytosines in diverse structural contexts to pull down potential readers in C. elegans. Potential readers were identified using mass spectrometry. The interaction of two of the putative readers with m5C was validated using immunoblotting. Results: Our mass spectrometry analyses revealed unique binding proteins for each of the modifications. In silico analysis for phenotype enrichments suggested that hm5Cm unique readers are enriched in proteins involved in RNA processing, while readers for m5C, hm5C and f5C are involved in germline processes. We validated our dataset by demonstrating that the nematode ALYREF homologues ALY-1 and ALY-2 preferentially bind m5C in vitro. Finally, sequence alignment analysis showed that several of the putative m5C readers contain the conserved RNA recognition motif (RRM), including ALY-1 and ALY-2. Conclusions: The dataset presented here serves as an important scientific resource that will support the discovery of new functions of m5C and its derivatives. Furthermore, we demonstrate that ALY-1 and ALY-2 bind to m5C in C. elegans.

    Transcriptomic, Functional, and Network Analyses Reveal Novel Genes Involved in the Interaction between Caenorhabditis elegans and Stenotrophomonas maltophilia

    The bacterivorous nematode Caenorhabditis elegans is an excellent model for the study of innate immune responses to a variety of bacterial pathogens, including the emerging nosocomial bacterial pathogen Stenotrophomonas maltophilia. The study of this interaction has ecological and medical relevance as S. maltophilia is found in association with C. elegans and other nematodes in the wild and is an emerging opportunistic bacterial pathogen. We identified 393 genes that were differentially expressed upon exposure to virulent and avirulent strains of S. maltophilia and an avirulent strain of E. coli. We then used a probabilistic functional gene network model (WormNet) to determine that 118 of the 393 differentially expressed genes formed an interacting network and identified a set of highly connected genes with eight or more predicted interactions. We hypothesized that these highly connected genes might play an important role in the defense against S. maltophilia and found that mutations of six of seven highly connected genes have a significant effect on nematode survival in response to these bacteria. Of these genes, C48B4.1, mpk-2, cpr-4, clec-67, and lys-6 are needed for combating the virulent S. maltophilia JCMS strain, while dod-22 was solely involved in response to the avirulent S. maltophilia K279a strain. We further found that dod-22 and clec-67 were upregulated in response to JCMS vs. K279a, while C48B4.1, mpk-2, cpr-4, and lys-6 were downregulated. Only dod-22 had a documented role in innate immunity, which demonstrates the merit of our approach in the identification of novel genes that are involved in combating S. maltophilia infection.
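
The step of selecting highly connected genes (eight or more predicted interactions) can be sketched as a simple degree filter over an edge list. This is an illustration only, not the authors' pipeline and not a WormNet interface; the function name, the edge-list format, and the toy gene identifiers are hypothetical.

```python
from collections import Counter


def highly_connected_genes(edges, min_degree=8):
    """Return genes with at least `min_degree` distinct interaction partners.

    `edges` is an iterable of (gene_a, gene_b) pairs. Duplicate pairs,
    reversed pairs, and self-interactions are counted at most once or ignored.
    """
    # Treat the network as undirected: (a, b) and (b, a) are the same edge.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for gene in edge:
            degree[gene] += 1
    return sorted(gene for gene, count in degree.items() if count >= min_degree)


# Toy network: "hub_gene" has eight partners, "minor_gene" has three.
toy_edges = [("hub_gene", f"partner{i}") for i in range(8)]
toy_edges += [("minor_gene", f"partner{i}") for i in range(3)]
toy_edges.append(("partner0", "hub_gene"))  # reversed duplicate, counted once
hubs = highly_connected_genes(toy_edges)
```

The threshold of eight comes from the abstract; how predicted interactions were scored or filtered before counting is not stated there, so the sketch counts every listed edge equally.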