13 research outputs found

    A novel hypothesis-unbiased method for gene ontology enrichment based on transcriptome data

    Gene Ontology (GO) classification of statistically significantly differentially expressed genes is commonly used to interpret transcriptomics data as part of functional genomic analysis. In this approach, all significantly expressed genes contribute equally to the final GO classification regardless of their actual expression levels. Gene expression levels can significantly affect protein production and hence should be reflected in GO term enrichment. Genes with low expression levels can also participate in GO term enrichment through cumulative effects. In this report, we introduce a new GO enrichment method, suitable for multiple samples and time series experiments, that uses a statistical outlier test to detect GO categories with special patterns of variation that can potentially identify candidate biological mechanisms. To demonstrate the value of our approach, we performed two case studies. Whole transcriptome expression profiles of Salmonella enteritidis and Alzheimer's disease (AD) were analysed in order to determine GO term enrichment across the entire transcriptome instead of the subset of differentially expressed genes used in traditional GO analysis. Our results highlight the key role of inflammation-related functional groups in AD pathology, as granulocyte colony-stimulating factor receptor binding, neuromedin U binding, and interleukin were remarkably upregulated in AD brain when using all of the gene expression data in the transcriptome. Mitochondrial components and the molybdopterin synthase complex were identified as potential key cellular components involved in AD pathology.
    Mario Fruzangohar, Esmaeil Ebrahimie, David L. Adelso
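
    The abstract does not say which outlier test is used or how expression is aggregated per GO term, so the following is only a minimal sketch of the general idea under stated assumptions: expression is summed over all genes annotated to a term, and terms whose change between samples falls outside Tukey's fences are flagged. The function names, input shapes, and the choice of Tukey's fences are all illustrative assumptions, not the authors' method.

```python
from statistics import quantiles


def go_term_totals(expression, annotations):
    """Sum the expression of every gene annotated to each GO term.

    expression:  {gene_id: expression_level}      (hypothetical input shape)
    annotations: {gene_id: iterable of GO terms}  (hypothetical input shape)
    """
    totals = {}
    for gene, level in expression.items():
        for term in annotations.get(gene, ()):
            totals[term] = totals.get(term, 0.0) + level
    return totals


def outlier_terms(changes, k=1.5):
    """Return GO terms whose change lies outside Tukey's fences.

    changes: {go_term: change between two samples, e.g. a log2 ratio}
    Tukey's fences stand in here for the unspecified outlier test.
    """
    q1, _, q3 = quantiles(list(changes.values()), n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return sorted(t for t, c in changes.items() if c < low or c > high)
```

    Because every annotated gene contributes its expression level, weakly expressed genes can still move a term through their cumulative effect, which is the property the abstract emphasises.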

    Improved part-of-speech prediction in suffix analysis

    Motivation: Predicting the part of speech (POS) tag of an unknown word in a sentence is a significant challenge. This is particularly difficult in biomedicine, where POS tags serve as an input to training sophisticated literature summarization techniques, such as those based on Hidden Markov Models (HMM). Different approaches have been taken to deal with the POS tagger challenge, but with one exception, the TnT POS tagger, previous publications on POS tagging have omitted details of the suffix analysis used for handling unknown words. The suffix of an English word is a strong predictor of a POS tag for that word. As a prerequisite for an accurate HMM POS tagger for biomedical publications, we present an efficient suffix prediction method for integration into a POS tagger.
    Results: We have implemented a fully functional HMM POS tagger using experimentally optimised suffix-based prediction. Our simple suffix analysis method significantly outperformed the probability interpolation-based TnT method. We have also shown how important suffix analysis can be for probability estimation of a known word (in the training corpus) with an unseen POS tag, a common scenario with a small training corpus. We then integrated this simple method into our POS tagger and determined an optimised parameter set for both methods, which can help developers to optimise their current algorithms based on our results. We also introduce the concept of counting methods in maximum likelihood estimation for the first time and show how counting methods can affect the prediction result. Finally, we describe how machine-learning techniques were applied to identify words for which prediction of the POS tag was always incorrect, and propose a method to handle words of this type.
    Availability and implementation: Java source code, binaries and setup instructions are freely available at http://genomes.sapac.edu.au/text_mining/pos_tagger.zip.
    Mario Fruzangohar, Trent A. Kroeger, David L. Adelso
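
    To illustrate why a word's suffix is a useful predictor of its POS tag, here is a minimal sketch of suffix-based prediction by maximum likelihood counting with longest-suffix backoff. It is not the authors' Java implementation, nor the TnT interpolation scheme; the function names, the maximum suffix length of 3, and the default tag are assumptions chosen for the example.

```python
from collections import Counter, defaultdict


def train_suffix_model(tagged_words, max_len=3):
    """Count how often each word-final suffix (1..max_len chars) occurs with each tag."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        w = word.lower()
        for n in range(1, min(max_len, len(w)) + 1):
            counts[w[-n:]][tag] += 1
    return counts


def predict_tag(word, counts, max_len=3, default="NN"):
    """Return the most frequent tag for the longest suffix seen in training.

    Falls back to shorter suffixes, then to `default` (an assumed fallback tag).
    """
    w = word.lower()
    for n in range(min(max_len, len(w)), 0, -1):
        tags = counts.get(w[-n:])
        if tags:
            return tags.most_common(1)[0][0]
    return default


# Tiny hand-made training set, for illustration only.
training = [
    ("running", "VBG"), ("jumping", "VBG"),
    ("quickly", "RB"), ("happily", "RB"),
    ("protein", "NN"),
]
model = train_suffix_model(training)
print(predict_tag("binding", model))  # matches the suffix "ing"
print(predict_tag("slowly", model))   # backs off from "wly" to "ly"
```

    How the counts are gathered (for example per token or per distinct word) is one of the choices the abstract refers to as counting methods; this sketch simply counts every training pair once.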

    Gene Ontology-based analysis of zebrafish omics data using the web tool Comparative Gene Ontology

    Gene Ontology (GO) analysis is a powerful tool in systems biology, which uses a defined nomenclature to annotate genes/proteins within three categories: “Molecular Function,” “Biological Process,” and “Cellular Component.” GO analysis can assist in revealing functional mechanisms underlying observed patterns in transcriptomic, genomic, and proteomic data. The already extensive and increasing use of zebrafish for modeling genetic and other diseases highlights the need to develop a GO analytical tool for this organism. The web tool Comparative GO was originally developed for GO analysis of bacterial data in 2013 (www.comparativego.com). We have now upgraded and elaborated this web tool for analysis of zebrafish genetic data using GOs and annotations from the Gene Ontology Consortium.
    Esmaeil Ebrahimie, Mario Fruzangohar, Seyyed Hani Moussavi-Nik, and Morgan Newma

    A transcription factor contributes to pathogenesis and virulence in Streptococcus pneumoniae

    To date, the role of transcription factors (TFs) in the progression of disease for many pathogens is yet to be studied in detail. This is probably due to the transient and generally low expression levels of TFs, which are the central components controlling the expression of many genes during the course of infection. However, a small change in the expression or specificity of a TF can radically alter gene expression. In this study, we combined a number of quality-based selection strategies, including structural prediction of modulated genes, gene ontology and network analysis, to predict the regulatory mechanisms underlying pathogenesis of Streptococcus pneumoniae (the pneumococcus). We have identified two TFs (SP_0676 and SP_0927 [SmrC]) that might control tissue-specific gene expression during pneumococcal translocation from the nasopharynx to lungs, to blood and then to brain of mice. Targeted mutagenesis and mouse models of infection confirmed the role of SP_0927 in pathogenesis and virulence, and suggest that SP_0676 might be essential to pneumococcal viability. These findings provide fundamental new insights into virulence gene expression and regulation during pathogenesis.
    Layla K. Mahdi, Esmaeil Ebrahimie, David L. Adelson, James C. Paton, Abiodun D. Ogunniy

    Comparative GO: a web application for comparative Gene Ontology and Gene Ontology-based gene selection in bacteria

    Extent: 8p. The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers so that they can compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on the Model-View-Controller architecture, and used a specific data structure as well as current and novel algorithms to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets.
    Mario Fruzangohar, Esmaeil Ebrahimie, Abiodun D. Ogunniyi, Layla K. Mahdi, James C. Paton, David L. Adelso

    Transcriptional regulatory network analysis of the over-expressed genes in adipose tissue

    Adipose tissue plays important roles in whole body energy homeostasis and is now known to be a very important and active endocrine organ. The transcriptional regulatory network of adipose tissue metabolism is complex and much is yet to be known. To identify the transcriptional profile in adipose tissue, expressed sequence tag (EST) analysis using Digital Differential Display (DDD) was employed. The results of the EST analysis were re-evaluated against microarray data using COXPRESdb (an available expression data repository). To uncover transcriptional regulatory mechanisms which play key roles in adipose tissue metabolism, transcriptional regulatory network analysis was applied, using the promoter analysis and interaction network toolset. Sixty-five transcripts were found to be more frequent in adipose tissue in comparison to the other tissues. The COXPRESdb results showed that 62% of the genes identified by DDD as over-expressed in adipose tissue had an expression level greater than 1 (in base 2 logarithm). Based on coincidence of regulatory sites, candidate TFs were identified, including TFs previously known to be involved in adipose tissue metabolism (SP1, KROX, STAT1, LRF, VDR, LXR, SRF and HIF1) and TFs, such as CKROX, ZF5, ETF, AP-2, AP-2alpha, PAX-5, SPZ1, RBPJ and CACD, that had not been recognized previously. This work yielded several TF candidates active in adipose tissue metabolism. These findings open a new avenue for future research on promoter occupancy and TF perturbation. © 2013 The Genetics Society of Korea.
    Mohammad Reza Bakhtiarizadeh, Mohammad Moradi-Shahrbabak, Esmaeil Ebrahimi