7 research outputs found

    GO Explorer: A gene-ontology tool to aid in the interpretation of shotgun proteomics data

    BACKGROUND: Spectral counting is a shotgun proteomics approach comprising the identification and relative quantitation of thousands of proteins in complex mixtures. However, this strategy generates bewildering amounts of data whose biological interpretation is a challenge. RESULTS: Here we present a new algorithm, termed GO Explorer (GOEx), that leverages the gene ontology (GO) to aid in the interpretation of proteomic data. GOEx stands out because it combines data from protein fold changes with GO over-representation statistics to help draw conclusions. Moreover, it is tightly integrated within the PatternLab for Proteomics project and, thus, lies within a complete computational environment that provides parsers and pattern recognition tools designed for spectral counting. GOEx offers three independent methods to query data: an interactive directed acyclic graph, a specialist mode where key words can be searched, and an automatic search. Its usefulness is demonstrated by applying it to help interpret the effects of perillyl alcohol, a natural chemotherapeutic agent, on glioblastoma multiforme cell lines (A172). We used a new multi-surfactant shotgun proteomic strategy and identified more than 2600 proteins; GOEx pinpointed key sets of differentially expressed proteins related to cell cycle, alcohol catabolism, the Ras pathway, apoptosis, and stress response, to name a few. CONCLUSION: GOEx facilitates organism-specific studies by leveraging GO and providing a rich graphical user interface. It is a simple-to-use tool, specialized for biologists who wish to analyze spectral counting data from shotgun proteomics. GOEx is available at http://pcarvalho.com/patternlab.

    Integration of the Gene Ontology into an object-oriented architecture

    BACKGROUND: To standardize gene product descriptions, a formal vocabulary defined as the Gene Ontology (GO) has been developed. GO terms have been categorized into biological processes, molecular functions, and cellular components. However, there is no single representation that integrates all the terms into one cohesive model. Furthermore, GO definitions have little information explaining the underlying architecture that forms these terms, such as the dynamic and static events occurring in a process. In contrast, object-oriented models have been developed to show dynamic and static events. A portion of the TGF-beta signaling pathway, which is involved in numerous cellular events including cancer, differentiation and development, was used to demonstrate the feasibility of integrating the Gene Ontology into an object-oriented model. RESULTS: Using object-oriented models we have captured the static and dynamic events that occur during a representative GO process, "transforming growth factor-beta (TGF-beta) receptor complex assembly" (GO:0007181). CONCLUSION: We demonstrate that the utility of GO terms can be enhanced by object-oriented technology, and that the GO terms can be integrated into an object-oriented model by serving as a basis for the generation of object functions and attributes.

    NGF-mediated transcriptional targets of p53 in PC12 neuronal differentiation

    BACKGROUND: p53 is recognized as a critical regulator of the cell cycle and apoptosis. Mounting evidence also suggests a role for p53 in differentiation of cells including neuronal precursors. We studied the transcriptional role of p53 during nerve growth factor-induced differentiation of the PC12 line into neuron-like cells. We hypothesized that p53 contributed to PC12 differentiation through the regulation of gene targets distinct from its known transcriptional targets for apoptosis or DNA repair. RESULTS: Using a genome-wide chromatin immunoprecipitation cloning technique, we identified and validated 14 novel p53-regulated genes following NGF treatment. The data show p53 protein was transcriptionally activated and contributed to NGF-mediated neurite outgrowth during differentiation of PC12 cells. Furthermore, we describe stimulus-specific regulation of a subset of these target genes by p53. The most salient differentiation-relevant target genes included wnt7b involved in dendritic extension and the tfcp2l4/grhl3 grainyhead homolog implicated in ectodermal development. Additional targets included brk, sdk2, sesn3, txnl2, dusp5, pon3, lect1, pkcbpb15 and other genes. CONCLUSION: Within the PC12 neuronal context, putative p53-occupied genomic loci spanned the entire Rattus norvegicus genome upon NGF treatment. We conclude that receptor-mediated p53 transcriptional activity is involved in PC12 differentiation and may suggest a contributory role for p53 in neuronal development.

    Automated Natural-Language Processing for Integration and Functional Annotation of Complex Biological Systems.

    This dissertation discusses the use of automated natural language processing (NLP) for characterization of biomolecular events in signal transduction pathway databases. I also discuss the use of a dynamic map engine for efficiently navigating large biomedical document collections and functionally annotating high-throughput genomic data. An application is presented where NLP software, beginning with genomic expression data, automatically identifies and joins disparate experimental observations supporting biochemical interaction relationships between candidate genes in the Wnt signaling pathway. I discuss the need for accurate named entity resolution to the biological sequence databases and how sequence-based approaches can unambiguously link automatically-extracted assertions to their respective biomolecules in a high-speed manner. I then demonstrate a search engine, BioSearch-2D, which renders the contents of large biomedical document collections into a single, dynamic map. With this engine, the prostate cancer epigenetics literature is analyzed and I demonstrate that the summarization map closely matches that provided by expert human review articles. Examples include displays which prominently feature genes such as the androgen receptor and glutathione S-transferase P1 together with the National Library of Medicine’s Medical Subject Heading (MeSH) descriptions which match the roles described for those genes in the human review articles. In a second application of BioSearch-2D, I demonstrate the engine’s application as a context-specific functional annotation system for cancer-related gene signatures. Our engine matches the annotation produced by a Gene Ontology-based annotation engine for 6 cancer-related gene signatures. Additionally, it assigns highly-significant MeSH terms as annotation for the gene list which are not produced by the GO-based engine. 
I find that the BioSearch-2D display both facilitates the exploration of large document collections in the biomedical literature and provides users with an accurate annotation engine for ad hoc gene sets. In the future, the use of both large-scale biomedical literature summarization engines and automated protein-protein interaction discovery software could greatly assist expensive manual data curation efforts that describe complex biological processes or disease states. Ph.D., Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/58394/1/csantos_1.pd

    Ontology-Driven and Network–Enabled Systems Biology Case Studies

    With the progress in high-throughput technologies and bioinformatics in recent years, it is possible to determine to what extent genetic or environmental manipulation of a biological system affects the expression of thousands of genes and proteins. This study requires a shift from the conventional, purely hypothesis-driven approach to an integrated approach: the systems biology method. Systems biology studies the relationships and interactions between various parts of a biological system. It allows individual genes or proteins to be placed in a global context of cellular functions. This analysis can answer the question of how networks of genes/proteins that are differentially regulated in response to genetic or environmental modification are placed in the global context of the protein interaction map. In this project, we establish a protein interaction network-based systems biology approach, and use the method for two case studies. In particular, our systems biology studies consist of the following parts: (1) Analysis of mass-spectrometry derived proteomics experimental data to identify differentially expressed proteins in different genetic or environmental conditions; (2) Integration of genomics and proteomics data with experimental results, the molecular context of protein-protein interaction networks and gene functional categories; (3) Visual interpretation of molecular networks. Our approach has been validated in two case studies by comparing our discoveries with existing findings. We also obtained new insights. In the first case study, the proteomes of cisplatin-sensitive and cisplatin-resistant ovarian cancer cells were compared and we observed that cellular physiological process is significantly activated in cisplatin-resistant cell lines, and this response arises from endogenous, abiotic, and stress-related signals. 
We found that cisplatin-resistant cell lines demonstrated unusually high levels of protein-binding activities, and a broad spectrum of across-the-board drug-binding and nucleotide-binding mechanisms are all activated. In the second case study, we found that genes in the significantly enriched GO categories related to Grr1 perturbation-induced morphological phenotype change are highly connected in the GO sub-network, which implies that Grr1 could be affecting this process by affecting a small core group of proteins. These biological discoveries support the significance of developing a common framework for evaluating functional genomics and proteomics data, using networks and systems approaches.