Mining Phenotypes for Protein Function Prediction

Author: Groth Philip
Leser Ulf
Pohlenz Hans-Dieter
Weiss Bertram
Publication venue: Dagstuhl Seminar Proceedings. 08131 - Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives
Publication date: 01/01/2008

Until very recently, phenotypes were only very rarely studied in a systematic manner. While ontologies for describing gene functions now have a 10-year-long tradition, similar vocabularies for describing the phenotype of genes are only emerging now; similarly, the techniques for determining phenotypes on a large scale (especially RNAi) have been available for only a few years, while genomic sequencing and gene expression studies have been established for much longer. In this talk, we describe results from a study exploiting phenotype descriptions for protein function prediction. We used the data from PhenomicsDB, a phenotype database integrated from several publicly available data sources. Due to the lack of standardization, phenotypes in PhenomicsDB can only be viewed as text (short statements, abstracts, singular terms, ...). We clustered these texts and analyzed the corresponding gene clusters in terms of their coherence in functional annotation and their interconnectedness by protein-protein interactions. We also devised a method for using the close similarity in their phenotype descriptions to predict the function of proteins. We show that this method yields a very good precision at acceptable coverage.

Dagstuhl Research Online Publication Server

Global landscape of mouse and human cytokine transcriptional regulation

Author: Fuxman Bass Juan Ignacio
Gan Kok Ann
Imedio Alvaro Dafonte
Martinez Melissa
Mehta Shivani
Pro Sebastian Carrasco
Santoso Clarissa Stephanie
Sereda Rebecca
Sewell Jared Allan
Publication venue: 'Oxford University Press (OUP)'
Publication date: 12/10/2018

Cytokines are cell-to-cell signaling proteins that play a central role in immune development, pathogen responses, and diseases. Cytokines are highly regulated at the transcriptional level by combinations of transcription factors (TFs) that recruit cofactors and the transcriptional machinery. Here, we mined through three decades of studies to generate a comprehensive database, CytReg, reporting 843 and 647 interactions between TFs and cytokine genes, in human and mouse respectively. By integrating CytReg with other functional datasets, we determined general principles governing the transcriptional regulation of cytokine genes. In particular, we show a correlation between TF connectivity and immune phenotype and disease, we discuss the balance between tissue-specific and pathogen-activated TFs regulating each cytokine gene, and cooperativity and plasticity in cytokine regulation. We also illustrate the use of our database as a blueprint to predict TF–disease associations and identify potential TF–cytokine regulatory axes in autoimmune diseases. Finally, we discuss research biases in cytokine regulation studies, and use CytReg to predict novel interactions based on co-expression and motif analyses which we further validated experimentally. Overall, this resource provides a framework for the rational design of future cytokine gene regulation studies.

Funding: National Institutes of Health (NIH) [R00 GM114296 and R35 GM128625 to J.I.F.B., 5T32HL007501-34 to J.A.S.]; National Science Foundation [NSF-REU BIO-1659605 to M.M.]. Funding for open access charge: NIH [R35 GM128625].

Boston University Institutional Repository (OpenBU)

Random walks on mutual microRNA-target gene interaction network improve the prediction of disease-associated microRNAs

Author: Dinh-Toi Chu
Duc-Hau Le
Le Hoang Son
Van-Huy Pham
Verbeke Lieven
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Background: MicroRNAs (miRNAs) have been shown to play an important role in pathological initiation, progression and maintenance. Because identification in the laboratory of disease-related miRNAs is not straightforward, numerous network-based methods have been developed to predict novel miRNAs in silico. Homogeneous networks (in which every node is a miRNA) based on the targets shared between miRNAs have been widely used to predict their role in disease phenotypes. Although such homogeneous networks can predict potential disease-associated miRNAs, they do not consider the roles of the target genes of the miRNAs. Here, we introduce a novel method based on a heterogeneous network that considers not only miRNAs but also the corresponding target genes in the network model. Results: Instead of constructing homogeneous miRNA networks, we built heterogeneous miRNA networks consisting of both miRNAs and their target genes, using databases of known miRNA-target gene interactions. In addition, as recent studies demonstrated reciprocal regulatory relations between miRNAs and their target genes, we considered these heterogeneous miRNA networks to be undirected, assuming mutual miRNA-target interactions. Next, we introduced a novel method (RWRMTN) operating on these mutual heterogeneous miRNA networks to rank candidate disease-related miRNAs using a random walk with restart (RWR) based algorithm. Using both known disease-associated miRNAs and their target genes as seed nodes, the method can identify additional miRNAs involved in the disease phenotype. Experiments indicated that RWRMTN outperformed two existing state-of-the-art methods: RWRMDA, a network-based method that also uses an RWR on homogeneous (rather than heterogeneous) miRNA networks, and RLSMDA, a machine learning-based method. Interestingly, we could relate this performance gain to the emergence of "disease modules" in the heterogeneous miRNA networks used as input for the algorithm. Moreover, we could demonstrate that RWRMTN is stable, performing well when using both experimentally validated and predicted miRNA-target gene interaction data for network construction. Finally, using RWRMTN, we identified 76 novel miRNAs associated with 23 disease phenotypes which were present in a recent database of known disease-miRNA associations. Conclusions: In summary, using random walks on mutual miRNA-target networks improves the prediction of novel disease-associated miRNAs because of the existence of "disease modules" in these networks.
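
The random walk with restart described in this abstract can be illustrated with a minimal sketch. This is not the authors' RWRMTN implementation: the toy network, the node names (`miR-a`, `GENE1`, ...), the restart probability of 0.5, and the function names are all assumptions chosen for illustration. It shows only the general idea of ranking candidate miRNAs by their steady-state visiting probability when the walk restarts at seed nodes on an undirected miRNA-target network.

```python
def build_undirected(edges):
    """Build an adjacency map for an undirected network from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    W moves the walker from a node to each neighbour with equal probability;
    p0 is uniform over the seed nodes. Assumes every node has a neighbour.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: edges link a miRNA to one of its target genes.
edges = [
    ("miR-a", "GENE1"),
    ("miR-a", "GENE2"),
    ("miR-b", "GENE2"),
    ("miR-c", "GENE3"),
]
adj = build_undirected(edges)

# Seeds: a miRNA assumed to be disease-associated, plus one of its targets.
seeds = {"miR-a", "GENE1"}
scores = random_walk_with_restart(adj, seeds)

# Rank the non-seed miRNAs by steady-state probability, highest first.
candidates = [n for n in adj if n.startswith("miR-") and n not in seeds]
ranked = sorted(candidates, key=lambda n: scores[n], reverse=True)
print(ranked)
```

In this toy case `miR-b` shares a target gene with the seed miRNA and therefore receives probability mass, while `miR-c` sits in a disconnected component and scores zero. A real network would also need an explicit policy for isolated nodes, which this sketch does not handle.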

Ghent University Academic Bibliography

The BioGRID Interaction Database: 2011 update

Author: A. Chatr-aryamontri
A. Winter
B.-J. Breitkreutz
Behrends
Bork
Breitkreutz
C. Stark
Cline
Costanzo
Drabkin
Hertz-Fowler
Howe
J. M. Rust
J. Nixon
K. Dolinski
K. Van Auken
Kerrien
L. Boucher
Leitner
M. S. Livstone
M. Tyers
Mering
Müller
R. Oughtred
Razick
T. Reguly
Wiederkehr
X. Shi
X. Wang
Yu
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2011

The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions.

CiteSeerX

Crossref

PubMed Central

Edinburgh Research Explorer

Caltech Authors

Recommended from our members

Heterogeneous network embedding enabling accurate disease association predictions.

Author: Guo Mengjie
Kong Xiangnan
Ruan Lu
Tang Chunlei
Wang Wei
Xiong Yun
Zhu Yangyong
Publication venue: eScholarship, University of California
Publication date: 01/12/2019

Background: Identifying the complex biological mechanisms of various diseases is an important goal in biomedical research. Recently, the generation of tremendous amounts of data in genomics, epigenomics, metagenomics, proteomics, metabolomics, nutriomics, etc., has led to the rise of systems-biology approaches to exploring complex diseases. However, the gap between the production of these data and our capability to analyze them has gradually widened. Furthermore, we observe that networks can represent many of the above-mentioned data, and that, based on the vector representations learned by network embedding methods, entities that are in close proximity but do not currently possess direct links are very likely to be related; they are therefore promising candidates for biological investigation. Results: We incorporate six public biological databases to construct a heterogeneous biological network containing three categories of entities (i.e., genes, diseases, miRNAs) and multiple types of edges (i.e., the known relationships). To tackle the inherent heterogeneity, we develop a heterogeneous network embedding model that maps the network into a low-dimensional vector space in which the relationships between entities are well preserved. To assess the effectiveness of our method, we conduct gene-disease and miRNA-disease association predictions, the results of which show the superiority of our method over several state-of-the-art methods. Furthermore, many associations predicted by our method are verified in the latest real-world dataset. Conclusions: We propose a novel heterogeneous network embedding method that can take advantage of the abundant contextual information and structure of a heterogeneous network. Moreover, we illustrate the performance of the proposed method in directing studies in biology, where it can assist in identifying new hypotheses for biological investigation.

eScholarship - University of California

Global Functional Atlas of Escherichia coli Encompassing Previously Uncharacterized Proteins

Author: Ali Mehrab
Babu Mohan
Butland Gareth
Chandran Shamanta
Christopolous Constantine
Emili Andrew
Eroukova Veronika
Golshani Ashkan
Greenblatt Jack F.
Guao Xinghua
Hu Pingzhao
Janga Sarah Chandra
Moreno-Hagelsieb Gabriel
Musso Gabriela
Nazarians-Armavil Anaies
Nazemof Nazila
Paccanaro Alberto
Phanse Sadhna
Pogoutse Oxana
Wong Peter
Yang Wenhong
Publication venue: Scholars Commons @ Laurier
Publication date: 01/04/2009

One-third of the 4,225 protein-coding genes of Escherichia coli K-12 remain functionally unannotated (orphans). Many map to distant clades such as Archaea, suggesting involvement in basic prokaryotic traits, whereas others appear restricted to E. coli, including pathogenic strains. To elucidate the orphans’ biological roles, we performed an extensive proteomic survey using affinity-tagged E. coli strains and generated comprehensive genomic context inferences to derive a high-confidence compendium for virtually the entire proteome consisting of 5,993 putative physical interactions and 74,776 putative functional associations, most of which are novel. Clustering of the respective probabilistic networks revealed putative orphan membership in discrete multiprotein complexes and functional modules together with annotated gene products, whereas a machine-learning strategy based on network integration implicated the orphans in specific biological processes. We provide additional experimental evidence supporting orphan participation in protein synthesis, amino acid metabolism, biofilm formation, motility, and assembly of the bacterial cell envelope. This resource provides a “systems-wide” functional blueprint of a model microbe, with insights into the biological and evolutionary significance of previously uncharacterized proteins.

Wilfrid Laurier University