A new census of protein tandem repeats and their relationship with intrinsic disorder

Author: Anisimova Maria
Delucchi Matteo
Elofsson Arne
Sachenkova Oxana
Schaper Elke
Publication venue: MDPI AG
Publication date: 09/04/2020

Protein tandem repeats (TRs) are often associated with immunity-related functions and diseases. Since the last census of protein TRs in 1999, the number of curated proteins has increased more than seven-fold and new TR prediction methods have been published. TRs appear to be enriched with intrinsic disorder and vice versa. The significance of this association and the biological reasons for it are unknown. Here, we characterize protein TRs across all kingdoms of life and their overlap with intrinsic disorder in unprecedented detail. Using state-of-the-art prediction methods, we estimate that 50.9% of proteins contain at least one TR, often located at the sequence flanks. A positive linear correlation between the proportion of TRs and the protein length was observed universally. Eukaryotes in general have more TRs, but the difference is quite small once the difference in protein length is taken into account. TRs were enriched with disorder-promoting amino acids and were located inside intrinsically disordered regions. Many such TRs were homorepeats. Our results support the view that TRs mostly originate by duplication and are involved in essential functions such as transcription processes, structural organization, electron transport and iron-binding. In viruses, TRs are found in proteins essential for virulence.

Multidisciplinary Digital Publishing Institute

ZHAW digitalcollection

PathFinder: mining signal transduction pathway segments from protein-protein interaction networks

Author: A Bateman
A Grigoriev
A Ruepp
A Zien
AC Gavin
AJ Walhout
AM Edwards
B Alberts
B Nelson
C Choi
C Stark
C von Mering
CA Ball
CL Tucker
CM Deane
F Campagne
G Bebek
Gurkan Bebek
HD Madhani
HG Dohlman
HW Mewes
J Reboul
J Scott
JB Konopka
Jiong Yang
JS Bader
K Sachs
L Giot
M Ashburner
M Deng
M Kanehisa
M Stefen
MJ Winters
NR Gough
P Uetz
R Agrawal
R Sharan
S Brin
S Fields
S Li
S Suthram
SR Neves
T Ideker
T Ito
T Shlomi
TG Graeber
TS Kim
WK Huh
Y Ho
Y Liu
Y Qi
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: A signal transduction pathway is the chain of processes by which a cell converts an extracellular signal into a response. In most unicellular organisms, the number of signal transduction pathways influences the number of ways the cell can react and respond to the environment. Discovering signal transduction pathways is an arduous problem, even with the use of systematic genomic, proteomic and metabolomic technologies. These techniques produce an enormous amount of data, and interpreting and processing this data becomes a challenging computational problem. Results: In this study we present a new framework for identifying signaling pathways in protein-protein interaction networks. Our goal is to find biologically significant pathway segments in a given interaction network. Currently, protein-protein interaction data contains an excessive amount of noise, e.g., false positive and false negative interactions. First, we eliminate false positives in the protein-protein interaction network by integrating the network with microarray expression profiles, protein subcellular localization and sequence information. In addition, protein families are used to repair false negative interactions. Then the characteristics of known signal transduction pathways and their functional annotations are extracted in the form of association rules. Conclusion: Given a pair of starting and ending proteins, our methodology returns candidate pathway segments between these two proteins with possible missing links (recovered false negatives). In our study, S. cerevisiae (yeast) data is used to demonstrate the effectiveness of our method.
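As a rough illustration of the final step only (returning candidate segments between a starting and an ending protein), the sketch below enumerates simple paths in a small undirected interaction graph. It is not the PathFinder method: the noise filtering and association-rule scoring described in the abstract are omitted, and the function names, the `max_len` cutoff and the toy protein labels A to E are assumptions made for this example.

```python
from collections import defaultdict


def build_graph(interactions):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def candidate_segments(graph, start, end, max_len):
    """Return all simple paths from start to end with at most max_len proteins."""
    results = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            results.append(path)
            continue
        if len(path) == max_len:
            continue  # path is already at the length limit, do not extend it
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in path:  # keep paths simple (no repeated protein)
                stack.append((neighbor, path + [neighbor]))
    # Shortest segments first, then alphabetical, for a stable ordering.
    return sorted(results, key=lambda p: (len(p), p))


# Toy interaction list with hypothetical protein labels.
toy_interactions = [("A", "B"), ("B", "C"), ("C", "E"),
                    ("A", "D"), ("D", "E"), ("B", "D")]
toy_graph = build_graph(toy_interactions)
segments = candidate_segments(toy_graph, "A", "E", max_len=4)
for segment in segments:
    print(" - ".join(segment))
```

In a fuller pipeline, each enumerated segment would presumably be ranked or filtered using the expression, localization and annotation evidence the abstract mentions; here every path within the length limit is returned unranked.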

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Inferring mouse gene functions from genomic-scale data using a combined functional network/classification strategy

Author: Kim Wan Kyu
Krumpelman Chase
Marcotte Edward M
Publication venue: BioMed Central
Publication date: 01/01/2008

The complete set of mouse genes, like the set of human genes, is still largely uncharacterized: experimental evidence about the activities and expression of the genes is accumulating, but the majority of genes are still of unknown function. Within the context of the MouseFunc competition, we developed and applied two distinct large-scale data mining approaches to infer the functions (Gene Ontology annotations) of mouse genes from experimental observations from available functional genomics, proteomics, comparative genomics, and phenotypic data. The two strategies (the first using classifiers to map features to annotations, the second propagating annotations from characterized genes to uncharacterized genes along edges in a network constructed from the features) offer alternative and possibly complementary approaches to providing functional annotations. Here, we re-implement and evaluate these approaches and their combination for their ability to predict the proper functional annotations of genes in the MouseFunc data set. We show that, when controlling for the same set of input features, the network approach generally outperformed a naïve Bayesian classifier approach, while their combination offers some improvement over either independently. We make our observations of predictive performance on the MouseFunc competition hold-out set, as well as on a ten-fold cross-validation of the MouseFunc data. Across all 1,339 annotated genes in the MouseFunc test set, the median predictive power was quite strong (median area under a receiver operating characteristic plot of 0.865 and average precision of 0.195), indicating that a mining-based strategy with existing data is a promising path towards discovering mammalian gene functions.
As one product of this work, a high-confidence subset of the functional mouse gene network was produced, spanning >70% of mouse genes with >1.6 million associations, that is predictive of mouse (and therefore often human) gene function and functional associations. The network should be generally useful for mammalian gene functional analyses, such as for predicting interactions, inferring functional connections between genes and pathways, and prioritizing candidate genes. The network and all predictions are available on the World Wide Web.
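To make the network strategy concrete, here is a minimal sketch of one simple form of annotation propagation, together with the area under the ROC curve used to evaluate it. It assumes a scoring rule in which each uncharacterized gene receives the summed weight of its edges to genes already carrying the annotation. That rule, the function names, and the toy genes and weights are illustrative assumptions, not the authors' implementation or data.

```python
def neighbor_scores(edges, annotated):
    """Score each unannotated gene by the summed weight of its edges to annotated genes."""
    scores = {}
    for a, b, weight in edges:
        for gene, other in ((a, b), (b, a)):
            if gene in annotated:
                continue  # only genes without the annotation are scored
            scores.setdefault(gene, 0.0)
            if other in annotated:
                scores[gene] += weight
    return scores


def roc_auc(scores, positives):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for gene, s in scores.items() if gene in positives]
    neg = [s for gene, s in scores.items() if gene not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy weighted network with hypothetical gene labels and edge weights.
toy_edges = [("g1", "g2", 0.9), ("g1", "g3", 0.2), ("g4", "g2", 0.1),
             ("g5", "g3", 0.4), ("g6", "g7", 0.8), ("g4", "g6", 0.5)]
known_annotated = {"g1", "g6"}   # genes already carrying the annotation
held_out_positives = {"g2", "g3"}  # genes that truly have it, hidden from scoring

toy_scores = neighbor_scores(toy_edges, known_annotated)
toy_auc = roc_auc(toy_scores, held_out_positives)
print(toy_scores)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of held-out positives above negatives, which is the scale on which the reported median of 0.865 should be read.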

CiteSeerX

Crossref

Springer - Publisher Connector

PubMed Central

Texas ScholarWorks

An Improved, Bias-Reduced Probabilistic Functional Gene Network of Baker's Yeast, Saccharomyces cerevisiae

Background: Probabilistic functional gene networks are powerful theoretical frameworks for integrating heterogeneous functional genomics and proteomics data into objective models of cellular systems. Such networks provide syntheses of millions of discrete experimental observations, spanning DNA microarray experiments, physical protein interactions, genetic interactions, and comparative genomics; the resulting networks can then be easily applied to generate testable hypotheses regarding specific gene functions and associations. Methodology/Principal Findings: We report a significantly improved version (v. 2) of a probabilistic functional gene network [1] of the baker's yeast, Saccharomyces cerevisiae. We describe our optimization methods and illustrate their effects in three major areas: the reduction of functional bias in network training reference sets, the application of a probabilistic model for calculating confidences in pair-wise protein physical or genetic interactions, and the introduction of simple thresholds that eliminate many false positive mRNA co-expression relationships. Using the network, we predict and experimentally verify the function of the yeast RNA binding protein Puf6 in 60S ribosomal subunit biogenesis. Conclusions/Significance: YeastNet v. 2, constructed using these optimizations together with additional data, shows significant reduction in bias and improvements in precision and recall, in total covering 102,803 linkages among 5,483 yeast proteins (95% of the validated proteome). YeastNet is available from http://www.yeastnet.org.

This work was supported by grants from the N.S.F. (IIS-0325116, EIA-0219061), N.I.H. (GM06779-01, GM076536-01), Welch (F-1515), and a Packard Fellowship (EMM). These agencies were not involved in the design and conduct of the study, in the collection, analysis, and interpretation of the data, or in the preparation, review, or approval of the manuscript.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Texas ScholarWorks

Complex biomarker discovery in neuroimaging data: Finding a needle in a haystack

Author: Atluri Gowtham
Doraiswamy P. Murali
Fang Gang
Kumar Vipin
Lim Kelvin
MacDonald Angus
Padmanabhan Kanchana
Petrella Jeffrey R.
Samatova Nagiza F.
Steinbach Michael
Publication venue: The Authors. Published by Elsevier Inc.
Publication date: 07/08/2013

Neuropsychiatric disorders such as schizophrenia, bipolar disorder and Alzheimer's disease are major public health problems. However, despite decades of research, we currently have no validated prognostic or diagnostic tests that can be applied at an individual patient level. Many neuropsychiatric diseases are due to a combination of alterations that occur in a human brain rather than the result of localized lesions. While there is hope that newer imaging technologies such as functional and anatomic connectivity MRI or molecular imaging may offer breakthroughs, the single biomarkers that are discovered using these datasets are limited by their inability to capture the heterogeneity and complexity of most multifactorial brain disorders. Recently, complex biomarkers have been explored to address this limitation using neuroimaging data. In this manuscript we consider the nature of complex biomarkers being investigated in the recent literature and present techniques to find such biomarkers that have been developed in related areas of data mining, statistics, machine learning and bioinformatics.

Elsevier - Publisher Connector

PubMed Central

Recent advances in clustering methods for protein interaction networks

Author: Deng Youping
Li Min
Pan Yi
Wang Jianxin
Publication venue: BioMed Central
Publication date: 01/01/2010

The increasing availability of large-scale protein-protein interaction data has made it possible to understand the basic components and organization of cell machinery at the network level. The resulting challenge is how to analyze such complex interaction data to reveal the principles of cellular organization, processes and functions. Many studies have shown that clustering protein interaction networks is an effective approach for identifying protein complexes or functional modules, and this has become a major research topic in systems biology. In this review, recent advances in clustering methods for protein interaction networks will be presented in detail. The predictions of protein functions and interactions based on modules will be covered. Finally, the performance of different clustering methods will be compared and the directions for future research will be discussed.

Crossref

ScholarWorks @ Georgia State University

Springer - Publisher Connector

PubMed Central