6 research outputs found

    Noise reduction in protein-protein interaction graphs by the implementation of a novel weighting scheme

    Background: Recent technological advances applied to biology, such as yeast two-hybrid, phage display and mass spectrometry, have enabled us to create a detailed map of protein interaction networks. These interaction networks represent a rich, yet noisy, source of data that could be used to extract meaningful information, such as protein complexes. Several interaction network weighting schemes have been proposed in the literature to eliminate the noise inherent in interactome data. In this paper, we propose a novel weighting scheme and apply it to the S. cerevisiae interactome. Complex prediction rates are improved by up to 39%, depending on the clustering algorithm applied.
    Results: We adopt a two-step procedure. In the first step, both novel and well-established protein-protein interaction (PPI) weighting methods are applied to the original interactome graph, assigning each edge a weight based on the confidence that the interaction is a true positive. The second step clusters the weighted graph using established graph-theoretic algorithms, as well as two variations of spectral clustering. The clustered interactome networks are also cross-validated against the confirmed protein complexes present in the MIPS database.
    Conclusions: The results of our experimental work demonstrate that interactome graph weighting methods clearly improve the results of several clustering algorithms. Moreover, our proposed weighting scheme outperforms other approaches to PPI graph weighting.
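The two-step procedure described in this abstract (weight the edges, then cluster the weighted graph) can be illustrated with a minimal sketch. The abstract does not define the proposed weighting scheme, so the code below swaps in a simple neighbourhood-overlap (Jaccard) weight, and plain connected components stand in for the clustering algorithms. The function names, the toy graph and the threshold are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each protein to the set of its interaction partners."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def weight_edges(edges):
    """Step 1 (assumed weighting): Jaccard overlap of the closed neighbourhoods."""
    adj = build_adjacency(edges)
    weights = {}
    for a, b in edges:
        na = adj[a] | {a}
        nb = adj[b] | {b}
        weights[(a, b)] = len(na & nb) / len(na | nb)
    return weights


def cluster(weights, threshold):
    """Step 2 (stand-in clustering): drop weak edges, return connected components."""
    adj = defaultdict(set)
    for (a, b), w in weights.items():
        if w >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    seen, clusters = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical toy interactome: two triangles joined by one bridging edge (C-D).
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
             ("D", "E"), ("D", "F"), ("E", "F")]
toy_weights = weight_edges(toy_edges)
toy_clusters = cluster(toy_weights, threshold=0.5)
```

In this toy graph the bridging edge receives the lowest weight because its endpoints share few neighbours, so removing low-confidence edges separates the two triangles. A real pipeline would replace both steps with the methods evaluated in the paper.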

    Predictive integration of gene functional similarity and co-expression defines treatment response of endothelial progenitor cells

    Background: Endothelial progenitor cells (EPCs) have been implicated in different processes crucial to vasculature repair, which may offer the basis for new therapeutic strategies in cardiovascular disease. Despite advances facilitated by functional genomics, there is a lack of systems-level understanding of the treatment response mechanisms of EPCs. In this research we aimed to characterize the EPC response to adenosine (Ado), a cardioprotective factor, based on the systems-level integration of gene expression data and prior functional knowledge. Specifically, we set out to identify novel biosignatures of Ado-treatment response in EPCs.
    Results: The predictive integration of gene expression data and standardized functional similarity information enabled us to identify new treatment response biosignatures. Gene expression data originated from Ado-treated and untreated EPC samples, and functional similarity was estimated with Gene Ontology (GO)-based similarity information. These information sources enabled us to implement and evaluate an integrated prediction approach based on the concept of k-nearest neighbours learning (kNN). The method can be driven by expert- and data-driven input queries that guide the search for biologically meaningful biosignatures. The resulting integrated kNN system identified new candidate EPC biosignatures that can offer high classification performance (areas under the receiver operating characteristic curve > 0.8). We also showed that the proposed models can outperform those discovered by standard gene expression analysis. Furthermore, we report an initial independent in vitro experimental follow-up, which provides additional evidence of the potential validity of the top biosignature.
    Conclusion: Response to Ado treatment in EPCs can be accurately characterized with a new method based on the combination of gene co-expression data and GO-based similarity information. It also exploits the incorporation of human expert-driven queries as a strategy to guide the automated search for candidate biosignatures. The proposed biosignature improves the systems-level characterization of EPCs. The new integrative predictive modeling approach can also be applied to other phenotype characterization or biomarker discovery problems.

    Selecting Negative Samples for PPI Prediction Using Hierarchical Clustering Methodology

    Protein-protein interactions (PPIs) play a crucial role in cellular processes. In the present work, a new approach is proposed for constructing a PPI predictor: a support vector machine model is trained using a parallel filter-wrapper feature selection algorithm based on mutual information, together with an iterative hierarchical clustering procedure that selects a relevant negative training set. Using the selected suboptimal set of features, the resulting support vector machine model is able to classify PPIs with high accuracy on any positive and negative datasets.

    Coordinated modular functionality and prognostic potential of a heart failure biomarker-driven interaction network

    Background: The identification of potentially relevant biomarkers and a deeper understanding of molecular mechanisms related to heart failure (HF) development can be enhanced by the implementation of biological network-based analyses. To support these efforts, here we report a global network of protein-protein interactions (PPIs) relevant to HF, which was characterized through integrative bioinformatic analyses of multiple sources of "omic" information.
    Results: We found that the structural and functional architecture of this PPI network is highly modular. These network modules can be assigned to specialized processes and specific cellular regions, and their functional roles tend to partially overlap. Our results suggest that HF biomarkers may be defined as key coordinators of intra- and inter-module communication. Putative biomarkers can, in general, be distinguished as "information traffic" mediators within this network. The top high-traffic proteins are encoded by genes that are not highly differentially expressed across HF and non-HF patients. Nevertheless, we present evidence that the integration of expression patterns from high-traffic genes may support accurate prediction of HF. We quantitatively demonstrate that intra- and inter-module functional activity may be controlled by a family of transcription factors known to be associated with the prevention of hypertrophy.
    Conclusion: The systems-driven analysis reported here provides the basis for the identification of potentially novel biomarkers and for understanding HF-related mechanisms in a more comprehensive and integrated way.

    Méthodes sémantiques pour la comparaison inter-espèces de voies métaboliques (application au métabolisme des lipides chez l'humain, la souris et la poule)

    Cross-species comparison of metabolic pathways is an important task in biology. It is a major issue for both human health and agronomy. Currently, knowledge is acquired from experiments on a relatively small number of species referred to as "model" species. A better understanding of a species makes it possible to validate or reject an inference drawn from these experimental data, and to determine whether, or to what extent, results obtained on a model species can be transposed to another species. This thesis proposes a method for the cross-species comparison of metabolic pathways. The method compares each step of a metabolic pathway using the associated Gene Ontology annotations. This work validates the value of semantic similarity measures for interpreting these annotations, proposes the joint use of a semantic particularity measure, and proposes a method based on similarity and particularity patterns to interpret each metabolic pathway step.
    Several gene products are involved throughout a metabolic pathway. They are associated with annotations that describe their biological roles. Because they are based on a shared ontology, these annotations make it possible to compare data from different species and to take several levels of abstraction into account. Several semantic measures that quantify the similarity between gene products from their annotations have been developed previously. We identified and used a semantic similarity measure appropriate for cross-species comparisons. Because they focus on what the compared gene products have in common, semantic similarity measures ignore the characteristics specific to a single gene product. Cross-species comparison of metabolic pathways, however, has to quantify not only the similarity of the gene products involved but also their particularities. We developed a semantic particularity measure that addresses this issue. For each pathway step, we compute a profile combining its semantic similarity value and its two semantic particularity values.
    Regarding the interpretation of results, it is not possible to establish formally that two gene products are similar, or that one of them has significant particularities, without a similarity threshold and a particularity threshold. So far, these interpretations have been based on an implicit or arbitrary threshold. To address this gap, we developed a threshold definition method for the semantic similarity and particularity measures. Finally, we applied a cross-species similarity measure and our particularity measure to compare lipid metabolism between human, mouse and chicken. We then interpreted the results using the previously defined thresholds. In all three species, we observed particularities, including among similar gene products. They notably concerned biological processes and cellular components. Molecular functions showed strong similarity and few particularities. These results are biologically relevant.
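The profile described in this abstract (one similarity value and two particularity values per pathway step, interpreted against thresholds) can be sketched as follows. The thesis measures exploit the Gene Ontology structure, which the abstract does not detail, so this sketch substitutes simple set-based proxies. The function names, the term labels and the threshold values are hypothetical and serve only to show the shape of the computation.

```python
def profile(terms_a, terms_b):
    """Return (similarity, particularity_a, particularity_b) for two annotation sets.

    Assumed proxies: similarity is the Jaccard index of the two sets, and each
    particularity is the fraction of a product's terms absent from the other.
    """
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        return 0.0, 0.0, 0.0
    similarity = len(a & b) / len(union)
    particularity_a = len(a - b) / len(a) if a else 0.0
    particularity_b = len(b - a) / len(b) if b else 0.0
    return similarity, particularity_a, particularity_b


def interpret(prof, sim_threshold, par_threshold):
    """Turn a numeric profile into explicit similar / particular flags."""
    sim, par_a, par_b = prof
    return {
        "similar": sim >= sim_threshold,
        "a_particular": par_a >= par_threshold,
        "b_particular": par_b >= par_threshold,
    }


# Hypothetical annotation sets for the same pathway step in two species.
species_a_terms = {"t1", "t2", "t3"}
species_b_terms = {"t1", "t2", "t4"}

step_profile = profile(species_a_terms, species_b_terms)
step_flags = interpret(step_profile, sim_threshold=0.4, par_threshold=0.3)
```

With these toy values the step is flagged as similar while both gene products still carry particularities, which mirrors the kind of pattern the abstract reports for similar gene products.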

    Predictive integration of Gene Ontology-driven similarity and functional interactions

    There is a need to develop methods that automatically incorporate prior knowledge to support the prediction and validation of novel functional associations. One important source of such knowledge is the Gene Ontology (GO) and the many model organism databases of gene products annotated to the GO. We investigated quantitative relationships between the GO-driven similarity of genes and their functional interactions by analyzing different types of associations in Saccharomyces cerevisiae and Caenorhabditis elegans. Interacting genes exhibited significantly higher levels of GO-driven similarity (GOS) than random pairs of genes used as a surrogate for negative interactions. The Biological Process hierarchy provides more reliable results for co-regulatory and protein-protein interactions. GOS represents a relevant resource to support the prediction of functional networks in combination with other resources.
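The comparison described in this abstract, GO-driven similarity of interacting gene pairs against random pairs used as surrogate negatives, can be sketched in a few lines. The abstract does not specify the similarity measure, so a Jaccard index over annotation sets stands in for it here. The gene names, term labels and helper functions are hypothetical, and no significance test is included.

```python
import random
from itertools import combinations
from statistics import mean


def go_similarity(annotations, gene_a, gene_b):
    """Assumed proxy for GOS: Jaccard index of the two genes' annotation sets."""
    a, b = annotations[gene_a], annotations[gene_b]
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sample_random_pairs(genes, interacting, n, seed=0):
    """Draw n gene pairs that are not known interactions (surrogate negatives)."""
    known = {tuple(sorted(pair)) for pair in interacting}
    candidates = [p for p in combinations(sorted(genes), 2) if p not in known]
    return random.Random(seed).sample(candidates, n)


def mean_similarity(annotations, pairs):
    """Average similarity over a collection of gene pairs."""
    return mean(go_similarity(annotations, a, b) for a, b in pairs)


# Hypothetical toy data: two functional groups with within-group interactions.
annotations = {
    "g1": {"x1", "x2"}, "g2": {"x1", "x2"}, "g3": {"x1", "x2", "x3"},
    "g4": {"y1", "y2"}, "g5": {"y1", "y2"}, "g6": {"y1", "y2", "y3"},
}
interacting = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"),
               ("g4", "g5"), ("g4", "g6"), ("g5", "g6")]

background = sample_random_pairs(annotations, interacting, n=6)
interacting_mean = mean_similarity(annotations, interacting)
background_mean = mean_similarity(annotations, background)
```

In this toy data every non-interacting pair spans the two groups, so the background similarity is zero by construction. Real data would call for a statistical test of the difference between the two distributions.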