675 research outputs found

    Classes empiétantes dans un graphe et application aux interactions entre protéines

    Cahiers de la MSE 2005.32 - Série Bleue - ISSN: 1624-0340
    Cahiers URL: https://halshs.archives-ouvertes.fr/CAHIERS-MSE
    This paper studies a density-based classification method for unweighted graphs. It searches for regions with a high density of edges, which may overlap: the goal is not a partition but a set of classes intrinsic to the data. The method has two steps. First, the cores of the classes are determined by means of a local density function; then these cores are extended into their neighbourhoods according to a criterion on the density of the classes. Finally, the method is applied to a protein-protein interaction network, where the resulting classes may help biologists predict the unknown cellular functions of some proteins.
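    The two steps described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not specify the local density function or the extension criterion, so the sketch assumes that a core is the closed neighbourhood of a vertex whose internal edge density reaches a hypothetical threshold `core_min`, and that a neighbouring vertex is absorbed while the class density stays at or above a hypothetical threshold `class_min`. The toy graph is made up.

```python
from itertools import combinations

def density(adj, nodes):
    """Fraction of possible edges present among `nodes` (1.0 for a clique)."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return edges / (n * (n - 1) / 2)

def overlapping_classes(adj, core_min=1.0, class_min=0.7):
    """Assumed two-step scheme: find dense cores, then grow them greedily."""
    # Step 1 (assumption): a core is the closed neighbourhood of a vertex
    # whose internal edge density is at least `core_min`.
    cores = set()
    for v in adj:
        closed = frozenset(adj[v] | {v})
        if len(closed) >= 3 and density(adj, closed) >= core_min:
            cores.add(closed)
    # Step 2 (assumption): absorb adjacent vertices while the density of the
    # class stays at or above `class_min`. Classes are grown independently,
    # so they are free to share vertices.
    classes = set()
    for core in cores:
        cls = set(core)
        grown = True
        while grown:
            grown = False
            frontier = set().union(*(adj[u] for u in cls)) - cls
            for w in sorted(frontier):
                if density(adj, cls | {w}) >= class_min:
                    cls.add(w)
                    grown = True
        classes.add(frozenset(cls))
    return classes

# Made-up toy graph: two triangles sharing the vertex "c".
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
classes = overlapping_classes(adj)
for cls in sorted(sorted(c) for c in classes):
    print(cls)
```

    On this toy graph the two triangles come out as separate classes that both contain "c", which is the overlapping behaviour the abstract describes; a partitioning method would have to assign "c" to only one of them.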

    Recherche de classes empiétantes dans un graphe : application aux réseaux d’interactions entre protéines

    This article presents an overlapping classification method for identifying regions of a graph that are dense in edges. More precisely, the aim is to extract subgraphs whose edge density is high compared with that of the whole graph; these subgraphs may share vertices. The method is applied to a problem from biology: the annotation of proteins. The graphs considered represent observed interactions between proteins. Based on the biological principle that proteins involved in the same cellular function interact, the subgraphs obtained by applying the method to protein-protein interaction networks give indications about the functions of the proteins they contain, which provides computer-aided support for predicting the unknown functions of some proteins. The overlap allowed by the method makes it possible to account for the fact that each protein may be involved in several cellular functions.

    High resolution, on-line identification of strains from the Mycobacterium tuberculosis complex based on tandem repeat typing

    BACKGROUND: Currently available reference methods for the molecular epidemiology of the Mycobacterium tuberculosis complex either lack sensitivity or are still too tedious and slow for routine application. Recently, tandem repeat typing has emerged as a potential alternative. This report contributes to the development of tandem repeat typing for M. tuberculosis by summarising the existing data, developing additional markers, and setting up a freely accessible, fast, and easy-to-use internet-based service for strain identification. RESULTS: A collection of 21 VNTRs, incorporating 13 previously described loci and 8 newly evaluated markers, was used to genotype 90 strains from the M. tuberculosis complex: M. tuberculosis (64 strains), M. bovis (9 strains, including 4 BCG representatives) and M. africanum (17 strains). Eighty-four different genotypes are defined. Clustering analysis shows that the M. africanum strains fall into three main groups, one of which is closer to the M. tuberculosis strains and another of which is closer to the M. bovis strains. The resulting data have been made freely accessible over the internet to allow direct strain identification queries. CONCLUSIONS: Tandem repeat typing is a PCR-based assay which may prove to be a powerful complement to the existing epidemiological tools for the M. tuberculosis complex. The number of markers to type depends on the identification precision required, so identification can be achieved quickly and at low cost in terms of consumables, technical expertise and equipment.
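    A strain identification query of the kind this abstract mentions can be pictured as matching a profile of repeat counts against a reference table. The sketch below is a hypothetical illustration, not the service's actual matching procedure: it assumes a profile is a mapping from locus name to repeat count and that candidates are ranked by the number of differing loci. The locus names, strain names and counts are all made up.

```python
def mismatches(profile_a, profile_b, loci=None):
    """Number of loci at which two repeat-count profiles differ."""
    loci = profile_a.keys() if loci is None else loci
    return sum(profile_a[locus] != profile_b[locus] for locus in loci)

def closest_strains(query, database, loci=None):
    """Return (distance, names) for the database entries nearest to `query`."""
    scored = {name: mismatches(query, profile, loci)
              for name, profile in database.items()}
    best = min(scored.values())
    return best, sorted(name for name, d in scored.items() if d == best)

# Made-up reference profiles over four hypothetical loci.
database = {
    "strain_A": {"locus_1": 2, "locus_2": 4, "locus_3": 3, "locus_4": 5},
    "strain_B": {"locus_1": 2, "locus_2": 4, "locus_3": 1, "locus_4": 5},
    "strain_C": {"locus_1": 3, "locus_2": 2, "locus_3": 3, "locus_4": 1},
}
query = {"locus_1": 2, "locus_2": 4, "locus_3": 1, "locus_4": 5}

full = closest_strains(query, database)
partial = closest_strains(query, database, loci=["locus_1", "locus_2"])
print(full)     # all four loci typed
print(partial)  # only two loci typed
```

    With all four loci the query matches a single entry, while with only two loci it can no longer separate strain_A from strain_B. This mirrors the abstract's point that the number of markers to type depends on the identification precision required.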

    Spatial correlations in attribute communities

    Community detection is an important tool for exploring and classifying the properties of large complex networks, and should be of great help for spatial networks. Indeed, in addition to their location, nodes in spatial networks can have attributes, such as language for individuals or any other socio-economic feature, that we would like to identify in communities. This paper discusses a crucial aspect not considered in previous studies: the possible existence of correlations between space and attributes. Introducing a simple toy model in which both space and node attributes are considered, we discuss the effect of space-attribute correlations on the results of various community detection methods for spatial networks, proposed both in this paper and in previous studies. When space is irrelevant, our model is equivalent to the stochastic block model, which has been shown to display a transition between detectability and non-detectability. In the regime where space dominates the link formation process, most methods can fail to recover the communities, an effect that is particularly marked when space-attribute correlations are strong. In this latter case, community detection methods that remove the spatial component of the network can miss a large part of the community structure and lead to incorrect results.
    Comment: 10 pages and 7 figures
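    To make the ingredients of such a toy model concrete, here is a minimal generator. It is a sketch under assumptions, not the model defined in the paper: it assumes two attribute values, a link probability equal to a block probability (`p_in` or `p_out`) multiplied by an exponential decay `exp(-d / ell)` in distance, and a hypothetical `correlation` parameter giving the probability that a node is confined to the half of the unit square matching its attribute. With `ell` infinite, distance plays no role and the generator reduces to a two-block stochastic block model.

```python
import math
import random

def toy_spatial_network(n=200, p_in=0.2, p_out=0.02, ell=math.inf,
                        correlation=0.0, seed=0):
    """Generate (attrs, pos, edges) for an assumed space-plus-attribute model."""
    rng = random.Random(seed)
    attrs, pos = [], []
    for _ in range(n):
        a = rng.randrange(2)              # node attribute: 0 or 1
        x, y = rng.random(), rng.random()
        # With probability `correlation`, confine attribute 0 to the left
        # half of the unit square and attribute 1 to the right half.
        if rng.random() < correlation:
            x = 0.5 * x + 0.5 * a
        attrs.append(a)
        pos.append((x, y))
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            base = p_in if attrs[i] == attrs[j] else p_out
            d = math.dist(pos[i], pos[j])
            # ell = inf gives exp(0) = 1, i.e. a plain stochastic block model.
            if rng.random() < base * math.exp(-d / ell):
                edges.append((i, j))
    return attrs, pos, edges

attrs, pos, edges = toy_spatial_network(n=100, ell=0.2, correlation=0.8)
same = sum(attrs[i] == attrs[j] for i, j in edges)
print(len(edges), "edges,", same, "within the same attribute group")
```

    Varying `ell` and `correlation` moves between the regimes the abstract discusses: a large `ell` makes space irrelevant, while a small `ell` combined with a high `correlation` ties the spatial and attribute structure together.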