588 research outputs found

    Spectral Measures of Bipartivity in Complex Networks

    We introduce a quantitative measure of network bipartivity, defined as the proportion of even closed walks to the total number of closed walks in the network. Spectral graph theory is used to quantify how close to bipartite a network is and the extent to which individual nodes and edges contribute to the global network bipartivity. It is shown that the bipartivity characterizes the network structure and can be related to the efficiency of semantic or communication networks, trophic interactions in food webs, construction principles in metabolic networks, or communities in social networks. Comment: 16 pages, 1 figure, 1 table
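
The ratio of even to total closed walks can be sketched from the adjacency spectrum. This is a minimal illustration, assuming an undirected, unweighted network and assuming that walks of length k are weighted by 1/k! (so that even closed walks sum to cosh of the eigenvalues and all closed walks sum to exp of the eigenvalues). The function name and the two toy graphs are hypothetical and are not taken from the abstract.

```python
import numpy as np

def spectral_bipartivity(adjacency):
    """Fraction of (1/k!-weighted) closed walks that have even length."""
    a = np.asarray(adjacency, dtype=float)
    # A symmetric adjacency matrix has real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(a)
    even_walks = np.sum(np.cosh(eigenvalues))  # weighted count of even closed walks
    all_walks = np.sum(np.exp(eigenvalues))    # weighted count of all closed walks
    return float(even_walks / all_walks)

# Toy inputs (assumptions for illustration): a 4-cycle is bipartite, a triangle is not.
square = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

print(spectral_bipartivity(square))
print(spectral_bipartivity(triangle))
```

Under these assumptions the value equals 1 for a bipartite graph, which has no odd closed walks, and falls below 1 once odd cycles are present.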

    Sampling properties of random graphs: the degree distribution

    We discuss two sampling schemes for selecting random subnets from a network, random sampling and connectivity-dependent sampling, and investigate how the degree distribution of a node in the network is affected by the two types of sampling. We derive a necessary and sufficient condition that guarantees that the degree distributions of the subnet and the true network belong to the same family of probability distributions. For completely random sampling of nodes we find that this condition is fulfilled by classical random graphs; for the vast majority of networks, however, this condition will not be met. We furthermore discuss the case where the probability of sampling a node depends on the degree of the node, and we find that even classical random graphs are no longer closed under this sampling regime. We conclude by relating the results to real E. coli protein interaction network data. Comment: accepted for publication in Phys.Rev.
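
The closure property for completely random node sampling can be checked numerically. The sketch below assumes that each node is kept independently with probability p, so a node of degree k keeps a Binomial(k, p) number of neighbours, and it approximates the classical random graph degree distribution by a Poisson law. The function names, the mean degree of 4 and the sampling fraction of 0.5 are illustrative assumptions.

```python
from math import comb, exp, factorial

def poisson_pmf(mean, k_max):
    """Poisson probabilities for k = 0..k_max (truncated tail is negligible here)."""
    return [exp(-mean) * mean**k / factorial(k) for k in range(k_max + 1)]

def sampled_degree_pmf(pmf, p):
    """Degree pmf in the subnet when every node is kept with probability p."""
    out = [0.0] * len(pmf)
    for k, prob_k in enumerate(pmf):
        # A node of degree k keeps j of its neighbours with Binomial(k, p) probability.
        for j in range(k + 1):
            out[j] += prob_k * comb(k, j) * p**j * (1 - p) ** (k - j)
    return out

full = poisson_pmf(4.0, 60)          # assumed degree distribution of the full network
sub = sampled_degree_pmf(full, 0.5)  # degree distribution after sampling half the nodes
target = poisson_pmf(2.0, 60)        # Poisson with the rescaled mean p * 4.0
max_gap = max(abs(a - b) for a, b in zip(sub, target))
print(max_gap)
```

The sampled distribution stays in the Poisson family with mean scaled by p. Feeding a different family through `sampled_degree_pmf` is one way to explore when the closure fails.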

    STITCH 4: integration of protein-chemical interactions with user data

    STITCH is a database of protein-chemical interactions that integrates many sources of experimental and manually curated evidence with text-mining information and interaction predictions. Available at http://stitch.embl.de, the resulting interaction network includes 390 000 chemicals and 3.6 million proteins from 1133 organisms. Compared with the previous version, the number of high-confidence protein-chemical interactions in human has increased by 45%, to 367 000. In this version, we added features for users to upload their own data to STITCH in the form of internal identifiers, chemical structures or quantitative data. For example, a user can now upload a spreadsheet with screening hits to easily check which interactions are already known. To increase the coverage of STITCH, we expanded the text mining to include full-text articles and added a prediction method based on chemical structures. We further changed our scheme for transferring interactions between species to rely on orthology rather than protein similarity. This improves the performance within protein families, where scores are now transferred only to orthologous proteins, but not to paralogous proteins. STITCH can be accessed through a web interface, an API and downloadable files.

    Toward automatic reconstruction of a highly resolved tree of life

    We have developed an automatable procedure for reconstructing the tree of life with branch lengths comparable across all three domains. The tree has its basis in a concatenation of 31 orthologs occurring in 191 species with sequenced genomes. It revealed interdomain discrepancies in taxonomic classification. Systematic detection and subsequent exclusion of products of horizontal gene transfer increased phylogenetic resolution, allowing us to confirm accepted relationships and resolve disputed and preliminary classifications. For example, we place the phylum Acidobacteria as a sister group of delta-Proteobacteria, support a Gram-positive origin of Bacteria, and suggest a thermophilic last universal common ancestor.

    Duplication-divergence model of protein interaction network

    We show that protein-protein interaction networks can be surprisingly well described by a very simple evolution model of duplication and divergence. The model exhibits remarkably rich behavior depending on a single parameter, the probability of retaining a duplicated link during divergence. When this parameter is large, the network growth is not self-averaging and the average vertex degree increases algebraically. The lack of self-averaging results in a great diversity of networks grown out of the same initial condition. For small values of the link retention probability, the growth is self-averaging, the average degree increases very slowly or tends to a constant, and the degree distribution has a power-law tail. Comment: 8 pages, 13 figures
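
A duplication-divergence growth step can be sketched as follows. This is an illustrative simulation, not the authors' code. It assumes that a uniformly chosen node is duplicated, that each of its links is retained independently with probability `retain_prob`, and that a duplicate that retains no link is discarded. The seed graph (a single edge) and all names are hypothetical choices.

```python
import random

def grow_duplication_divergence(n_nodes, retain_prob, seed=0):
    """Grow a network by repeated duplication and divergence (requires retain_prob > 0)."""
    rng = random.Random(seed)
    neighbors = {0: {1}, 1: {0}}  # assumed initial condition: one edge
    while len(neighbors) < n_nodes:
        target = rng.choice(list(neighbors))
        # Divergence: each duplicated link survives with probability retain_prob.
        kept = [v for v in sorted(neighbors[target]) if rng.random() < retain_prob]
        if not kept:
            continue  # assumption: a duplicate with no links is discarded
        new = len(neighbors)
        neighbors[new] = set(kept)
        for v in kept:
            neighbors[v].add(new)
    return neighbors

network = grow_duplication_divergence(200, retain_prob=0.4)
mean_degree = sum(len(nbrs) for nbrs in network.values()) / len(network)
print(len(network), mean_degree)
```

Rerunning with different seeds and retention probabilities gives a rough view of how much the grown networks vary for the same initial condition.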

    Subgraph Centrality in Complex Networks

    We introduce a new centrality measure that characterizes the participation of each node in all subgraphs in a network. Smaller subgraphs are given more weight than larger ones, which makes this measure appropriate for characterizing network motifs. We show that the subgraph centrality (SC) can be obtained mathematically from the spectra of the adjacency matrix of the network. This measure is better able to discriminate the nodes of a network than alternative measures such as degree, closeness, betweenness and eigenvector centralities. We study eight real-world networks for which SC displays useful and desirable properties, such as clear ranking of nodes and scale-free characteristics. Compared with the number of links per node, the ranking introduced by SC (for the nodes in the protein interaction network of S. cerevisiae) is more highly correlated with the lethality of individual proteins removed from the proteome. Comment: 29 pages, 4 figures, 2 tables
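
The spectral form of subgraph centrality can be sketched as below, assuming an undirected, unweighted network and the usual 1/k! weighting of closed walks of length k, which gives SC(i) as the sum over eigenpairs of v_j(i)^2 exp(lambda_j). The function name and the star graph are illustrative assumptions.

```python
import numpy as np

def subgraph_centrality(adjacency):
    """SC(i) = sum_j v_j(i)^2 * exp(lambda_j), i.e. the diagonal of exp(A)."""
    a = np.asarray(adjacency, dtype=float)
    # eigh returns eigenvalues and orthonormal eigenvectors (as columns).
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    return (eigenvectors**2) @ np.exp(eigenvalues)

# Toy input (assumption for illustration): a star with node 0 as the hub.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
sc = subgraph_centrality(star)
print(sc)
```

In the star, the hub takes part in more short closed walks than any leaf, so it receives the largest value, while the three equivalent leaves share the same value.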

    STRING: known and predicted protein–protein associations, integrated and transferred across organisms

    A full description of a protein's function requires knowledge of all partner proteins with which it specifically associates. From a functional perspective, ‘association’ can mean direct physical binding, but can also mean indirect interaction such as participation in the same metabolic pathway or cellular process. Currently, information about protein association is scattered over a wide variety of resources and model organisms. STRING aims to simplify access to this information by providing a comprehensive, yet quality-controlled collection of protein–protein associations for a large number of organisms. The associations are derived from high-throughput experimental data, from the mining of databases and literature, and from predictions based on genomic context analysis. STRING integrates and ranks these associations by benchmarking them against a common reference set, and presents evidence in a consistent and intuitive web interface. Importantly, the associations are extended beyond the organism in which they were originally described, by automatic transfer to orthologous protein pairs in other organisms, where applicable. STRING currently holds 730 000 proteins in 180 fully sequenced organisms, and is available at http://string.embl.de/

    Millimeter-scale genetic gradients and community-level molecular convergence in a hypersaline microbial mat

    To investigate the extent of genetic stratification in structured microbial communities, we compared the metagenomes of 10 successive layers of a phylogenetically complex hypersaline mat from Guerrero Negro, Mexico. We found pronounced millimeter-scale genetic gradients that were consistent with the physicochemical profile of the mat. Despite these gradients, all layers displayed near-identical and acid-shifted isoelectric point profiles due to a molecular convergence of amino-acid usage, indicating that hypersalinity enforces an overriding selective pressure on the mat community.

    STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets

    Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/