    Quantifying randomness in protein-protein interaction networks of different species: A random matrix approach

    We analyze protein-protein interaction networks for six different species under the framework of random matrix theory. The nearest-neighbor spacing distribution of the eigenvalues of the adjacency matrices of the largest connected part of these networks emulates the universal Gaussian orthogonal statistics of random matrix theory. We demonstrate that spectral rigidity, which quantifies long-range correlations in eigenvalues, follows the random matrix prediction up to a certain range for all protein-protein interaction networks, indicating randomness in interactions. Beyond this range, deviation from universality evinces underlying structural features in the network. Comment: 20 pages, 5 figures
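
The comparison described in this abstract can be sketched as follows. This is only an illustrative outline, not the authors' procedure: the eigenvalues are hypothetical toy values, the rescaling by the global mean spacing is a crude substitute for proper unfolding, and the function names are assumptions made for this example. The reference curve is the standard Wigner surmise for the Gaussian orthogonal ensemble.

```python
import math

def normalized_spacings(eigenvalues):
    """Nearest-neighbor spacings of a spectrum, rescaled to unit mean.

    Dividing by the global mean spacing is a crude stand-in for proper
    unfolding, which would use the local eigenvalue density instead.
    """
    ordered = sorted(eigenvalues)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return [g / mean_gap for g in gaps]

def wigner_surmise_goe(s):
    """Wigner surmise for GOE spacings: P(s) = (pi/2) s exp(-pi s^2 / 4)."""
    return (math.pi / 2.0) * s * math.exp(-math.pi * s * s / 4.0)

def poisson_spacing(s):
    """Spacing density for uncorrelated levels: P(s) = exp(-s)."""
    return math.exp(-s)

# Hypothetical eigenvalues, for illustration only (not data from the paper).
toy_spectrum = [-2.1, -1.3, -0.6, 0.0, 0.7, 1.2, 2.0]
spacings = normalized_spacings(toy_spectrum)
```

In practice one would histogram the spacings of a real adjacency spectrum and check whether the histogram lies closer to `wigner_surmise_goe` (level repulsion, small spacings suppressed) or to `poisson_spacing` (no correlations).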

    Classification in biological networks with hypergraphlet kernels

    Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (genes, proteins, drugs) and edges represent relational ties among these objects (binds-to, interacts-with, regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, often suffer from information loss when modeling physical systems due to their inability to accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies. In this paper, we present a hypergraph-based approach for modeling physical systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs in a semi-supervised setting. We introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of small simple hypergraphs, referred to as hypergraphlets, rooted at a vertex of interest. We extensively evaluate this method and show its potential use in a positive-unlabeled setting to estimate the number of missing and false positive links in protein-protein interaction networks.
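
The idea of recasting edge classification as vertex classification on a dual hypergraph can be illustrated with a minimal sketch. The dictionary representation, the function name `dual_hypergraph`, and the toy complexes below are assumptions made for this example; the paper's own construction (including its "extended" hypergraphs) may differ.

```python
def dual_hypergraph(hyperedges):
    """Swap the roles of vertices and hyperedges.

    `hyperedges` maps each hyperedge id to the set of vertices it contains.
    In the dual, each original vertex becomes a hyperedge containing the ids
    of all original hyperedges it belonged to.
    """
    dual = {}
    for edge_id, vertices in hyperedges.items():
        for vertex in vertices:
            dual.setdefault(vertex, set()).add(edge_id)
    return dual

# Hypothetical multi-protein relationships, for illustration only.
complexes = {
    "complex1": {"P1", "P2", "P3"},
    "complex2": {"P2", "P4"},
    "complex3": {"P3", "P4", "P5"},
}
dual = dual_hypergraph(complexes)
```

Each original hyperedge (`complex1`, `complex2`, ...) is a vertex of `dual`, so a label predicted for that vertex is a label for the original hyperedge. When every vertex belongs to at least one hyperedge and no hyperedge is empty, applying the function twice recovers the input.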

    An application of topological graph clustering to protein function prediction

    We use a semisupervised learning algorithm based on a topological data analysis approach to assign functional categories to yeast proteins using similarity graphs. This new approach to analyzing biological networks yields results that are as good as or better than existing state-of-the-art approaches. Comment: 10 pages

    Towards Gene Expression Convolutions using Gene Interaction Graphs

    We study the challenges of applying deep learning to gene expression data. We find experimentally that there exists non-linear signal in the data; however, it is not discovered automatically given the noise and low numbers of samples used in most research. We discuss how gene interaction graphs (same pathway, protein-protein, co-expression, or research paper text association) can be used to impose a bias on a deep model similar to the spatial bias imposed by convolutions on an image. We explore the usage of Graph Convolutional Neural Networks coupled with dropout and gene embeddings to utilize the graph information. We find this approach provides an advantage for particular tasks in a low data regime but is very dependent on the quality of the graph used. We conclude that more work should be done in this direction. We design experiments that show why existing methods fail to capture signal that is present in the data when features are added, which clearly isolates the problem that needs to be addressed. Comment: 4 pages + 1 page references. To appear in the International Conference on Machine Learning Workshop on Computational Biology, 201

    Randomness and preserved patterns in cancer network

    Breast cancer has been reported to account for the most cases among all female cancers to date. In order to gain a deeper insight into the complexities of the disease, we analyze the breast cancer network and its normal counterpart at the proteomic level. While the short-range correlations in the eigenvalues, which exhibit universality, provide evidence for the importance of random connections in the underlying networks, the long-range correlations along with the localization properties reveal insightful structural patterns involving functionally important proteins. The analysis provides a benchmark for designing drugs which can target a subgraph instead of individual proteins. Comment: 21 pages, 9 figures

    node2vec: Scalable Feature Learning for Networks

    Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks. Comment: In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 201
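
The biased random walk mentioned in this abstract can be sketched roughly as below. The abstract itself does not spell out the bias; the sketch assumes the second-order scheme commonly associated with node2vec, with a return parameter `p` and an in-out parameter `q`, on an unweighted graph. The function name, the adjacency-dictionary representation, and the toy graph are assumptions made for this example, and the reference implementation uses precomputed alias tables rather than per-step sampling.

```python
import random

def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Second-order random walk on an unweighted graph.

    Unnormalized weight of stepping from `cur` to `nxt`, having arrived
    from `prev`:
      1/p  if nxt == prev               (return to the previous node)
      1    if nxt is a neighbor of prev (stay close to prev)
      1/q  otherwise                    (move further away)
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])  # sorted for reproducible sampling
        if not neighbors:
            break  # dead end: stop early
        if len(walk) == 1:
            # The first step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)
            elif nxt in adj[prev]:
                weights.append(1.0)
            else:
                weights.append(1.0 / q)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

# Hypothetical toy graph, for illustration only.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
walk = biased_walk(toy_graph, "a", 8, p=0.5, q=2.0, rng=random.Random(0))
```

A small `p` makes the walk backtrack often and a large `q` keeps it near the previous node, so this setting samples local neighborhoods; reversing the two pushes the walk outward. Walks generated this way would then be fed to a skip-gram style model to learn the embeddings.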

    Network Enhancement: a general method to denoise weighted biological networks

    Networks are ubiquitous in biology, where they encode connectivity patterns at all scales of organization, from the molecular to the biome. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper discovery of network patterns and dynamics. We propose Network Enhancement (NE), a method for improving the signal-to-noise ratio of undirected, weighted networks. NE uses a doubly stochastic matrix operator that induces sparsity and provides a closed-form solution that increases the spectral eigengap of the input network. As a result, NE removes weak edges, enhances real connections, and leads to better downstream performance. Experiments show that NE improves gene function prediction by denoising tissue-specific interaction networks, eases interpretation of noisy Hi-C contact maps from the human genome, and boosts fine-grained identification accuracy of species. Our results indicate that NE is widely applicable for denoising biological networks.

    Thresholding of Semantic Similarity Networks using a Spectral Graph Based Technique

    Semantic similarity measures (SSMs) refer to a set of algorithms used to quantify the similarity of two or more terms belonging to the same ontology. Ontology terms may be associated with concepts; for instance, in computational biology, genes and proteins are associated with terms of biological ontologies. Thus, SSMs may be used to quantify the similarity of genes and proteins starting from the comparison of the associated annotations. SSMs have recently been used to compare genes and proteins even on a system-level scale. More recently, some works have focused on the building and analysis of Semantic Similarity Networks (SSNs), i.e. weighted networks in which nodes represent genes or proteins while weighted edges represent the semantic similarity score among them. SSNs are quasi-complete networks, thus their analysis presents different challenges that should be addressed. For instance, the need arises for reliable thresholds for the elimination of meaningless edges. Nevertheless, the use of global thresholding methods may eliminate meaningful nodes, while the use of local thresholds may introduce biases. To address these problems, we introduce a novel technique, based on spectral graph considerations and on a mixed global-local focus. The effectiveness of our technique is demonstrated by using Markov clustering for the extraction of biological modules. We applied clustering to the simplified networks, demonstrating a considerable improvement with respect to the original ones.

    Representation Learning on Graphs: Methods and Applications

    Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. Traditionally, machine learning approaches relied on user-defined heuristics to extract features encoding structural information about a graph (e.g., degree statistics or kernel functions). However, recent years have seen a surge in approaches that automatically learn to encode graph structure into low-dimensional embeddings, using techniques based on deep learning and nonlinear dimensionality reduction. Here we provide a conceptual review of key advancements in this area of representation learning on graphs, including matrix factorization-based methods, random-walk based algorithms, and graph neural networks. We review methods to embed individual nodes as well as approaches to embed entire (sub)graphs. In doing so, we develop a unified framework to describe these recent approaches, and we highlight a number of important applications and directions for future work. Comment: Published in the IEEE Data Engineering Bulletin, September 2017; version with minor corrections

    Spectral properties of complex networks

    This review presents an account of the major works done on spectra of adjacency matrices drawn on networks and the basic understanding attained so far. We have divided the review under three sections: (a) extremal eigenvalues, (b) bulk part of the spectrum and (c) degenerate eigenvalues, based on the intrinsic properties of eigenvalues and the phenomena they capture. We have reviewed the works done for spectra of various popular model networks, such as the Erdős-Rényi random networks, scale-free networks, 1-d lattice, small-world networks, and various different real-world networks. Additionally, potential applications of spectral properties for natural processes have been reviewed. Comment: 29 pages, 18 figures