
    Scale-Free Networks in Molecular Biology: Algorithms and Random Walks Analyses

    In this research, I focus on I) the mean field analysis of algorithms for scale-free networks in molecular biology and II) the analysis of biological networks using random walks and related algorithms. I: Many systems in nature and society are described by means of complex networks. Research indicates that these complex networks exhibit scale-free properties. Studying the organizing principles of scale-free networks has significant implications in different fields, including developing better drugs, defending the internet from hackers, halting the spread of deadly epidemics, and developing marketing strategies. The sampling of scale-free networks in molecular biology is usually achieved by growing networks from a seed using recursive algorithms with elementary moves which include the addition and deletion of nodes and bonds. These algorithms include the Barabasi-Albert algorithm. Later algorithms, such as the Duplication-Divergence algorithm, the Sole algorithm and the iSite algorithm, were inspired by biological processes underlying the evolution of protein networks, and the networks they produce differ essentially from networks grown by the Barabasi-Albert algorithm. The mean field analysis of these algorithms is reconsidered, and extended to variant and modified implementations of the algorithms. II: The second part of this research focuses on improving biological networks using random walks and related algorithms. I use different algorithms with the goal of finding highly connected hubs and clusters of proteins which are closely related to one another. This is done by building up protein-protein interaction networks and miRNA-gene interaction networks which are then subjected to the action of two algorithms. The first algorithm used is the random walk with resistance algorithm. As an alternative, I propose solving the lattice Laplacian on a network as a method to discover clusters of biologically related genes. These approaches seek to find ways of solving complex pathway membership problems in protein interaction databases. The clusters obtained provide more biological insight than a process of local pairwise comparison between interacting proteins. They may also predict new members in functional pathways or clusters. Underlying these algorithms are simulated biased random walks on the network for determining membership of proteins in given clusters.
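
The growth-from-a-seed procedure mentioned in the abstract above can be illustrated with preferential attachment, the rule behind the Barabasi-Albert algorithm. What follows is a minimal sketch and not the thesis's implementation: the function name `barabasi_albert`, the complete-graph seed on `m + 1` nodes, and the rejection sampling used to pick distinct targets are all assumptions made for illustration.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow an undirected graph to n nodes by preferential attachment.

    Each new node links to m distinct existing nodes, chosen with a
    probability that increases with their current degree.
    Returns a list of (new_node, target) edges.
    """
    if m < 1 or n < m + 1:
        raise ValueError("need m >= 1 and n >= m + 1")
    rng = random.Random(seed)

    # Assumed seed graph: a complete graph on the first m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]

    # Every node appears in this list once per incident edge, so a uniform
    # draw from it is a degree-proportional draw over nodes.
    endpoints = [v for edge in edges for v in edge]

    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


if __name__ == "__main__":
    graph = barabasi_albert(50, 2, seed=1)
    degree = {}
    for u, v in graph:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    print("edges:", len(graph), "max degree:", max(degree.values()))
```

The duplication-divergence style algorithms named in the abstract use different elementary moves (copying a node and pruning its links), so this sketch covers only the Barabasi-Albert case.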

    Information Flow in Interaction Networks

    Interaction networks, consisting of agents linked by their interactions, are ubiquitous across many disciplines of modern science. Many methods of analysis of interaction networks have been proposed, mainly concentrating on node degree distribution or aiming to discover clusters of agents that are very strongly connected between themselves. These methods are principally based on graph theory or machine learning. We present a mathematically simple formalism for modelling context-specific information propagation in interaction networks based on random walks. The context is provided by selection of sources and destinations of information and by use of potential functions that direct the flow towards the destinations. We also use the concept of dissipation to model the aging of information as it diffuses from its source. Using examples from yeast protein-protein interaction networks and some of the histone acetyltransferases involved in control of transcription, we demonstrate the utility of the concepts and the mathematical constructs introduced in this paper. Comment: 30 pages, 5 figures. This paper was published in 2007 in Journal of Computational Biology. The version posted here does not include post peer-review changes.

    Communicability betweenness in complex networks

    Betweenness measures provide quantitative tools to pick out fine details from the massive amount of interaction data that is available from large complex networks. They allow us to study the extent to which a node takes part when information is passed around the network. Nodes with high betweenness may be regarded as key players that have a highly active role. At one extreme, betweenness has been defined by considering information passing only through the shortest paths between pairs of nodes. At the other extreme, an alternative type of betweenness has been defined by considering all possible walks of any length. In this work, we propose a betweenness measure that lies between these two opposing viewpoints. We allow information to pass through all possible routes, but introduce a scaling so that longer walks carry less importance. This new definition shares a similar philosophy to that of communicability for pairs of nodes in a network, which was introduced by Estrada and Hatano [E. Estrada, N. Hatano, Phys. Rev. E 77 (2008) 036111]. Having defined this new communicability betweenness measure, we show that it can be characterized neatly in terms of the exponential of the adjacency matrix. We also show that this measure is closely related to a Fréchet derivative of the matrix exponential. This allows us to conclude that it also describes network sensitivity when the edges of a given node are subject to infinitesimally small perturbations. Using illustrative synthetic and real life networks, we show that the new betweenness measure behaves differently to existing versions, and in particular we show that it recovers meaningful biological information from a protein-protein interaction network.
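
The characterization "in terms of the exponential of the adjacency matrix" can be made concrete with a small sketch. It assumes the commonly stated form of the measure: for node r, average over ordered pairs p, q (both different from r and from each other) the fraction of the communicability (e^A)_pq that is lost when all edges of r are removed, normalized by (n-1)^2 - (n-1). The helper names, the truncated Taylor series for the matrix exponential (adequate only for small, low-degree examples), and the requirement of a connected graph with at least three nodes are assumptions of this sketch, not details taken from the abstract.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=60):
    """Truncated Taylor series for exp(a); fine for small toy matrices only."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]  # a^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def communicability_betweenness(adj):
    """Score each node of a connected, undirected graph (adjacency matrix)."""
    n = len(adj)
    if n < 3:
        raise ValueError("need at least three nodes")
    full = mat_exp(adj)
    norm = (n - 1) ** 2 - (n - 1)
    scores = []
    for r in range(n):
        # Zero out row and column r, i.e. remove every edge touching r.
        pruned = [[0 if r in (i, j) else adj[i][j] for j in range(n)]
                  for i in range(n)]
        without_r = mat_exp(pruned)
        total = 0.0
        for p in range(n):
            for q in range(n):
                if p != q and p != r and q != r:
                    # Share of weighted walks from p to q that pass through r.
                    total += (full[p][q] - without_r[p][q]) / full[p][q]
        scores.append(total / norm)
    return scores


if __name__ == "__main__":
    path = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
    # The middle node carries every walk between the two ends, so it scores 1.0.
    print(communicability_betweenness(path))
```

For real networks a library matrix exponential would replace the Taylor loop; the pure-Python version is used here only to keep the example self-contained.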

    Structural patterns in complex networks through spectral analysis

    The study of some structural properties of networks is introduced from a graph spectral perspective. First, subgraph centrality of nodes is defined and used to classify essential proteins in a proteomic map. This index is then used to produce a method that allows the identification of superhomogeneous networks. At the same time, this method classifies non-homogeneous networks into three universal classes of structure. We give examples of these classes from networks in different real-world scenarios. Finally, a communicability function is studied and shown to be an alternative for defining communities in complex networks. Using this approach, a community is unambiguously defined and an algorithm for its identification is proposed and exemplified in a real-world network.

    Diffusion Component Analysis: Unraveling Functional Topology in Biological Networks

    Complex biological systems have been successfully modeled by biochemical and genetic interaction networks, typically gathered from high-throughput (HTP) data. These networks can be used to infer functional relationships between genes or proteins. Using the intuition that the topological role of a gene in a network relates to its biological function, local or diffusion based "guilt-by-association" and graph-theoretic methods have had success in inferring gene functions. Here we seek to improve function prediction by integrating diffusion-based methods with a novel dimensionality reduction technique to overcome the incomplete and noisy nature of network data. In this paper, we introduce diffusion component analysis (DCA), a framework that plugs in a diffusion model and learns a low-dimensional vector representation of each node to encode the topological properties of a network. As a proof of concept, we demonstrate DCA's substantial improvement over state-of-the-art diffusion-based approaches in predicting protein function from molecular interaction networks. Moreover, our DCA framework can integrate multiple networks from heterogeneous sources, consisting of genomic information, biochemical experiments and other resources, to even further improve function prediction. Yet another layer of performance gain is achieved by integrating the DCA framework with support vector machines that take our node vector representations as features. Overall, our DCA framework provides a novel representation of nodes in a network that can be used as a plug-in architecture to other machine learning algorithms to decipher topological properties of and obtain novel insights into interactomes. Comment: RECOMB 201
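
The "diffusion model" that such a framework plugs in is often a random walk with restart; whether DCA uses exactly this variant is not stated in the abstract, so treat the choice as an assumption. The sketch below computes only the diffusion state of one start node by power iteration and does not attempt the dimensionality reduction step. The function name, the `restart` default of 0.5, and the convergence tolerance are illustrative choices.

```python
def random_walk_with_restart(adj, start, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visit probabilities of a walker that, at every step, jumps
    back to `start` with probability `restart` and otherwise moves to a
    neighbour chosen in proportion to edge weight.

    adj is a square adjacency (or weight) matrix given as a list of rows.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    if any(d == 0 for d in degree):
        raise ValueError("every node needs at least one edge")

    state = [0.0] * n
    state[start] = 1.0
    for _ in range(max_iter):
        nxt = [0.0] * n
        for i in range(n):
            if state[i] == 0.0:
                continue
            # Probability mass leaving node i along its edges.
            share = (1.0 - restart) * state[i] / degree[i]
            for j in range(n):
                if adj[i][j]:
                    nxt[j] += share * adj[i][j]
        nxt[start] += restart  # mass returned by the restart jump
        if max(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    return state


if __name__ == "__main__":
    path = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
    # Converges to 7/12, 1/3 and 1/12 for nodes 0, 1 and 2.
    print(random_walk_with_restart(path, start=0))
```

Running the function once per node gives one diffusion vector per node; these vectors are the kind of input a low-dimensional embedding step would then compress.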

    Herb Target Prediction Based on Representation Learning of Symptom related Heterogeneous Network.

    Traditional Chinese Medicine (TCM) has received increasing attention as a complementary approach or alternative to modern medicine. However, experimental methods for identifying novel targets of TCM herbs rely heavily on the currently available herb-compound-target relationships. In this work, we present an Herb-Target Interaction Network (HTINet) approach, a novel network integration pipeline for herb-target prediction that relies mainly on symptom-related associations. HTINet focuses on capturing low-dimensional feature vectors for both herbs and proteins by network embedding, which incorporates the topological properties of nodes across a multi-layered heterogeneous network, and then performs supervised learning based on these low-dimensional feature representations. HTINet obtains a performance improvement over a well-established random walk based herb-target prediction method. Furthermore, we have manually validated several predicted herb-target interactions from independent literature. These results indicate that HTINet can be used to integrate heterogeneous information to predict novel herb-target interactions.