Scale-Free Networks in Molecular Biology: Algorithms and Random Walks Analyses
In this research, I focus on I) the mean field analysis of algorithms for scale-free networks in molecular biology and II) the analysis of biological networks using random walks and related algorithms.
I: Many systems in nature and society are described by means of complex networks. Research indicates that these complex networks exhibit scale-free properties. Studying the organizing principles of scale-free networks has significant implications in different fields including developing better drugs, defending the internet from hackers, halting the spread of deadly epidemics, developing marketing strategies, etc. The sampling of scale-free networks in molecular biology is usually achieved by growing networks from a seed using recursive algorithms with elementary moves which include the addition and deletion of nodes and bonds. These algorithms include the Barabasi-Albert algorithm. Later algorithms, such as the Duplication-Divergence algorithm, the Sole algorithm and the iSite algorithm, were inspired by biological processes underlying the evolution of protein networks, and the networks they produce differ essentially from networks grown by the Barabasi-Albert algorithm. The mean field analysis of these algorithms is reconsidered, and extended to variant and modified implementations of the algorithms.
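As a rough illustration of the growth-from-a-seed idea described above, the following is a minimal sketch of preferential attachment in the spirit of the Barabasi-Albert algorithm. It covers only the node- and bond-addition moves, not deletion, and it is not the Duplication-Divergence, Sole or iSite algorithm. The function name, the complete-graph seed and the parameter `m` (bonds added per new node) are assumptions made for this example.

```python
import random

def grow_preferential_attachment(n_nodes, m, seed=None):
    """Grow an undirected graph node by node; return its set of edges.

    Illustrative sketch only: each new node attaches to m distinct
    existing nodes chosen with probability proportional to their degree.
    Assumes n_nodes >= m + 1.
    """
    rng = random.Random(seed)
    # Assumed seed graph: a complete graph on m + 1 nodes.
    edges = {(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)}
    # Each node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` is a degree-proportional draw over nodes.
    stubs = [v for e in sorted(edges) for v in e]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.add((t, new))
            stubs.extend((t, new))
    return edges

edges = grow_preferential_attachment(50, 2, seed=1)
print(len(edges))
```

Sampling from the list of edge endpoints is a common way to get degree-proportional choices without recomputing degrees at every step.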
II: The second part of this research focuses on analyzing biological networks using random walks and related algorithms. I use different algorithms with the goal of finding highly connected hubs and clusters of proteins which are closely related to one another. This is done by building protein-protein interaction networks and miRNA-gene interaction networks, which are then subjected to two algorithms. The first is the random walk with resistance algorithm. As an alternative, I propose solving the lattice Laplacian on a network as a method to discover clusters of biologically related genes. These approaches seek ways of solving complex pathway membership problems in protein interaction databases. The clusters obtained provide more biological insight than a process of local pairwise comparison between interacting proteins. They may also predict new members of functional pathways or clusters. Underlying these algorithms are simulated biased random walks on the network, which determine the membership of proteins in given clusters.
Information Flow in Interaction Networks
Interaction networks, consisting of agents linked by their interactions, are
ubiquitous across many disciplines of modern science. Many methods of analysis
of interaction networks have been proposed, mainly concentrating on node degree
distribution or aiming to discover clusters of agents that are very strongly
connected between themselves. These methods are principally based on
graph-theory or machine learning.
We present a mathematically simple formalism for modelling context-specific
information propagation in interaction networks based on random walks. The
context is provided by selection of sources and destinations of information and
by use of potential functions that direct the flow towards the destinations. We
also use the concept of dissipation to model the aging of information as it
diffuses from its source.
Using examples from yeast protein-protein interaction networks and some of
the histone acetyltransferases involved in control of transcription, we
demonstrate the utility of the concepts and the mathematical constructs
introduced in this paper.
Comment: 30 pages, 5 figures. This paper was published in 2007 in Journal of
Computational Biology. The version posted here does not include post
peer-review changes.
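The idea of information spreading from a source by random walk, aging through dissipation, and stopping at chosen destinations can be sketched as below. This is a simplified toy model, not the formalism of the paper: it uses an unbiased walk (no potential functions), a single constant `damping` factor per step, and a fixed number of iterations. All names are hypothetical.

```python
def damped_flow(neighbors, source, sinks, damping=0.9, steps=200):
    """Spread unit mass from `source` over an undirected graph.

    neighbors: dict mapping each node to a list of adjacent nodes
               (every non-sink node is assumed to have a neighbor).
    damping:   fraction of the mass that survives each step, a stand-in
               for the dissipation described in the abstract.
    Returns (visits per node, mass absorbed at each sink).
    """
    mass = {source: 1.0}
    visits = {v: 0.0 for v in neighbors}
    absorbed = {s: 0.0 for s in sinks}
    for _ in range(steps):
        nxt = {}
        for node, amount in mass.items():
            visits[node] += amount
            if node in absorbed:
                # Destination reached: the information stops here.
                absorbed[node] += amount
                continue
            share = damping * amount / len(neighbors[node])
            for nb in neighbors[node]:
                nxt[nb] = nxt.get(nb, 0.0) + share
        mass = nxt
    return visits, absorbed

path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
visits, absorbed = damped_flow(path, "a", ["c"], damping=0.5)
print(absorbed["c"])
```

Nodes with many visits lie on routes the information actually takes from the source to the destinations, which is what gives the analysis its context.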
Communicability betweenness in complex networks
Betweenness measures provide quantitative tools to pick out fine details from the massive amount of interaction data that is available from large complex networks. They allow us to study the extent to which a node takes part when information is passed around the network. Nodes with high betweenness may be regarded as key players that have a highly active role. At one extreme, betweenness has been defined by considering information passing only through the shortest paths between pairs of nodes. At the other extreme, an alternative type of betweenness has been defined by considering all possible walks of any length. In this work, we propose a betweenness measure that lies between these two opposing viewpoints. We allow information to pass through all possible routes, but introduce a scaling so that longer walks carry less importance. This new definition shares a similar philosophy to that of communicability for pairs of nodes in a network, which was introduced by Estrada and Hatano [E. Estrada, N. Hatano, Phys. Rev. E 77 (2008) 036111]. Having defined this new communicability betweenness measure, we show that it can be characterized neatly in terms of the exponential of the adjacency matrix. We also show that this measure is closely related to a Fréchet derivative of the matrix exponential. This allows us to conclude that it also describes network sensitivity when the edges of a given node are subject to infinitesimally small perturbations. Using illustrative synthetic and real-life networks, we show that the new betweenness measure behaves differently to existing versions, and in particular we show that it recovers meaningful biological information from a protein-protein interaction network.
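The characterization in terms of the matrix exponential can be sketched as follows. The abstract does not state the formula, so this assumes the commonly cited form: for node r, average over pairs p, q (both different from r) the relative drop in (e^A)_pq when all edges of r are removed, normalized by (n-1)^2 - (n-1). The helper names and the truncated Taylor series are choices made for this example and suit only small, connected graphs.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated Taylor series for exp(a); adequate for small matrices."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def communicability_betweenness(adj):
    """adj: symmetric 0/1 adjacency matrix of a connected graph, n >= 3."""
    n = len(adj)
    full = mat_exp(adj)
    norm = (n - 1) ** 2 - (n - 1)
    scores = []
    for r in range(n):
        # Remove every edge incident to r, leaving r isolated.
        cut = [[0 if r in (i, j) else adj[i][j] for j in range(n)]
               for i in range(n)]
        reduced = mat_exp(cut)
        total = 0.0
        for p in range(n):
            for q in range(n):
                if p != q and p != r and q != r:
                    total += (full[p][q] - reduced[p][q]) / full[p][q]
        scores.append(total / norm)
    return scores

# Path graph 0 - 1 - 2: the middle node carries every walk between the ends.
print(communicability_betweenness([[0, 1, 0], [1, 0, 1], [0, 1, 0]]))
```

Because walks of length k are weighted by 1/k!, long detours still count but contribute far less than short routes, which is the scaling the abstract refers to.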
Structural patterns in complex networks through spectral analysis
The study of some structural properties of networks is introduced from a graph spectral perspective. First, subgraph centrality of nodes is defined and used to classify essential proteins in a proteomic map. This index is then used to produce a method that allows the identification of superhomogeneous networks. At the same time, this method classifies non-homogeneous networks into three universal classes of structure. We give examples of these classes from networks in different real-world scenarios. Finally, a communicability function is studied and shown to be an alternative for defining communities in complex networks. Using this approach, a community is unambiguously defined, and an algorithm for its identification is proposed and exemplified in a real-world network.
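For orientation, the two spectral quantities named above are usually written as follows. The abstract itself gives no formulas, so this assumes the standard definitions, with A the adjacency matrix and λ_j, φ_j its eigenvalues and orthonormal eigenvectors.

```latex
% Subgraph centrality of node i: weighted count of closed walks at i
SC(i) = \sum_{k=0}^{\infty} \frac{(A^k)_{ii}}{k!}
      = \left(e^{A}\right)_{ii}
      = \sum_{j=1}^{n} \varphi_j(i)^2 \, e^{\lambda_j}

% Communicability between nodes p and q: weighted count of walks from p to q
G_{pq} = \sum_{k=0}^{\infty} \frac{(A^k)_{pq}}{k!}
       = \left(e^{A}\right)_{pq}
       = \sum_{j=1}^{n} \varphi_j(p) \, \varphi_j(q) \, e^{\lambda_j}
```

Both are entries of the same matrix exponential: the diagonal gives subgraph centrality and the off-diagonal entries give communicability.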
Diffusion Component Analysis: Unraveling Functional Topology in Biological Networks
Complex biological systems have been successfully modeled by biochemical and
genetic interaction networks, typically gathered from high-throughput (HTP)
data. These networks can be used to infer functional relationships between
genes or proteins. Using the intuition that the topological role of a gene in a
network relates to its biological function, local or diffusion based
"guilt-by-association" and graph-theoretic methods have had success in
inferring gene functions. Here we seek to improve function prediction by
integrating diffusion-based methods with a novel dimensionality reduction
technique to overcome the incomplete and noisy nature of network data. In this
paper, we introduce diffusion component analysis (DCA), a framework that plugs
in a diffusion model and learns a low-dimensional vector representation of each
node to encode the topological properties of a network. As a proof of concept,
we demonstrate DCA's substantial improvement over state-of-the-art
diffusion-based approaches in predicting protein function from molecular
interaction networks. Moreover, our DCA framework can integrate multiple
networks from heterogeneous sources, consisting of genomic information,
biochemical experiments and other resources, to even further improve function
prediction. Yet another layer of performance gain is achieved by integrating
the DCA framework with support vector machines that take our node vector
representations as features. Overall, our DCA framework provides a novel
representation of nodes in a network that can be used as a plug-in architecture
to other machine learning algorithms to decipher topological properties of and
obtain novel insights into interactomes.
Comment: RECOMB 201
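The diffusion model that such a framework plugs in can be illustrated with a random walk with restart, sketched below. This is an assumption about one suitable diffusion model, not a reproduction of DCA: it computes only the per-node diffusion state, and the low-dimensional embedding step is not shown. The function name and the `restart` and `iters` parameters are hypothetical.

```python
def diffusion_state(neighbors, start, restart=0.5, iters=100):
    """Approximate the stationary distribution of a random walk with restart.

    neighbors: dict mapping each node to a non-empty list of adjacent nodes.
    At every step the walker returns to `start` with probability `restart`,
    otherwise it moves to a uniformly chosen neighbor.
    """
    nodes = sorted(neighbors)
    state = {v: 0.0 for v in nodes}
    state[start] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for v in nodes:
            share = (1.0 - restart) * state[v] / len(neighbors[v])
            for nb in neighbors[v]:
                nxt[nb] += share
        nxt[start] += restart
        state = nxt
    return state

path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(diffusion_state(path, "a"))
```

Running this once per node yields one probability vector per node; a dimensionality reduction step would then compress those vectors into the short feature vectors passed to a classifier.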
Herb Target Prediction Based on Representation Learning of Symptom related Heterogeneous Network.
Traditional Chinese Medicine (TCM) has received increasing attention as a complementary approach or alternative to modern medicine. However, experimental methods for identifying novel targets of TCM herbs rely heavily on the currently available herb-compound-target relationships. In this work, we present the Herb-Target Interaction Network (HTINet) approach, a novel network integration pipeline for herb-target prediction that relies mainly on symptom-related associations. HTINet focuses on capturing low-dimensional feature vectors for both herbs and proteins by network embedding, which incorporate the topological properties of nodes across a multi-layered heterogeneous network, and then performs supervised learning based on these low-dimensional feature representations. HTINet improves performance over a well-established random walk based herb-target prediction method. Furthermore, we have manually validated several predicted herb-target interactions against the independent literature. These results indicate that HTINet can be used to integrate heterogeneous information to predict novel herb-target interactions.