15,354 research outputs found

Universality in protein residue networks

Author: Estrada Ernesto
New Professor's Fund - University of Strathclyde (Funder)
Publication venue: 'Elsevier BV'
Publication date: 03/03/2010

Residue networks representing 595 nonhomologous proteins are studied. These networks exhibit universal topological characteristics as they belong to the topological class of modular networks formed by several highly interconnected clusters separated by topological cavities. Some networks tend to deviate from this universality. These networks represent small proteins with fewer than 200 residues. We explain such differences in terms of the domain structure of these proteins. On the other hand, we find that the topological cavities characterizing protein residue networks match very well with protein binding sites. We then investigate the effect of the cutoff value used in building the residue network. For small cutoff values, less than 5 Å, the cavities found are very large, corresponding almost to the whole protein surface. On the contrary, for large cutoff values, more than 10.0 Å, only very large cavities are detected and the networks look very homogeneous. These findings are useful for practical purposes as well as for identifying "protein-like" complex networks. Finally, we show that the main topological class of residue networks is not reproduced by random networks growing according to the Erdős-Rényi model or the preferential attachment method of Barabási-Albert. However, the Watts-Strogatz (WS) model reproduces very well the topological class as well as other topological properties of residue networks. We propose here a more biologically appealing modification of the WS model to describe residue networks.

University of Strathclyde Institutional Repository
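
The cutoff-based construction mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each residue is represented by a single 3D point (for example one atom per residue) and that two residues are linked when their points lie within the cutoff distance; the abstract does not state which atoms or which default cutoff the author used, so the function name `residue_network`, the default of 7.0 Å, and the toy coordinates are all hypothetical.

```python
import math


def residue_network(coords, cutoff=7.0):
    """Build an undirected residue network as a dict of adjacency sets.

    coords: list of (x, y, z) points, one per residue (assumed representation).
    cutoff: distance threshold in angstroms (assumed default, not from the paper).
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            # Link two residues when their representative points are close enough.
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def edge_count(adj):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(neighbours) for neighbours in adj.values()) // 2


# Hypothetical toy chain (not real protein data): six points spaced 3.8 Å apart.
toy = [(3.8 * k, 0.0, 0.0) for k in range(6)]
for cutoff in (4.0, 8.0, 12.0):
    print(cutoff, edge_count(residue_network(toy, cutoff)))
```

Raising the cutoff adds edges, so the network becomes denser and more homogeneous, which is the qualitative trend the abstract reports for large cutoff values.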

ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space

Author: Dhifli Wajdi
Diallo Abdoulaye Baniré
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/01/2016

Studying the function of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, determining the function of a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the determination of protein functions in living cells. In this paper, we propose ProtNN, a novel approach for protein function prediction. Given an unannotated protein structure and a set of annotated proteins, ProtNN finds the nearest neighbor annotated structures based on protein-graph pairwise similarities. Given a query protein, ProtNN finds the nearest neighbor reference proteins based on a graph representation model and a pairwise similarity between vector embeddings of both query and reference protein-graphs in structural and topological spaces. ProtNN assigns to the query protein the function with the highest number of votes across the set of k nearest neighbor reference proteins, where k is a user-defined parameter. Experimental evaluation demonstrates that ProtNN is able to accurately classify several datasets in an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN is able to scale up to a whole PDB dataset in a single-process mode with no parallelization, with a speedup on the order of thousands of times over state-of-the-art approaches.

arXiv.org e-Print Archive

Springer - Publisher Connector
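
The voting step described in the ProtNN abstract above (assign the function with the most votes among the k nearest reference proteins) can be sketched as follows. This is not the authors' implementation: it assumes the protein graphs have already been embedded as numeric vectors, and it uses Euclidean distance as a stand-in for the pairwise similarity, which the abstract does not specify. The function name `knn_predict`, the labels, and the toy vectors are hypothetical.

```python
import math
from collections import Counter


def knn_predict(query, references, k=3):
    """Return the majority label among the k references closest to `query`.

    references: list of (embedding_vector, label) pairs.
    Distance is Euclidean (an assumption made for this sketch).
    """
    nearest = sorted(references, key=lambda ref: math.dist(query, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, Counter.most_common keeps the label encountered first,
    # i.e. the one belonging to the closer neighbour.
    return votes.most_common(1)[0][0]


# Hypothetical 2-D embeddings with made-up function labels.
refs = [
    ((0.1, 0.2), "kinase"),
    ((0.0, 0.3), "kinase"),
    ((0.2, 0.1), "kinase"),
    ((0.9, 0.8), "transporter"),
    ((1.0, 1.0), "transporter"),
]
print(knn_predict((0.15, 0.15), refs, k=3))
```

Here k plays the role of the user-defined parameter mentioned in the abstract; a brute-force sort is used for clarity rather than speed.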

Identifying interactions in the time and frequency domains in local and global networks : a Granger causality approach

Author: Feng Jianfeng
Guo Shuixia
Ladroue Christophe
Zou Cunlu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Background: Reverse-engineering approaches such as Bayesian network inference, ordinary differential equations (ODEs) and information theory are widely applied to deriving causal relationships among different elements such as genes, proteins, metabolites, neurons, brain areas and so on, based upon multi-dimensional spatial and temporal data. There are several well-established reverse-engineering approaches to explore causal relationships in a dynamic network, such as ordinary differential equations (ODE), Bayesian networks, information theory and Granger Causality. Results: Here we focused on Granger causality both in the time and frequency domain and in local and global networks, and applied our approach to experimental data (genes and proteins). For a small gene network, Granger causality outperformed all the other three approaches mentioned above. A global protein network of 812 proteins was reconstructed, using a novel approach. The obtained results fitted well with known experimental findings and predicted many experimentally testable results. In addition to interactions in the time domain, interactions in the frequency domain were also recovered. Conclusions: The results on the proteomic data and gene data confirm that Granger causality is a simple and accurate approach to recover the network structure. Our approach is general and can be easily applied to other types of temporal data.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository
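
As a rough illustration of the time-domain Granger causality that the abstract above relies on, the sketch below handles only the simplest case: two zero-mean series, a single lag, and no intercept. It compares how well x is predicted from its own past against how well it is predicted once the past of y is added. The paper's frequency-domain and global-network methods are not reproduced here; the function name `granger_order1` and the simulated data are hypothetical.

```python
import math
import random


def granger_order1(x, y):
    """Granger causality from y to x with one lag and no intercept.

    Returns ln(RSS_restricted / RSS_full), where the restricted model is
    x_t = a * x_{t-1} and the full model is x_t = a * x_{t-1} + b * y_{t-1}.
    """
    xt, xp, yp = x[1:], x[:-1], y[:-1]
    sxx = sum(a * a for a in xp)
    syy = sum(b * b for b in yp)
    sxy = sum(a * b for a, b in zip(xp, yp))
    sxt = sum(a * t for a, t in zip(xp, xt))
    syt = sum(b * t for b, t in zip(yp, xt))

    # Restricted model: least-squares slope on x's own past only.
    a_r = sxt / sxx
    rss_r = sum((t - a_r * a) ** 2 for t, a in zip(xt, xp))

    # Full model: solve the 2x2 normal equations by Cramer's rule.
    det = sxx * syy - sxy * sxy
    a_f = (sxt * syy - syt * sxy) / det
    b_f = (syt * sxx - sxt * sxy) / det
    rss_f = sum((t - a_f * a - b_f * b) ** 2 for t, a, b in zip(xt, xp, yp))

    return math.log(rss_r / rss_f)


# Simulated data (hypothetical): y is white noise and drives x with one step of delay.
rng = random.Random(0)
y = [rng.gauss(0, 1) for _ in range(500)]
x = [0.0]
for t in range(1, 500):
    x.append(0.5 * x[t - 1] + 0.8 * y[t - 1] + 0.1 * rng.gauss(0, 1))

gc_y_to_x = granger_order1(x, y)
gc_x_to_y = granger_order1(y, x)
print(gc_y_to_x, gc_x_to_y)
```

Because y drives x in the simulation but not the reverse, the first value should be clearly larger than the second, which should stay close to zero.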

Transforming Graph Representations for Statistical Relational Learning

Author: Aha David W.
McDowell Luke K.
Neville Jennifer
Rossi Ryan A.
Publication date: 01/01/2012

Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

arXiv.org e-Print Archive

CiteSeerX

Algorithmic and Statistical Perspectives on Large-Scale Data Analysis

Author: Mahoney Michael W.
Publication date: 08/10/2010

In recent years, ideas from statistics and scientific computing have begun to interact in increasingly sophisticated and fruitful ways with ideas from computer science and the theory of algorithms to aid in the development of improved worst-case algorithms that are useful for large-scale scientific and Internet data analysis problems. In this chapter, I will describe two recent examples, one having to do with selecting good columns or features from a (DNA Single Nucleotide Polymorphism) data matrix, and the other having to do with selecting good clusters or communities from a data graph (representing a social or information network), that drew on ideas from both areas and that may serve as a model for exploiting complementary algorithmic and statistical perspectives in order to solve applied large-scale data analysis problems. Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors, "Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201

arXiv.org e-Print Archive

CiteSeerX

Discovering bipartite substructure in directed networks

Author: Higham Desmond J.
Taylor Alan
Vass J. Keith
Publication venue: 'Wiley'
Publication date: 01/01/2011

Bipartivity is an important network concept that can be applied to nodes, edges and communities. Here we focus on directed networks and look for subnetworks made up of two distinct groups of nodes, connected by “one-way” links. We show that a spectral approach can be used to find hidden substructure of this form. Theoretical support is given for the idealised case where there is limited overlap between subnetworks. Numerical experiments show that the approach is robust to spurious and missing edges. A key application of this work is in the analysis of high-throughput gene expression data, and we give an example where a biologically meaningful directed bipartite subnetwork is found from a cancer microarray dataset.

University of Strathclyde Institutional Repository

Edinburgh Research Explorer

Beyond element-wise interactions: identifying complex interactions in biological processes

Author: Christophe Ladroue
Shuixia Guo
Keith Kendrick
Jianfeng Feng
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 22/09/2009

Background: Biological processes typically involve the interactions of a number of elements (genes, cells) acting on each other. Such processes are often modelled as networks whose nodes are the elements in question and whose edges are pairwise relations between them (transcription, inhibition). But more often than not, elements actually work cooperatively or competitively to achieve a task. Or an element can act on the interaction between two others, as in the case of an enzyme controlling a reaction rate. We call these types of interaction “complex” and propose ways to identify them from time-series observations. Methodology: We use Granger Causality, a measure of the interaction between two signals, to characterize the influence of an enzyme on a reaction rate. We extend its traditional formulation to the case of multi-dimensional signals in order to capture group interactions, and not only element interactions. Our method is extensively tested on simulated data and applied to three biological datasets: microarray data of the Saccharomyces cerevisiae yeast, local field potential recordings of two brain areas and a metabolic reaction. Conclusions: Our results demonstrate that complex Granger causality can reveal new types of relation between signals and is particularly suited to biological data. Our approach raises some fundamental issues of the systems biology approach since finding all complex causalities (interactions) is an NP-hard problem.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

Defining and Evaluating Network Communities based on Ground-truth

Author: Leskovec Jure
Yang Jaewon
Publication date: 06/11/2012

Nodes in real-world networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task mainly due to a plethora of definitions of a community, intractability of algorithms, issues with evaluation and the lack of a reliable gold-standard ground-truth. In this paper we study a set of 230 large real-world social, collaboration and information networks where nodes explicitly state their group memberships. For example, in social networks nodes explicitly join various interest-based social groups. We use such groups to define a reliable and robust notion of ground-truth communities. We then propose a methodology which allows us to compare and quantitatively evaluate how different structural definitions of network communities correspond to ground-truth communities. We choose 13 commonly used structural definitions of network communities and examine their sensitivity, robustness and performance in identifying the ground-truth. We show that the 13 structural definitions are heavily correlated and naturally group into four classes. We find that two of these definitions, Conductance and Triad-participation-ratio, consistently give the best performance in identifying ground-truth communities. We also investigate a task of detecting communities given a single seed node. We extend the local spectral clustering algorithm into a heuristic parameter-free community detection method that easily scales to networks with more than a hundred million nodes. The proposed method achieves 30% relative improvement over current local clustering methods. Comment: Proceedings of 2012 IEEE International Conference on Data Mining (ICDM), 201

arXiv.org e-Print Archive

CiteSeerX
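
The two best-performing definitions named in the abstract above, Conductance and Triad-participation-ratio, can be sketched on a small undirected graph. The sketch uses the usual formulations: conductance is the number of edges leaving the node set S divided by that number plus twice the number of edges inside S (lower is better), and the triad participation ratio is the fraction of nodes in S that belong to at least one triangle lying entirely within S. The function names and the toy graph are hypothetical.

```python
def conductance(adj, S):
    """Boundary edges divided by (2 * internal edges + boundary edges)."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    # Each internal edge is seen from both endpoints, so this equals 2 * m_S.
    twice_internal = sum(1 for u in S for v in adj[u] if v in S)
    return cut / (twice_internal + cut)


def triad_participation_ratio(adj, S):
    """Fraction of nodes in S that sit in a triangle made only of nodes in S."""
    S = set(S)

    def in_triad(u):
        nbrs = [v for v in adj[u] if v in S]
        return any(w in adj[v] for i, v in enumerate(nbrs) for w in nbrs[i + 1:])

    return sum(in_triad(u) for u in S) / len(S)


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(conductance(adj, {0, 1, 2}), triad_participation_ratio(adj, {0, 1, 2}))
print(conductance(adj, {2, 3}), triad_participation_ratio(adj, {2, 3}))
```

The triangle {0, 1, 2} scores well on both measures, while the bridging pair {2, 3} has high conductance and no internal triads. `conductance` divides by zero for a set with no incident edges, which this sketch does not guard against.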

Graph Theory and Networks in Biology

Author: Mason Oliver
Verwoerd Mark
Publication date: 06/04/2006

In this paper, we present a survey of the use of graph theoretical techniques in Biology. In particular, we discuss recent work on identifying and modelling the structure of bio-molecular networks, as well as the application of centrality measures to interaction networks and research on the hierarchical structure of such networks and network motifs. Work on the link between structural network properties and dynamics is also described, with emphasis on synchronization and disease propagation. Comment: 52 pages, 5 figures, Survey Pape

arXiv.org e-Print Archive

CiteSeerX

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive