
Algorithms for the analysis of protein interaction networks

Author: Singh Rohit, Ph.D. Massachusetts Institute of Technology
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 107-117).

In the decade since the human genome project, a major research trend in biology has been towards understanding the cell as a system. This interest has stemmed partly from a deeper appreciation of how important it is to understand the emergent properties of cellular systems (e.g., they seem to be the key to understanding diseases like cancer). It has also been enabled by new high-throughput techniques that have allowed us to collect new types of data at the whole-genome scale. We focus on one sub-domain of systems biology: the understanding of protein interactions. Such understanding is valuable: interactions between proteins are fundamental to many cellular processes. Over the last decade, high-throughput experimental techniques have allowed us to collect a large amount of protein-protein interaction (PPI) data for many species. A popular abstraction for representing this data is the protein interaction network: each node of the network represents a protein and an edge between two nodes represents a physical interaction between the two corresponding proteins. This abstraction has proven to be a powerful tool for understanding the systems aspects of protein interaction. We present some algorithms for the augmentation, cleanup and analysis of such protein interaction networks:

1. In many species, the coverage of known PPI data remains partial. Given two protein sequences, we describe an algorithm to predict if two proteins physically interact, using logistic regression and insights from structural biology. We also describe how our predictions may be further improved by combining with functional-genomic data.
2. We study systematic false positives in a popular experimental protocol, the Yeast 2-Hybrid method. Here, some "promiscuous" proteins may lead to many false positives. We describe a Bayesian approach to modeling and adjusting for this error.
3. Comparative analysis of PPI networks across species can provide valuable insights. We describe IsoRank, an algorithm for global network alignment of multiple PPI networks. The algorithm first constructs an eigenvalue problem that encapsulates the network and sequence similarity constraints. The solution of the problem describes a k-partite graph that is further processed to find the alignment.
4. For a given signaling network, we describe an algorithm that combines RNA-interference data with PPI data to produce hypotheses about the structure of the signaling network. Our algorithm constructs a multi-commodity flow problem that expresses the constraints described by the data and finds a sparse solution to it.

by Rohit Singh. Ph.D.
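The eigenvalue problem behind IsoRank is not spelled out in the abstract. The following is a minimal sketch of the two-network case as it is commonly described: the score of a node pair is fed by the scores of neighbouring pairs, mixed with a normalised sequence-similarity matrix, and solved by power iteration. The function name, the mixing weight `alpha`, and the toy inputs are assumptions for illustration, and the later step that extracts an alignment from the scores is omitted.

```python
import numpy as np

def isorank_scores(adj1, adj2, seq_sim, alpha=0.6, iters=200, tol=1e-10):
    """Power iteration for pairwise IsoRank-style scores (illustrative sketch).

    adj1, adj2: symmetric 0/1 adjacency matrices of the two networks.
    seq_sim:    non-negative (n1 x n2) sequence-similarity matrix.
    alpha:      assumed weight of network topology versus sequence similarity.
    """
    adj1 = np.asarray(adj1, dtype=float)
    adj2 = np.asarray(adj2, dtype=float)
    deg1 = adj1.sum(axis=1)
    deg2 = adj2.sum(axis=1)
    # w1[i, u] = adj1[i, u] / deg1[u]: neighbour u splits its score evenly
    # among its own neighbours (isolated nodes are left with zero weight).
    w1 = adj1 / np.where(deg1 > 0, deg1, 1.0)
    w2 = adj2 / np.where(deg2 > 0, deg2, 1.0)
    e = np.asarray(seq_sim, dtype=float)
    e = e / e.sum()
    r = np.full_like(e, 1.0 / e.size)
    for _ in range(iters):
        # (w1 @ r @ w2.T)[i, j] sums r[u, v] / (deg1[u] * deg2[v]) over
        # neighbours u of i and v of j.
        new = alpha * (w1 @ r @ w2.T) + (1.0 - alpha) * e
        new /= new.sum()
        converged = np.abs(new - r).max() < tol
        r = new
        if converged:
            break
    return r

# Toy usage: align a 3-node path graph with itself, uniform sequence scores.
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
scores = isorank_scores(path, path, np.ones((3, 3)))
```

In this toy case the centre-to-centre pair receives the highest score, which matches the intuition that nodes with similar neighbourhoods should be aligned.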

DSpace@MIT

PCSF: An R-package for network-based interpretation of high-throughput data

Author: Akhmedov Murodzhon
Bertoni Francesco
Chong Renan Escalante
Fraenkel Ernest
Kedaigle Amanda
Kwee Ivo
Montemanni Roberto
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2017

With recent technological developments, a vast amount of high-throughput data has been profiled to understand the mechanisms of complex diseases. The current bioinformatics challenge is to interpret the data and the underlying biology, where efficient algorithms for analyzing heterogeneous high-throughput data using biological networks are becoming increasingly valuable. In this paper, we propose a software package based on the Prize-collecting Steiner Forest graph optimization approach. The PCSF package performs fast and user-friendly network analysis of high-throughput data by mapping the data onto biological networks such as protein-protein interaction, gene-gene interaction or any other correlation- or coexpression-based networks. Using the interaction networks as a template, it determines high-confidence subnetworks relevant to the data, which potentially leads to predictions of functional units. It also interactively visualizes the resulting subnetwork with functional enrichment analysis.

DSpace@MIT

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

FigShare

Community Detection by $L_0$-penalized Graph Laplacian

Author: Chen Chong
Lin Nan
Xi Ruibin
Publication date: 30/06/2017

Community detection in network analysis aims at partitioning nodes in a network into $K$ disjoint communities. Most currently available algorithms assume that $K$ is known, but choosing a correct $K$ is generally very difficult for real networks. In addition, many real networks contain outlier nodes not belonging to any community, but currently very few algorithms can handle networks with outliers. In this paper, we propose a novel model-free tightness criterion and an efficient algorithm to maximize this criterion for community detection. This tightness criterion is closely related to the graph Laplacian with $L_0$ penalty. Unlike most community detection methods, our method does not require a known $K$ and can properly detect communities in networks with outliers. Both theoretical and numerical properties of the method are analyzed. The theoretical result guarantees that, under the degree corrected stochastic block model, even for networks with outliers, the maximizer of the tightness criterion can extract communities with small misclassification rates even when the number of communities grows to infinity as the network size grows. A simulation study shows that the proposed method can recover true communities more accurately than other methods. Applications to college football data and yeast protein-protein interaction data also reveal that the proposed method performs significantly better.

Comment: 40 pages, 15 Postscript figures

arXiv.org e-Print Archive

Crossref

DPPIN: A Biological Dataset of Dynamic Protein-Protein Interaction Networks

Author: Fu Dongqi
He Jingrui
Publication date: 05/07/2021

Many network representation learning algorithms and downstream network mining tasks now address dynamic or temporal networks, which suit real-world complex scenarios better because they model evolving patterns and temporal dependencies between node interactions. Moreover, representing and mining temporal networks has a wide range of applications, such as fraud detection, social network analysis, and drug discovery. To contribute to the network representation learning and network mining research community, in this paper, we generate a new biological dataset of dynamic protein-protein interaction networks (i.e., DPPIN), which consists of twelve dynamic protein-level interaction networks of yeast cells at different scales. We first introduce the generation process of DPPIN. To demonstrate the value of our published dataset DPPIN, we then list the potential applications that would benefit from it. Furthermore, we design dynamic local clustering, dynamic spectral clustering, dynamic subgraph matching, dynamic node classification, and dynamic graph classification experiments, where DPPIN indicates future research opportunities for some tasks by presenting challenges to state-of-the-art baseline algorithms. Finally, we identify future directions for improving the utility of this dataset and welcome input from the community. All resources of this work are deployed and publicly available at https://github.com/DongqiFu/DPPIN

arXiv.org e-Print Archive

Semantic integration to identify overlapping functional modules in protein interaction networks

Author: Young-Rae Cho
Woochang Hwang
Murali Ramanathan
Aidong Zhang
Publication venue: BioMed Central
Publication date: 01/07/2007

Background: The systematic analysis of protein-protein interactions can enable a better understanding of cellular organization, processes and functions. Functional modules can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with the data from other sources can be leveraged for improving the effectiveness of functional module detection algorithms.

Results: We have developed novel metrics, called semantic similarity and semantic interactivity, which use Gene Ontology (GO) annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability values to each interaction as a weight. We presented a flow-based modularization algorithm to efficiently identify overlapping modules in the weighted interaction networks. The experimental results show that the semantic similarity and semantic interactivity of interacting pairs were positively correlated with functional co-occurrence. The effectiveness of the algorithm for identifying modules was evaluated using functional categories from the MIPS database. We demonstrated that our algorithm had higher accuracy compared to other competing approaches.

Conclusion: The integration of protein interaction networks with GO annotation data and the capability of detecting overlapping modules substantially improve the accuracy of module identification.
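The conversion of an interaction network into a weighted graph can be illustrated with a small sketch. The abstract does not define the semantic similarity or semantic interactivity metrics, so the code below substitutes a plain Jaccard overlap of annotation sets as a stand-in reliability score; the function name, the protein labels, and the annotation terms are all made-up placeholders.

```python
def weight_interactions(edges, annotations):
    """Attach a reliability weight to each interaction (illustrative sketch).

    edges:       iterable of (protein_a, protein_b) pairs.
    annotations: dict mapping a protein to its set of annotation terms.
    The weight is the Jaccard overlap of the two annotation sets, used here
    only as a stand-in for the paper's GO-based metrics.
    """
    weighted = {}
    for a, b in edges:
        terms_a = annotations.get(a, set())
        terms_b = annotations.get(b, set())
        union = terms_a | terms_b
        # Pairs with no annotations at all get the lowest reliability.
        weighted[(a, b)] = len(terms_a & terms_b) / len(union) if union else 0.0
    return weighted

# Toy usage with placeholder proteins and terms.
toy_edges = [("P1", "P2"), ("P1", "P3")]
toy_terms = {"P1": {"t1", "t2"}, "P2": {"t2", "t3"}, "P3": set()}
toy_weights = weight_interactions(toy_edges, toy_terms)
```

A module-detection step could then consume these weights, giving poorly supported interactions less influence.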

Crossref

Directory of Open Access Journals

PubMed Central

Scale-Free Networks in Molecular Biology: Algorithms and Random Walks Analyses

Author: Konini Silva
Publication date: 01/03/2018

In this research, I focus on I) the mean field analysis of algorithms for scale-free networks in molecular biology and II) the analysis of biological networks using random walks and related algorithms.

I: Many systems in nature and society are described by means of complex networks. Research indicates that these complex networks exhibit scale-free properties. Studying the organizing principles of scale-free networks has significant implications in different fields including developing better drugs, defending the internet from hackers, halting the spread of deadly epidemics, developing marketing strategies, etc. The sampling of scale-free networks in molecular biology is usually achieved by growing networks from a seed using recursive algorithms with elementary moves which include the addition and deletion of nodes and bonds. These algorithms include the Barabasi-Albert algorithm. Later algorithms, such as the Duplication-Divergence algorithm, the Sole algorithm and the iSite algorithm, were inspired by biological processes underlying the evolution of protein networks, and the networks they produce differ essentially from networks grown by the Barabasi-Albert algorithm. The mean field analysis of these algorithms is reconsidered, and extended to variant and modified implementations of the algorithms.

II: The second part of this research focuses on improving biological networks using random walks and related algorithms. I use different algorithms with the goal of finding highly connected hubs and clusters of proteins which are closely related to one another. This is done by building up protein-protein interaction networks and miRNA-gene interaction networks which are then subjected to the action of two algorithms. The first algorithm used is the random walk with resistance algorithm. As an alternative, I am proposing solving the lattice Laplacian on a network as a method to discover clusters of biologically related genes. These approaches seek to find ways of solving complex pathway membership problems in protein interaction databases. The clusters obtained provide more biological insight as opposed to a process of local pairwise comparison between interacting proteins. They may also predict new members in functional pathways or clusters. Underlying these algorithms are simulated biased random walks on the network for determining membership of proteins in given clusters.
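To make the idea of a biased random walk for cluster membership concrete, here is a minimal sketch of the standard random walk with restart. It is a related, widely used technique and not the random walk with resistance algorithm named in the abstract, whose details are not given here. The function name, the `restart` probability, and the toy graph are assumptions.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at seeds.

    adj:     symmetric adjacency matrix (n x n).
    seeds:   indices of nodes already known to belong to the cluster.
    restart: assumed probability of jumping back to a seed at each step.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    # Column-stochastic transition matrix; isolated nodes keep a zero column.
    trans = adj / np.where(col_sums > 0, col_sums, 1.0)
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)
    p = p0.copy()
    for _ in range(max_iter):
        nxt = (1.0 - restart) * (trans @ p) + restart * p0
        if np.abs(nxt - p).sum() < tol:
            return nxt
        p = nxt
    return p

# Toy usage: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
toy = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    toy[i, j] = toy[j, i] = 1.0
proximity = random_walk_with_restart(toy, seeds=[0])
```

Nodes in the seed's own triangle end up with higher scores than nodes in the other triangle, so ranking by these scores gives a simple membership proposal.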

YorkSpace

Automated Analysis of Fluorescence Microscopy Images to Identify Protein-Protein Interactions

Author: Doktycz M. J.
Morrell-Falvey J. L.
Qi H.
Venkatraman S.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2006

The identification of protein interactions is important for elucidating biological networks. One obstacle in comprehensive interaction studies is the analysis of large datasets, particularly those containing images. An automated system to analyze image-based protein interaction datasets is therefore needed. Such an analysis system is described here, to automatically extract features from fluorescence microscopy images obtained from a bacterial protein interaction assay. These features are used to relay quantitative values that aid in the automated scoring of positive interactions. Experimental observations indicate that identifying at least 50% positive cells in an image is sufficient to detect a protein interaction. Based on this criterion, the automated system achieves 100% accuracy in detecting positive interactions for a dataset of 16 images. Algorithms were implemented using MATLAB and the software developed is available on request from the authors.
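The scoring criterion stated in the abstract (at least 50% positive cells in an image) can be written as a short decision rule. The sketch below assumes that an earlier, unspecified image-analysis step has already labelled each cell as positive or negative; the function name and input format are hypothetical, and it is written in Python rather than the authors' MATLAB.

```python
def is_positive_interaction(cell_is_positive, min_fraction=0.5):
    """Score one image from per-cell labels (illustrative sketch).

    cell_is_positive: iterable of booleans, one per segmented cell.
    min_fraction:     fraction of positive cells required; 0.5 follows the
                      50% criterion stated in the abstract.
    """
    cells = list(cell_is_positive)
    if not cells:
        raise ValueError("image contains no segmented cells")
    return sum(cells) / len(cells) >= min_fraction

# Toy usage: 2 of 4 cells positive meets the 50% criterion.
toy_call = is_positive_interaction([True, True, False, False])
```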

Crossref

Directory of Open Access Journals

PubMed Central