Inferring topology from clustering coefficients in protein-protein interaction networks
BACKGROUND: Although protein-protein interaction networks determined with high-throughput methods are incomplete, they are commonly used to infer the topology of the complete interactome. These partial networks often show scale-free behavior, with only a few proteins having many connections and the majority having only a few. Recently, it was suggested that this scale-free nature may not actually reflect the topology of the complete interactome but could also be due to the error proneness and incompleteness of large-scale experiments. RESULTS: In this paper, we investigate the effect of limited sampling on average clustering coefficients and how this can help to more confidently exclude possible topology models for the complete interactome. Both analytical and simulation results for different network topologies indicate that partial sampling alone lowers the clustering coefficient of all networks tremendously. Furthermore, we extend the original sampling model by also including spurious interactions via a preferential attachment process. Simulations of this extended model show that the effect of wrong interactions on clustering coefficients depends strongly on the skewness of the original topology and on the degree of randomness of clustering coefficients in the corresponding networks. CONCLUSION: Our findings suggest that the complete interactome is either highly skewed, as in scale-free networks, or is at least highly clustered. Although the correct topology of the interactome may not be inferred beyond any reasonable doubt from the interaction networks available, a number of topologies can nevertheless be excluded with high confidence.
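The effect of partial sampling on the average clustering coefficient can be illustrated with a small simulation. The sketch below is not the authors' sampling model: it assumes a ring lattice as the "complete" network and keeps each edge independently with probability `p`. The function names `ring_lattice`, `average_clustering` and `sample_edges` are hypothetical and chosen only for this illustration.

```python
import random
from itertools import combinations


def ring_lattice(n, k):
    """Undirected ring lattice: every node is linked to its k nearest
    neighbours on each side (assumed stand-in for a clustered network)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue
        # Count links among the neighbours of this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def sample_edges(adj, p, rng):
    """Keep each edge independently with probability p (assumed model)."""
    sampled = {v: set() for v in adj}
    for v in adj:
        for w in adj[v]:
            if v < w and rng.random() < p:  # visit each edge once
                sampled[v].add(w)
                sampled[w].add(v)
    return sampled


if __name__ == "__main__":
    rng = random.Random(1)
    complete = ring_lattice(200, 3)
    partial = sample_edges(complete, 0.5, rng)
    print("complete:", average_clustering(complete))
    print("sampled: ", average_clustering(partial))
```

Under this assumed model, a triangle around a node survives only if the edge between the two retained neighbours was also sampled, which is why the sampled network is expected to show a lower average clustering coefficient than the complete one.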
Inferring Network Mechanisms: The Drosophila melanogaster Protein Interaction Network
Naturally occurring networks exhibit quantitative features revealing
underlying growth mechanisms. Numerous network mechanisms have recently been
proposed to reproduce specific properties such as degree distributions or
clustering coefficients. We present a method for inferring the mechanism most
accurately capturing a given network topology, exploiting discriminative tools
from machine learning. The Drosophila melanogaster protein network is
confidently and robustly (to noise and training data subsampling) classified as
a duplication-mutation-complementation network over preferential attachment,
small-world, and other duplication-mutation mechanisms. Systematic
classification, rather than statistical study of specific properties, provides
a discriminative approach to understand the design of complex networks. Comment: 19 pages, 5 figures
Large-scale inference and graph theoretical analysis of gene-regulatory networks in B. subtilis
We present the methods and results of a two-stage modeling process that
generates candidate gene-regulatory networks of the bacterium B. subtilis from
experimentally obtained, yet mathematically underdetermined microchip array
data. By employing a computational, linear correlative procedure to generate
these networks, and by analyzing the networks from a graph theoretical
perspective, we are able to verify the biological viability of our inferred
networks, and we demonstrate that our networks' graph theoretical properties
are remarkably similar to those of other biological systems. In addition, by
comparing our inferred networks to those of a previous, noisier implementation
of the linear inference process [17], we are able to identify trends in graph
theoretical behavior that occur both in our networks as well as in their
perturbed counterparts. These commonalities in behavior at multiple levels of
complexity allow us to ascertain the level of complexity to which our process
is robust to noise. Comment: 22 pages, 4 figures, accepted for publication in Physica A (2006)
Global Network Alignment
Motivation: High-throughput methods for detecting molecular interactions have led to a plethora of biological network data with much more yet to come, stimulating the development of techniques for biological network alignment. Analogous to sequence alignment, efficient and reliable network alignment methods will improve our understanding of biological systems. Network alignment is computationally hard. Hence, devising efficient network alignment heuristics is currently one of the foremost challenges in computational biology.

Results: We present a superior heuristic network alignment algorithm, called Matching-based GRAph ALigner (M-GRAAL), which can process and integrate any number and type of similarity measures between network nodes (e.g., proteins), including, but not limited to, any topological network similarity measure, sequence similarity, functional similarity, and structural similarity. This is efficient in resolving ties in similarity measures and in finding a combination of similarity measures yielding the largest biologically sound alignments. When used to align protein-protein interaction (PPI) networks of various species, M-GRAAL exposes the largest known functional and contiguous regions of network similarity. Hence, we use M-GRAAL’s alignments to predict functions of un-annotated proteins in yeast, human, and the bacteria _C. jejuni_ and _E. coli_. Furthermore, using M-GRAAL to compare PPI networks of different herpes viruses, we reconstruct their phylogenetic relationship, and our phylogenetic tree is the same as the sequence-based one.
Analysis of High-Throughput Data - Protein-Protein Interactions, Protein Complexes and RNA Half-life
The development of high-throughput techniques has led to a paradigm change in biology from the small-scale analysis of individual genes and proteins to a genome-scale analysis of biological systems. Proteins and genes can now be studied in their interaction with each other and the cooperation within multi-subunit protein complexes can be investigated. Moreover, time-dependent dynamics and regulation of these processes and associations can now be explored by monitoring mRNA changes and turnover. The in-depth analysis of these large and complex data sets would not be possible without sophisticated algorithms for integrating different data sources, identifying interesting patterns in the data and addressing the high variability and error rates in biological measurements. In this thesis, we developed such methods for the investigation of protein interactions and complexes and the corresponding regulatory processes.
In the first part, we analyze networks of physical protein-protein interactions measured in large-scale experiments. We show that the topology of the complete interactomes can be confidently extrapolated from only partial measurements of interaction networks, despite high numbers of missing and wrong interactions. Furthermore, we find that the structure and stability of protein interaction networks are influenced not only by the degree distribution of the network but also considerably by the suppression or propagation of interactions between highly connected proteins. As analysis of network topology is generally focused on large eukaryotic networks, we developed new methods to analyze smaller networks of intraviral and virus-host interactions. By comparing interactomes of related herpesviral species, we could detect a conserved core of protein interactions and could address the low coverage of the yeast two-hybrid system. In addition, common strategies in the interaction of the viruses with the host cell were identified.
New affinity purification methods now make it possible to directly study associations of proteins in complexes. Due to experimental errors, the individual protein complexes have to be predicted from these purification results with computational methods. As previously published methods relied more or less heavily on existing knowledge on complexes, we developed an unsupervised prediction algorithm which is independent of such additional data. Using this approach, high-quality protein complexes can be identified from the raw purification data alone for any species for which purification experiments are performed. To identify the direct, physical interactions within these predicted complexes and their subcomponent structure, we describe a new approach to extract the highest scoring subnetwork connecting the complex and interactions not explained by alternative paths of indirect interactions. In this way, important interactions within the complexes can be identified and their substructure can be resolved in a straightforward way.
To explore the regulation of proteins and complexes, we analyzed microarray measurements of mRNA abundance, de novo transcription and decay. Based on the relationship between newly transcribed, pre-existing and total RNA, transcript half-life can be estimated for individual genes using a new microarray normalization method and a quality control can be applied. We show that precise measurements of RNA half-life can be obtained from de novo transcription and that these are more accurate than previously published results from RNA decay. Using such precise measurements, we studied RNA half-lives in human B-cells and mouse fibroblasts to identify conserved patterns governing RNA turnover. Our results show that transcript half-lives are strongly conserved and specifically correlated to gene function. Although transcript half-life is highly similar in protein complexes and families, individual proteins may deviate significantly from the remaining complex subunits or family members to efficiently support the regulation of protein complexes or to create non-redundant roles of functionally similar proteins.
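The relationship between newly transcribed and total RNA can be turned into a half-life estimate with a short calculation. The sketch below is a simplification and not the thesis's normalization method: it assumes steady-state transcript levels and first-order decay, so that after a labeling time t the pre-existing fraction is exp(-λt), the newly transcribed fraction is R = 1 - exp(-λt), and the half-life is ln(2)/λ = -t·ln(2)/ln(1 - R). The function name `half_life_from_new_fraction` is hypothetical.

```python
import math


def half_life_from_new_fraction(new_to_total, labeling_time):
    """Estimate transcript half-life from the newly transcribed / total
    RNA ratio, assuming steady state and first-order decay.

    The result is in the same time unit as labeling_time.
    """
    if not 0.0 < new_to_total < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    if labeling_time <= 0:
        raise ValueError("labeling time must be positive")
    # R = 1 - exp(-lambda * t)  =>  t_half = -t * ln(2) / ln(1 - R)
    return -labeling_time * math.log(2.0) / math.log(1.0 - new_to_total)


if __name__ == "__main__":
    # Hypothetical input: half of the RNA is newly transcribed after 60 min.
    print(half_life_from_new_fraction(0.5, 60.0))
```

Under these assumptions, a ratio of one half means that exactly one half-life has elapsed during labeling, so the estimate equals the labeling time. Ratios close to 0 or 1 make the logarithm very sensitive to measurement error, which is one reason a quality control step matters.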
These results illustrate several of the many ways in which high-throughput measurements lead to a better understanding of biological systems. By studying large-scale measurements in this thesis, the structure of protein interaction networks and protein complexes could be better characterized, important interactions and conserved strategies for herpesviral infection could be identified and interesting insights could be gained into the regulation of important biological processes and protein complexes. This was made possible by the development of novel algorithms and analysis approaches which will also be valuable for further research on these topics.
Augmented Sparse Reconstruction of Protein Signaling Networks
The problem of reconstructing and identifying intracellular protein signaling
and biochemical networks is of critical importance in biology today. We sought
to develop a mathematical approach to this problem using, as a test case, one
of the most well-studied and clinically important signaling networks in biology
today, the epidermal growth factor receptor (EGFR) driven signaling cascade.
More specifically, we suggest a method, augmented sparse reconstruction, for
the identification of links among nodes of ordinary differential equation (ODE)
networks from a small set of trajectories with different initial conditions.
Our method builds a system of representation by using a collection of integrals
of all given trajectories and by attenuating blocks of terms in the
representation itself. The system of representation is then augmented with
random vectors, and minimization of the 1-norm is used to find sparse
representations for the dynamical interactions of each node. Augmentation by
random vectors is crucial, since sparsity alone is not able to handle the large
error-in-variables in the representation. Augmented sparse reconstruction
makes it possible to consider potentially very large spaces of models and is able to
detect with high accuracy the few relevant links among nodes, even when
moderate noise is added to the measured trajectories. After showing the
performance of our method on a model of the EGFR protein network, we sketch
briefly the potential future therapeutic applications of this approach. Comment: 24 pages, 6 figures
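The role of 1-norm minimization in finding sparse representations can be shown on a toy linear system. The sketch below is not the augmented sparse reconstruction method described in the abstract: it leaves out the trajectory integrals, the attenuation and the random-vector augmentation, and it swaps in plain 1-norm-penalised least squares solved by iterative soft-thresholding (ISTA). The matrix size, the penalty `lam` and the names `ista` and `demo` are assumptions made for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal map of the 1-norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(A, b, lam, iterations=2000):
    """Minimise 0.5*||A x - b||^2 + lam*||x||_1 by iterative soft-thresholding."""
    rows, cols = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe, conservative step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [sum(A[i][j] * x[j] for j in range(cols)) - b[i]
                    for i in range(rows)]
        gradient = [sum(A[i][j] * residual[i] for i in range(rows))
                    for j in range(cols)]
        x = [soft_threshold(x[j] - step * gradient[j], step * lam)
             for j in range(cols)]
    return x


def demo(seed=0):
    """Recover a vector with two nonzero entries from noiseless data."""
    rng = random.Random(seed)
    rows, cols = 40, 10
    A = [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    x_true = [0.0] * cols
    x_true[2], x_true[7] = 3.0, -2.0  # the only two "links" in this toy node
    b = [sum(A[i][j] * x_true[j] for j in range(cols)) for i in range(rows)]
    return x_true, ista(A, b, lam=0.01)


if __name__ == "__main__":
    truth, estimate = demo()
    for t, e in zip(truth, estimate):
        print(t, round(e, 3))
```

In this toy setting the penalty pushes the coefficients of irrelevant columns toward zero while the two relevant ones stay close to their true values. The abstract notes that sparsity alone is not enough when the representation itself carries large errors, which is the situation the random-vector augmentation is meant to address and which this sketch does not attempt.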