Global Network Alignment
Motivation: High-throughput methods for detecting molecular interactions have led to a plethora of biological network data with much more yet to come, stimulating the development of techniques for biological network alignment. Analogous to sequence alignment, efficient and reliable network alignment methods will improve our understanding of biological systems. Network alignment is computationally hard. Hence, devising efficient network alignment heuristics is currently one of the foremost challenges in computational biology.

Results: We present a superior heuristic network alignment algorithm, called Matching-based GRAph ALigner (M-GRAAL), which can process and integrate any number and type of similarity measures between network nodes (e.g., proteins), including, but not limited to, any topological network similarity measure, sequence similarity, functional similarity, and structural similarity. M-GRAAL is efficient in resolving ties in similarity measures and in finding a combination of similarity measures yielding the largest biologically sound alignments. When used to align protein-protein interaction (PPI) networks of various species, M-GRAAL exposes the largest known functional and contiguous regions of network similarity. Hence, we use M-GRAAL’s alignments to predict functions of un-annotated proteins in yeast, human, and the bacteria _C. jejuni_ and _E. coli_. Furthermore, using M-GRAAL to compare PPI networks of different herpes viruses, we reconstruct their phylogenetic relationship, and our phylogenetic tree is the same as the sequence-based one.
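The idea of integrating several node similarity measures before aligning can be illustrated with a minimal sketch. This is not the M-GRAAL algorithm itself: the weighted sum, the greedy one-to-one selection, the function names, and the toy scores are all assumptions made for illustration.

```python
def combine_similarities(measures, weights):
    """Weighted sum of several node-similarity dicts keyed by (u, v) pairs."""
    combined = {}
    for measure, weight in zip(measures, weights):
        for pair, score in measure.items():
            combined[pair] = combined.get(pair, 0.0) + weight * score
    return combined


def greedy_alignment(combined):
    """Pick pairs by decreasing combined score, keeping the mapping one-to-one.

    A simple greedy stand-in (an assumption), not the matching step of M-GRAAL.
    """
    alignment = {}
    used_targets = set()
    # Sort by score (descending), then by pair so ties break deterministically.
    for (u, v), _ in sorted(combined.items(), key=lambda item: (-item[1], item[0])):
        if u not in alignment and v not in used_targets:
            alignment[u] = v
            used_targets.add(v)
    return alignment


# Hypothetical similarity scores between nodes of network A and network B.
topological = {("a1", "b1"): 0.9, ("a1", "b2"): 0.4,
               ("a2", "b1"): 0.8, ("a2", "b2"): 0.7}
sequence = {("a1", "b1"): 0.2, ("a1", "b2"): 0.6,
            ("a2", "b1"): 0.9, ("a2", "b2"): 0.1}

combined = combine_similarities([topological, sequence], [0.5, 0.5])
alignment = greedy_alignment(combined)
print(alignment)
```

In this toy input, topology alone would favour pairing a1 with b1, but the combined score pairs a2 with b1 first, so a1 is aligned to b2.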
A global genetic interaction network maps a wiring diagram of cellular function
We generated a global genetic interaction network for Saccharomyces cerevisiae, constructing more than 23 million double mutants, identifying about 550,000 negative and about 350,000 positive genetic interactions. This comprehensive network maps genetic interactions for essential gene pairs, highlighting essential genes as densely connected hubs. Genetic interaction profiles enabled assembly of a hierarchical model of cell function, including modules corresponding to protein complexes and pathways, biological processes, and cellular compartments. Negative interactions connected functionally related genes, mapped core bioprocesses, and identified pleiotropic genes, whereas positive interactions often mapped general regulatory connections among gene pairs, rather than shared functionality. The global network illustrates how coherent sets of genetic interactions connect protein complex and pathway modules to map a functional wiring diagram of the cell.
INTRODUCTION: Genetic interactions occur when mutations in two or more genes combine to generate an unexpected phenotype. An extreme negative or synthetic lethal genetic interaction occurs when two mutations, neither lethal individually, combine to cause cell death. Conversely, positive genetic interactions occur when two mutations produce a phenotype that is less severe than expected. Genetic interactions identify functional relationships between genes and can be harnessed for biological discovery and therapeutic target identification. They may also explain a considerable component of the undiscovered genetics associated with human diseases. Here, we describe construction and analysis of a comprehensive genetic interaction network for a eukaryotic cell.
RATIONALE: Genome sequencing projects are providing an unprecedented view of genetic variation. However, our ability to interpret genetic information to predict inherited phenotypes remains limited, in large part due to the extensive buffering of genomes, making most individual eukaryotic genes dispensable for life. To explore the extent to which genetic interactions reveal cellular function and contribute to complex phenotypes, and to discover the general principles of genetic networks, we used automated yeast genetics to construct a global genetic interaction network.
RESULTS: We tested most of the ~6000 genes in the yeast Saccharomyces cerevisiae for all possible pairwise genetic interactions, identifying nearly 1 million interactions, including ~550,000 negative and ~350,000 positive interactions, spanning ~90% of all yeast genes. Essential genes were network hubs, displaying five times as many interactions as nonessential genes. The set of genetic interactions or the genetic interaction profile for a gene provides a quantitative measure of function, and a global network based on genetic interaction profile similarity revealed a hierarchy of modules reflecting the functional architecture of a cell. Negative interactions connected functionally related genes, mapped core bioprocesses, and identified pleiotropic genes, whereas positive interactions often mapped general regulatory connections associated with defects in cell cycle progression or cellular proteostasis. Importantly, the global network illustrates how coherent sets of negative or positive genetic interactions connect protein complexes and pathways to map a functional wiring diagram of the cell.
CONCLUSION: A global genetic interaction network highlights the functional organization of a cell and provides a resource for predicting gene and pathway function. This network emphasizes the prevalence of genetic interactions and their potential to compound phenotypes associated with single mutations. Negative genetic interactions tend to connect functionally related genes and thus may be predicted using alternative functional information. Although less functionally informative, positive interactions may provide insights into general mechanisms of genetic suppression or resiliency. We anticipate that the ordered topology of the global genetic network, in which genetic interactions connect coherently within and between protein complexes and pathways, may be exploited to decipher genotype-to-phenotype relationships
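The definition of negative and positive genetic interactions given in the abstract above can be made concrete with a small sketch. It assumes a multiplicative expectation for double-mutant fitness and a hypothetical cutoff of 0.08; neither the model nor the threshold is stated in the abstract, and the fitness values are invented.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Observed double-mutant fitness minus the expected value.

    Assumption: the expectation is the product of the single-mutant
    fitnesses (a multiplicative model), which the abstract does not specify.
    """
    expected = fitness_a * fitness_b
    return fitness_ab - expected


def classify(score, threshold=0.08):
    """Label an interaction; the threshold is a hypothetical cutoff."""
    if score < -threshold:
        return "negative"
    if score > threshold:
        return "positive"
    return "none"


# Invented fitness values relative to wild type (1.0).
synthetic_lethal = interaction_score(0.8, 0.9, 0.0)   # double mutant dies
suppression = interaction_score(0.5, 0.5, 0.5)        # less severe than expected
independent = interaction_score(1.0, 0.9, 0.9)        # matches the expectation

print(classify(synthetic_lethal), classify(suppression), classify(independent))
```

A synthetic lethal pair yields a strongly negative score because the double mutant is far less fit than the product of the single-mutant fitnesses would suggest.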
An integrative approach to modeling biological networks
Since proteins carry out biological processes by interacting with other
proteins, analyzing the structure of protein-protein interaction (PPI) networks
could explain complex biological mechanisms, evolution, and disease. Similarly,
studying protein structure networks, residue interaction graphs (RIGs), might
provide insights into protein folding, stability, and function. The first step
towards understanding these networks is finding an adequate network model that
closely replicates their structure. Evaluating the fit of a model to the data
requires comparing the model with real-world networks. Since network
comparisons are computationally infeasible, they rely on heuristics, or
"network properties." We show that it is difficult to assess the reliability of
the fit of a model with any individual network property. Thus, our approach
integrates a variety of network properties and further combines these with a
series of probabilistic methods to predict an appropriate network model for
biological networks. We find geometric random graphs, which model spatial
relationships between objects, to be the best-fitting model for RIGs. This
validates the correctness of our method, since RIGs have previously been shown
to be geometric. We apply our approach to noisy PPI networks and demonstrate
that their structure is also consistent with geometric random graphs.
Comment: 10 pages, 3 tables, 4 figures
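For readers unfamiliar with the model, a geometric random graph can be generated roughly as follows. This is a minimal sketch under stated assumptions: points are drawn uniformly in a unit cube, Euclidean distance is used, and the function name and parameters are hypothetical. It does not reproduce the model-fitting procedure of the paper.

```python
import math
import random


def geometric_random_graph(n, radius, dim=2, seed=None):
    """Place n points uniformly in the unit cube and connect close pairs.

    Two nodes are joined by an edge when their Euclidean distance is at
    most `radius`. Returns the point coordinates and the edge list.
    """
    rng = random.Random(seed)
    points = [tuple(rng.random() for _ in range(dim)) for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= radius
    ]
    return points, edges


points, edges = geometric_random_graph(n=50, radius=0.2, seed=0)
print(len(points), "nodes,", len(edges), "edges")
```

The radius controls density: a radius larger than the diagonal of the cube yields a complete graph, while a very small radius yields an almost empty one.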
Network analytics in the age of big data
We live in a complex world of interconnected entities. In all areas of human endeavor, from biology to medicine, economics, and climate science, we are flooded with large-scale data sets. These data sets describe intricate real-world systems from different and complementary viewpoints, with entities being modeled as nodes and their connections as edges, comprising large networks. These networked data are a new and rich source of domain-specific information, but that information is currently largely hidden within the complicated wiring patterns. Deciphering these patterns is paramount, because computational analyses of large networks are often intractable, so that many questions we ask about the world cannot be answered exactly, even with unlimited computer power and time (1). Hence, the only hope is to answer these questions approximately (that is, heuristically) and prove how far the approximate answer is from the exact, unknown one, in the worst case. On page 163 of this issue, Benson et al. (2) take an important step in that direction by providing a scalable heuristic framework for grouping entities based on their wiring patterns and using the discovered patterns for revealing the higher-order organizational principles of several real-world networked systems
Unified Alignment of Protein-Protein Interaction Networks
Paralleling the increasing availability of protein-protein interaction (PPI) network data, several network alignment methods have been proposed. Network alignments have been used to uncover functionally conserved network parts and to transfer annotations. However, due to the computational intractability of the network alignment problem, aligners are heuristics providing divergent solutions, and no consensus exists on a gold standard or on which scoring scheme should be used to evaluate them. We comprehensively evaluate the alignment scoring schemes and global network aligners on large-scale PPI data and observe that three methods, HUBALIGN, L-GRAAL and NATALIE, regularly produce the most topologically and biologically coherent alignments. We study the collective behaviour of network aligners and observe that PPI networks are almost entirely aligned with a handful of aligners that we unify into a new tool, Ulign. Ulign enables complete alignment of two networks, which traditional global and local aligners fail to do. Also, multiple mappings of Ulign define biologically relevant soft clusterings of proteins in PPI networks, which may be used for refining the transfer of annotations across networks. Hence, PPI networks are already well investigated by current aligners, so to gain additional biological insights, a paradigm shift is needed. We propose such a shift come from aligning all available data types collectively rather than any particular data type in isolation from others
Biological network comparison using graphlet degree distribution
Analogous to biological sequence comparison, comparing cellular networks is
an important problem that could provide insight into biological understanding
and therapeutics. For technical reasons, comparing large networks is
computationally infeasible, and thus heuristics such as the degree distribution
have been sought. It is easy to demonstrate that two networks are different by
simply showing a short list of properties in which they differ. It is much
harder to show that two networks are similar, as it requires demonstrating
their similarity in all of their exponentially many properties. Clearly, it is
computationally prohibitive to analyze all network properties, but the larger
the number of constraints we impose in determining network similarity, the more
likely it is that the networks will truly be similar.
We introduce a new systematic measure of a network's local structure that
imposes a large number of similarity constraints on networks being compared. In
particular, we generalize the degree distribution, which measures the number of
nodes 'touching' k edges, into distributions measuring the number of nodes
'touching' k graphlets, where graphlets are small connected non-isomorphic
subgraphs of a large network. Our new measure of network local structure
consists of 73 graphlet degree distributions (GDDs) of graphlets with 2-5
nodes, but it is easily extendible to a greater number of constraints (i.e.
graphlets). Furthermore, we show a way to combine the 73 GDDs into a network
'agreement' measure. Based on this new network agreement measure, we show that
almost all of the 14 eukaryotic PPI networks, including human, are better
modeled by geometric random graphs than by Erdos-Renyi, random scale-free, or
Barabasi-Albert scale-free networks.
Comment: Proceedings of the 2006 European Conference on Computational Biology,
ECCB'06, Eilat, Israel, January 21-24, 200
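The generalization from degree to graphlet degree can be sketched for the smallest cases. The code below is an illustrative simplification, not the 73-distribution measure of the paper: it covers only graphlets with 2 and 3 nodes (four node positions), and the function names and toy graph are assumptions.

```python
from collections import Counter
from itertools import combinations


def graphlet_degrees(edges):
    """Per-node counts for graphlets on 2-3 nodes in a simple undirected graph.

    Each node maps to (edges touched, induced 3-node paths touched at an end,
    induced 3-node paths touched at the middle, triangles touched).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    degrees = {}
    for v, nbrs in adj.items():
        # Pairs of neighbours that are themselves adjacent close a triangle.
        triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        # Remaining neighbour pairs form an induced path with v in the middle.
        path_middle = len(nbrs) * (len(nbrs) - 1) // 2 - triangles
        # v is a path end when a neighbour's neighbour is not adjacent to v.
        path_end = sum(1 for u in nbrs for w in adj[u] if w != v and w not in nbrs)
        degrees[v] = (len(nbrs), path_end, path_middle, triangles)
    return degrees


def graphlet_degree_distribution(degrees, position):
    """Number of nodes touching k graphlets at the given position, for each k."""
    return Counter(vector[position] for vector in degrees.values())


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
toy_degrees = graphlet_degrees(toy_edges)
print(toy_degrees)
print(graphlet_degree_distribution(toy_degrees, 3))
```

Position 0 reproduces the ordinary degree distribution; the other positions add further constraints in the same way the full measure does with larger graphlets.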
Graphlet-based Characterization of Directed Networks
We are flooded with large-scale, dynamic, directed, networked data. Analyses requiring exact comparisons between networks are computationally intractable, so new methodologies are sought. To analyse directed networks, we extend graphlets (small induced sub-graphs) and their degrees to directed data. Using these directed graphlets, we generalise state-of-the-art network distance measures (RGF, GDDA and GCD) to directed networks and show their superiority for comparing directed networks. Also, we extend the canonical correlation analysis framework that enables uncovering the relationships between the wiring patterns around nodes in a directed network and their expert annotations. On directed World Trade Networks (WTNs), our methodology allows uncovering the core-broker-periphery structure of the WTN and predicting the economic attributes of a country, such as its gross domestic product, from its wiring patterns in the WTN for up to ten years into the future. It does so by enabling us to track the dynamics of a country’s positioning in the WTN over years. On directed metabolic networks, our framework yields insights into preservation of enzyme function from the network wiring patterns rather than from sequence data. Overall, our methodology enables advanced analyses of directed networked data from any area of science, allowing domain-specific interpretation of a directed network’s topology
