Detecting highly overlapping community structure by greedy clique expansion
In complex networks it is common for each node to belong to several
communities, implying a highly overlapping community structure. Recent advances
in benchmarking indicate that existing community assignment algorithms that are
capable of detecting overlapping communities perform well only when the extent
of community overlap is kept to modest levels. To overcome this limitation, we
introduce a new community assignment algorithm called Greedy Clique Expansion
(GCE). The algorithm identifies distinct cliques as seeds and expands these
seeds by greedily optimizing a local fitness function. We perform extensive
benchmarks on synthetic data to demonstrate that GCE's good performance is
robust across diverse graph topologies. Significantly, GCE is the only
algorithm to perform well on these synthetic graphs, in which every node
belongs to multiple communities. Furthermore, when put to the task of
identifying functional modules in protein interaction data, and college dorm
assignments in Facebook friendship data, we find that GCE performs
competitively.
Comment: 10 pages, 7 figures. Implementation source and binaries available at
http://sites.google.com/site/greedycliqueexpansion
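The seed-and-expand idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fitness function, so the form k_in / (k_in + k_out)^alpha used here is an assumption, as are all function names and the parameters `alpha` and `min_clique_size`. The sketch also drops only exact duplicate communities, whereas a full implementation would need a policy for near-duplicates.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def fitness(adj, members, alpha=1.0):
    """Assumed local fitness: k_in / (k_in + k_out) ** alpha.

    k_in is the sum of internal degrees, k_out the number of edge
    endpoints leaving the community.
    """
    k_in = k_out = 0
    for v in members:
        for u in adj[v]:
            if u in members:
                k_in += 1
            else:
                k_out += 1
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def expand_seed(adj, seed, alpha=1.0):
    """Greedily add the neighbouring node that most improves fitness."""
    community = set(seed)
    while True:
        frontier = set().union(*(adj[v] for v in community)) - community
        best, best_fit = None, fitness(adj, community, alpha)
        for v in sorted(frontier):
            f = fitness(adj, community | {v}, alpha)
            if f > best_fit:
                best, best_fit = v, f
        if best is None:  # no neighbour improves fitness: stop
            return community
        community.add(best)


def detect_communities(adj, min_clique_size=3, alpha=1.0):
    """Use sufficiently large maximal cliques as seeds and expand each one."""
    seeds = [c for c in maximal_cliques(adj) if len(c) >= min_clique_size]
    found = []
    for seed in sorted(seeds, key=len, reverse=True):
        community = expand_seed(adj, seed, alpha)
        if community not in found:  # drop exact duplicates only
            found.append(community)
    return found
```

Because every seed is expanded independently, a node that lies in two cliques can end up in two communities, which is how overlap arises in this sketch.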
Analysis of Genetic Interaction Maps Reveals Functional Pleiotropy
Epistatic or genetic interactions, representing the effects of mutations on the phenotypes caused by other mutations, can be very helpful for uncovering functional relationships between genes. Recently, the Epistasis Miniarray Profile (E-MAP) method has emerged as a powerful approach for identifying such interactions systematically. As part of this approach, hierarchical clustering is used to partition genes into groups on the basis of the similarity between their global interaction profiles. Here we present an original biclustering algorithm for identifying groups of functionally related genes from E-MAP data in a manner that allows individual genes to be assigned to more than one functional group. This enables investigation of the pleiotropic nature of gene function, a goal that cannot be achieved with hierarchical clustering. The performance of our algorithm is illustrated by applying it to two E-MAP datasets and an E-MAP-like in silico dataset for the yeast S. cerevisiae. In addition to identifying the majority of the functional modules reported in these studies, our algorithm uncovers many recently documented and novel multi-functional relationships between genes and gene groups.
LightDock: a new multi-scale approach to protein–protein docking
Computational prediction of protein–protein complex structure by docking can provide structural and mechanistic insights for protein interactions of biomedical interest. However, current methods struggle with difficult cases, such as those involving flexible proteins, low-affinity complexes or transient interactions. A major challenge is how to efficiently sample the structural and energetic landscape of the association at different resolution levels, given that each scoring function is often highly coupled to a specific type of search method. Thus, new methodologies capable of accommodating multi-scale conformational flexibility and scoring are strongly needed.
We describe here a new multi-scale protein–protein docking methodology, LightDock, capable of accommodating conformational flexibility and a variety of scoring functions at different resolution levels. Implicit use of normal modes during the search and atomic/coarse-grained combined scoring functions yielded improved predictive results with respect to state-of-the-art rigid-body docking, especially in flexible cases.
B.J-G was supported by an FPI fellowship from the Spanish Ministry of Economy and Competitiveness. This work was supported by I+D+I Research Project grants BIO2013-48213-R and BIO2016-79930-R from the Spanish Ministry of Economy and Competitiveness. This work is partially supported by the European Union H2020 program through HiPEAC (GA 687698), by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology (TIN2015-65316-P) and the Departament d’Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d’Execució Paral·lels (2014-SGR-1051).
Peer Reviewed. Postprint (author's final draft).
Domain-mediated interactions for protein subfamily identification
Within a protein family, proteins with the same domain often exhibit different cellular functions, despite the shared evolutionary history and molecular function of the domain. We hypothesized that domain-mediated interactions (DMIs) may categorize a protein family into subfamilies because the diversified functions of a single domain often depend on interacting partners of domains. Here we systematically identified DMI subfamilies, in which proteins share domains with DMI partners, as well as with various functional and physical interaction networks in individual species. In humans, DMI subfamily members are associated with similar diseases, including cancers, and are frequently co-associated with the same diseases. DMI information relates to the functional and evolutionary subdivisions of human kinases. In yeast, DMI subfamilies contain proteins with similar phenotypic outcomes from specific chemical treatments. Therefore, the systematic investigation here provides insights into the diverse functions of subfamilies derived from a protein family with a link-centric approach and suggests a useful resource for annotating the functions and phenotypic outcomes of proteins.
Evolutionary Dynamics in a Simple Model of Self-Assembly
We investigate the evolutionary dynamics of an idealised model for the robust
self-assembly of two-dimensional structures called polyominoes. The model
includes rules that encode interactions between sets of square tiles that drive
the self-assembly process. The relationship between the model's rule set and
its resulting self-assembled structure can be viewed as a genotype-phenotype
map and incorporated into a genetic algorithm. The rule sets evolve under
selection for specified target structures. The corresponding, complex fitness
landscape generates rich evolutionary dynamics as a function of parameters such
as the population size, search space size, mutation rate, and method of
recombination. Furthermore, these systems are simple enough that in some cases
the associated model genome space can be completely characterised, shedding
light on how the evolutionary dynamics depends on the detailed structure of the
fitness landscape. Finally, we apply the model to study the emergence of the
preference for dihedral over cyclic symmetry observed for homomeric protein
tetramers.
Contextualizing context for synthetic biology--identifying causes of failure of synthetic biological systems.
Despite the efforts that bioengineers have exerted in designing and constructing biological processes that function according to a predetermined set of rules, their operation remains fundamentally circumstantial. The contextual situation in which molecules and single-celled or multi-cellular organisms find themselves shapes the way they interact, respond to the environment and process external information. Since the birth of the field, synthetic biologists have had to grapple with contextual issues, particularly when the molecular and genetic devices inexplicably fail to function as designed when tested in vivo. In this review, we set out to identify and classify the sources of the unexpected divergences between design and actual function of synthetic systems and analyze possible methodologies aimed at controlling, if not preventing, unwanted contextual issues.
Knowledge-based energy functions for computational studies of proteins
This chapter discusses the theoretical framework and methods for developing
knowledge-based potential functions essential for protein structure prediction,
protein-protein interaction, and protein sequence design. We discuss in some
detail the Miyazawa-Jernigan contact statistical potential,
distance-dependent statistical potentials, as well as geometric statistical
potentials. We also describe a geometric model for developing both linear and
non-linear potential functions by optimization. Applications of knowledge-based
potential functions in protein-decoy discrimination, in protein-protein
interactions, and in protein design are then described. Finally, several issues
with knowledge-based potential functions are discussed.
Comment: 57 pages, 6 figures. To be published in a book by Springer.
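The general principle behind a contact statistical potential can be sketched as follows. This is an illustration only, not the Miyazawa-Jernigan derivation, which uses a quasi-chemical approximation. It applies the inverse-Boltzmann relation e(a, b) = -ln(N_obs / N_exp) in units of kT, and the random-mixing reference state used to compute N_exp is an assumption made for the example. The function name and input format are hypothetical.

```python
import math
from collections import Counter


def contact_potential(contacts):
    """Estimate pairwise contact energies (in units of kT) from observed contacts.

    contacts: iterable of (type_a, type_b) residue-type pairs seen in contact.
    Reference state (assumed): residue types pair at random in proportion to
    how often each type appears at either end of a contact.
    """
    # Treat (a, b) and (b, a) as the same unordered pair.
    pair_counts = Counter(tuple(sorted(p)) for p in contacts)
    total = sum(pair_counts.values())

    # Count how often each residue type appears at a contact end.
    end_counts = Counter()
    for (a, b), n in pair_counts.items():
        end_counts[a] += n
        end_counts[b] += n

    energies = {}
    for (a, b), n_obs in pair_counts.items():
        fa = end_counts[a] / (2 * total)
        fb = end_counts[b] / (2 * total)
        # Unlike pairs can form in two orders, hence the factor of 2.
        n_exp = total * fa * fb * (1 if a == b else 2)
        energies[(a, b)] = -math.log(n_obs / n_exp)
    return energies
```

Pairs observed more often than the reference state predicts receive negative (favourable) energies, and pairs observed less often receive positive ones. Pairs that are never observed get no entry here; handling them would require pseudocounts or a similar correction.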