
2,475 research outputs found

Generating constrained random graphs using multiple edge switches

Author: Cointet Jean-Philippe
Roth Camille
Tabourier Lionel
Publication date: 14/12/2010

The generation of random graphs using edge swaps provides a reliable method to draw uniformly random samples of sets of graphs respecting some simple constraints, e.g. degree distributions. However, in general, it is not necessarily possible to access all graphs obeying some given constraints through a classical switching procedure calling on pairs of edges. We therefore propose to get round this issue by generalizing this classical approach through the use of higher-order edge switches. This method, which we denote by "k-edge switching", makes it possible to progressively improve the covered portion of a set of constrained graphs, thereby providing an increasing, asymptotically certain confidence on the statistical representativeness of the obtained sample. Comment: 15 pages
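For orientation, the classical pairwise switching procedure that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the authors' k-edge switching method: the function name `edge_switch`, its parameters, and the choice to count rejected proposals as steps are all assumptions made for the example. It preserves every vertex degree of a simple undirected graph.

```python
import random

def edge_switch(edges, n_attempts, seed=None):
    """Randomize a simple undirected graph by pairwise edge switches.

    Hypothetical helper: picks two edges (a, b) and (c, d) and proposes
    replacing them with (a, d) and (c, b), which keeps all degrees fixed.
    Proposals that would create a self-loop or a duplicate edge are
    rejected, and the rejected attempt still counts as a step.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Randomly orient the second edge so both rewirings are reachable.
        if rng.random() < 0.5:
            c, d = d, c
        if a == d or c == b:
            continue  # would create a self-loop
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a multi-edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.add(new1)
        edge_set.add(new2)
        edge_list[i], edge_list[j] = new1, new2
    return edge_list
```

Calling `edge_switch(edges, 200, seed=1)` on a small edge list returns a rewired edge list with the same degree sequence. As the abstract notes, pairwise switches alone may not reach every graph satisfying more general constraints, which is the motivation for the higher-order switches.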

arXiv.org e-Print Archive

HAL - UPEC / UPEM

Sampling motif-constrained ensembles of networks

Author: Altmann Eduardo G.
Fischer Rico
Leitao Jorge C.
Peixoto Tiago P.
Publication venue: 'American Physical Society (APS)'
Publication date: 30/10/2015

The statistical significance of network properties is conditioned on null models which satisfy specified properties but that are otherwise random. Exponential random graph models are a principled theoretical framework to generate such constrained ensembles, but they often fail in practice, either due to model inconsistency, or due to the impossibility of sampling networks from them. These problems affect the important case of networks with prescribed clustering coefficient or number of small connected subgraphs (motifs). In this paper we use the Wang-Landau method to obtain a multicanonical sampling that overcomes both these problems. We sample, in polynomial time, networks with arbitrary degree sequences from ensembles with imposed motif counts. Applying this method to social networks, we investigate the relation between transitivity and homophily, and we quantify the correlation between different types of motifs, finding that single motifs can explain up to 60% of the variation of motif profiles. Comment: Updated version, as published in the journal. 7 pages, 5 figures, one Supplemental Material

arXiv.org e-Print Archive

Some results on more flexible versions of Graph Motif

Author: Rizzi Romeo
Sikora Florian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/09/2014

The problems studied in this paper originate from Graph Motif, a problem introduced in 2006 in the context of biological networks. Informally speaking, it consists in deciding if a multiset of colors occurs in a connected subgraph of a vertex-colored graph. Due to the high rate of noise in the biological data, more flexible definitions of the problem have been outlined. We present in this paper two inapproximability results for two different optimization variants of Graph Motif: one where the size of the solution is maximized, the other where the number of substitutions of colors to obtain the motif from the solution is minimized. We also study a decision version of Graph Motif where the connectivity constraint is replaced by the well-known notion of graph modularity. While the problem remains NP-complete, it allows algorithms in FPT for biologically relevant parameterizations.
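The basic decision problem described in this abstract can be made concrete with a brute-force sketch. This is only an illustration of the problem statement, not an algorithm from the paper: the names `graph_motif` and `is_connected` and the adjacency-dictionary input format are assumptions, and the exhaustive search takes exponential time, as expected for an NP-complete problem.

```python
from collections import Counter
from itertools import combinations

def is_connected(vertices, adj):
    """Return True if `vertices` induce a connected subgraph of `adj`."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices

def graph_motif(adj, color, motif):
    """Hypothetical brute-force check for Graph Motif.

    adj:   dict mapping each vertex to an iterable of neighbours
    color: dict mapping each vertex to its color
    motif: iterable of colors (a multiset)
    Returns a tuple of vertices whose colors match the motif exactly and
    that induce a connected subgraph, or None if no such set exists.
    """
    target = Counter(motif)
    k = sum(target.values())
    if k == 0:
        return ()  # the empty motif is trivially matched
    for subset in combinations(list(color), k):
        if Counter(color[v] for v in subset) == target and is_connected(subset, adj):
            return subset
    return None
```

On the path 1-2-3-4 with colors red, green, red, blue, the motif {green, blue} has no occurrence because vertices 2 and 4 are not adjacent, while {red, blue} is matched by vertices 3 and 4.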

arXiv.org e-Print Archive

CiteSeerX

A generic algorithm for layout of biological networks

Author: A Varma
B Balasundaram
B Genc
D Emig
D Koschützki
DP Bertsekas
DP Dobkin
E Grafahrend-Belau
F Schreiber
FA Kolpakov
Falk Schreiber
H Kitano
JM Six
K Han
K Kojima
K Sugiyama
K Wegner
Kim Marriott
M Ehrenberg
M Forster
M Kanehisa
M Krull
M Sirava
M Wybrow
Michael Wybrow
MY Becker
P Eades
PD Karp
R Milo
T Dwyer
T Fruchterman
T Kamada
Tim Dwyer
W Basalaj
W Li
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Biological networks are widely used to represent processes in biological systems and to capture interactions and dependencies between biological entities. Their size and complexity are steadily increasing due to the ongoing growth of knowledge in the life sciences. To aid understanding of biological networks, several algorithms for laying out and graphically representing networks and network analysis results have been developed. However, current algorithms are specialized to particular layout styles and therefore different algorithms are required for each kind of network and/or style of layout. This increases implementation effort and means that new algorithms must be developed for new layout styles. Furthermore, additional effort is necessary to compose different layout conventions in the same diagram. Also, the user cannot usually customize the placement of nodes to tailor the layout to their particular need or task, and there is little support for interactive network exploration. Results: We present a novel algorithm to visualize different biological networks and network analysis results in meaningful ways depending on network types and analysis outcome. Our method is based on constrained graph layout and we demonstrate how it can handle the drawing conventions used in biological networks. Conclusion: The presented algorithm offers the ability to produce many of the fundamental popular drawing styles while allowing the flexibility of constraints to further tailor these layouts.

Springer - Publisher Connector

Towards comprehensive structural motif mining for better fold annotation in the "twilight zone" of sequence dissimilarity

Author: Jintao Zhang
Jun Huan
Leonidas N. Carayannopoulos
Vincent Buhr
Yi Jia
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Automatic identification of structure fingerprints from a group of diverse protein structures is challenging, especially for proteins whose divergent amino acid sequences may fall into the “twilight” or “midnight” zones where pair-wise sequence identities to known sequences fall below 25% and sequence-based functional annotations often fail. Results: Here we report a novel graph database mining method and demonstrate its application to protein structure pattern identification and structure classification. The biologic motivation of our study is to recognize common structure patterns in “immunoevasins”, proteins mediating virus evasion of host immune defense. Our experimental study, using both viral and non-viral proteins, demonstrates the efficiency and efficacy of the proposed method. Conclusions: We present a theoretic framework, offer a practical software implementation for incorporating prior domain knowledge, such as substitution matrices as studied here, and devise an efficient algorithm to identify approximate matched frequent subgraphs. By doing so, we significantly expanded the analytical power of sophisticated data mining algorithms in dealing with large volumes of complicated and noisy protein structure data. Without loss of generality, the choice of appropriate compatibility matrices allows our method to be easily employed in domains where subgraph labels have some uncertainty.

CiteSeerX

Springer - Publisher Connector


A computer system to perform structure comparison using TOPS representations of protein structure

Author: Gilbert D
Thornton J
Viksna J
Westhead V
Publication venue: 'Elsevier BV'
Publication date: 01/01/2001

We describe the design and implementation of a fast topology-based method for protein structure comparison. The approach uses the TOPS topological representation of protein structure, aligning two structures using a common discovered pattern and generating a measure of distance derived from an insert score. Heavy use is made of a constraint-based pattern matching algorithm for TOPS diagrams that we have designed and described elsewhere (Gilbert et al., 1999). The comparison system is maintained at the European Bioinformatics Institute and is available over the Web at tops.ebi.ac.uk/tops. Users submit a structure description in Protein Data Bank (PDB) format and can compare it with structures in the entire PDB or a representative subset of protein domains, receiving the results by email.

Brunel University Research Archive