    Compression of dynamic graphs generated by a duplication model

    We continue building up the information theory of non-sequential data structures such as trees, sets, and graphs. In this paper, we consider dynamic graphs generated by a full duplication model in which a new vertex selects an existing vertex and copies all of its neighbors. We ask how many bits are needed to describe the labeled and unlabeled versions of such graphs. We first estimate the entropies of both versions and then present compression algorithms that are asymptotically optimal up to two bits. Interestingly, for the full duplication model the labeled version needs Θ(n) bits, while its unlabeled version (structure) can be described by Θ(log n) bits due to a significant amount of symmetry (i.e., a large average size of the automorphism group of sample graphs).
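The full duplication step described in this abstract can be simulated in a few lines. The sketch below is an illustration only, not the paper's compression algorithm; the function names, the triangle seed graph, and the uniform choice of the copied vertex are assumptions. It groups vertices by identical neighborhoods ("twin classes") to give a feel for the symmetry the abstract mentions: each new vertex joins the class of the vertex it copies, so the number of classes never grows beyond that of the seed.

```python
import random
from collections import Counter


def full_duplication(seed_adj, n, rng):
    """Grow a graph to n vertices by full duplication.

    Assumes seed vertices are labeled 0..k-1 and that the copied
    vertex is chosen uniformly at random (an assumption of this sketch).
    """
    adj = {v: set(nbrs) for v, nbrs in seed_adj.items()}
    while len(adj) < n:
        new = len(adj)                 # next unused label
        parent = rng.randrange(new)    # existing vertex to copy
        adj[new] = set(adj[parent])    # copy all of its neighbors
        for u in adj[new]:
            adj[u].add(new)            # keep adjacency symmetric
    return adj


def twin_class_sizes(adj):
    """Sizes of the groups of vertices that share the same neighborhood."""
    classes = Counter(frozenset(nbrs) for nbrs in adj.values())
    return sorted(classes.values())


# Hypothetical seed: a triangle, whose three neighborhoods are all distinct.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
graph = full_duplication(triangle, 50, random.Random(0))
sizes = twin_class_sizes(graph)
print(len(sizes), sum(sizes))
```

Under these assumptions the unlabeled structure is captured by the seed graph plus the class sizes, each of which is an integer of at most n, which is consistent with the logarithmic bit count stated in the abstract.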

    Power-Law Degree Distribution in the Connected Component of a Duplication Graph

    We study the partial duplication dynamic graph model, introduced by Bhan et al. [Bhan et al., 2002], in which a newly arrived node selects an existing node at random and connects to each of its neighbors with probability p. Such a dynamic network is widely considered to be a good model for various biological networks such as protein-protein interaction networks. This model is discussed in numerous publications, with only a few recent rigorous results, especially for the degree distribution. Recently Jordan [Jordan, 2018] proved that for 0 < p < 1/e the degree distribution of the connected component is stationary and approximately follows a power law. In this paper we rigorously prove that the tail is indeed a true power law, that is, we show that the degree distribution of a randomly selected node in the connected component decays like C/k^β, where C is an explicit constant and β ≠ 2 is a non-trivial solution of p^(β-2) + β - 3 = 0. This holds regardless of the structure of the initial graph, as long as it is connected and has at least two vertices. To establish this finding we apply analytic combinatorics tools, in particular the Mellin transform and singularity analysis.
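The partial duplication step in this abstract can be sketched as a small simulation. This is a minimal illustration and not the authors' analysis; the function names, the single-edge seed, the value p = 0.3 (chosen to lie below 1/e), and the independent coin flip per neighbor are assumptions of the sketch. Setting p = 1 copies every neighbor, and p = 0 leaves every new node isolated.

```python
import random
from collections import Counter


def partial_duplication(seed_adj, n, p, rng):
    """Grow a graph to n nodes by partial duplication.

    Assumes seed nodes are labeled 0..k-1. Each new node picks an
    existing node uniformly at random and keeps each of that node's
    neighbors independently with probability p.
    """
    adj = {v: set(nbrs) for v, nbrs in seed_adj.items()}
    while len(adj) < n:
        new = len(adj)
        parent = rng.randrange(new)
        kept = {u for u in adj[parent] if rng.random() < p}
        adj[new] = kept
        for u in kept:
            adj[u].add(new)            # keep adjacency symmetric
    return adj


def degree_counts(adj):
    """Map each degree to the number of nodes having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


# Hypothetical seed: a single edge (connected, two vertices).
edge = {0: {1}, 1: {0}}
sample = partial_duplication(edge, 2000, 0.3, random.Random(1))
counts = degree_counts(sample)
print(sorted(counts.items())[:5])
```

Nodes of degree 0 in `counts` are the ones that copied no neighbor; they lie outside the connected component that the abstract's result concerns, so an empirical tail estimate would need to filter them out first.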

    Gene duplication and subsequent diversification strongly affect phenotypic evolvability and robustness.

    We study the effects of non-determinism and gene duplication on the structure of genotype-phenotype (GP) maps by introducing a non-deterministic version of the Polyomino self-assembly model. This model has previously been used in a variety of contexts to model the assembly and evolution of protein quaternary structure. Firstly, we show the limit of the current deterministic paradigm which leads to built-in anti-correlation between evolvability and robustness at the genotypic level. We develop a set of metrics to measure structural properties of GP maps in a non-deterministic setting and use them to evaluate the effects of gene duplication and subsequent diversification. Our generalized versions of evolvability and robustness exhibit positive correlation for a subset of genotypes. This positive correlation is only possible because non-deterministic phenotypes can contribute to both robustness and evolvability. Secondly, we show that duplication increases robustness and reduces evolvability initially, but that the subsequent diversification that duplication enables has a stronger, inverse effect, greatly increasing evolvability and reducing robustness relative to their original values

    A survey of statistical network models

    Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references