13 research outputs found

    Higher-order fluctuations in dense random graph models

    Our main results are quantitative bounds in the multivariate normal approximation of centred subgraph counts in random graphs generated by a general graphon and independent vertex labels. The main motivation to investigate these statistics is the fact that they are key to understanding fluctuations of regular subgraph counts -- the cornerstone of dense graph limit theory -- since they act as an orthogonal basis of a corresponding $L_2$ space. We also identify the resulting limiting Gaussian stochastic measures by means of the theory of generalised U-statistics and Gaussian Hilbert spaces, which we think is a suitable framework to describe and understand higher-order fluctuations in dense random graph models. With this article, we believe we answer the question "What is the central limit theorem of dense graph limit theory?". Comment: 28 pages.
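As a rough illustration of the model in the abstract above, the following Python sketch samples a random graph from a graphon with independent uniform vertex labels and computes plain (uncentred) edge and triangle counts. It does not implement the centred statistics or the normal approximation from the paper. The function names and the example graphon W(x, y) = xy are assumptions chosen for illustration only.

```python
import itertools
import random


def sample_graphon_graph(n, graphon, rng):
    """Sample a graph on vertices 0..n-1 from a graphon.

    Each vertex gets an independent Uniform(0, 1) label; the edge {i, j}
    is then included independently with probability graphon(label_i, label_j).
    Returns the edge set as pairs (i, j) with i < j.
    """
    labels = [rng.random() for _ in range(n)]
    edges = set()
    for i, j in itertools.combinations(range(n), 2):
        if rng.random() < graphon(labels[i], labels[j]):
            edges.add((i, j))
    return edges


def count_edges(edges):
    """Number of edges in the sampled graph."""
    return len(edges)


def count_triangles(n, edges):
    """Number of triangles, by brute force over all vertex triples i < j < k."""
    return sum(
        1
        for i, j, k in itertools.combinations(range(n), 3)
        if (i, j) in edges and (i, k) in edges and (j, k) in edges
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical example graphon, not taken from the paper.
    g = sample_graphon_graph(50, lambda x, y: x * y, rng)
    print(count_edges(g), count_triangles(50, g))
```

The brute-force triangle count is cubic in the number of vertices, which is adequate for small experiments but not for large graphs.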

    On asymptotic joint distributions of cherries and pitchforks for random phylogenetic trees

    Tree shape statistics provide valuable quantitative insights into evolutionary mechanisms underpinning phylogenetic trees, a commonly used graph representation of evolutionary relationships among taxonomic units ranging from viruses to species. We study two subtree counting statistics, the number of cherries and the number of pitchforks, for random phylogenetic trees generated by two widely used null tree models: the proportional to distinguishable arrangements (PDA) and the Yule-Harding-Kingman (YHK) models. By developing limit theorems for a version of extended Pólya urn models in which negative entries are permitted for their replacement matrices, we deduce the strong laws of large numbers and the central limit theorems for the joint distributions of these two counting statistics for the PDA and the YHK models. Our results indicate that the limiting behaviour of these two statistics, when appropriately scaled using the number of leaves in the underlying trees, is independent of the initial tree used in the tree generating process.
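The two statistics in the abstract above can be made concrete with a small simulation. The sketch below is a minimal illustration, not the authors' urn-based method. It assumes the usual description of the YHK process for rooted binary trees (repeatedly replace a uniformly chosen leaf by a cherry), counts a cherry as an internal node with two leaf children, and counts a pitchfork as an internal node whose children are a leaf and a cherry. All function names are hypothetical.

```python
import random


def yule_tree(n_leaves, rng):
    """Grow a rooted binary tree with n_leaves leaves under the YHK process.

    The tree is a list `children`, where children[v] is None for a leaf
    and a pair (left, right) of node indices for an internal node.
    Node 0 is the root.
    """
    if n_leaves < 1:
        raise ValueError("n_leaves must be at least 1")
    children = [None]  # start from a single leaf
    leaves = [0]
    while len(leaves) < n_leaves:
        i = rng.randrange(len(leaves))  # uniformly chosen leaf
        parent = leaves[i]
        left = len(children)
        right = left + 1
        children.extend([None, None])
        children[parent] = (left, right)  # the chosen leaf becomes a cherry
        leaves[i] = left
        leaves.append(right)
    return children


def count_cherries_and_pitchforks(children):
    """Return (number of cherries, number of pitchforks) of the tree."""

    def is_leaf(v):
        return children[v] is None

    def is_cherry(v):
        return not is_leaf(v) and all(is_leaf(c) for c in children[v])

    cherries = 0
    pitchforks = 0
    for kids in children:
        if kids is None:
            continue
        a, b = kids
        if is_leaf(a) and is_leaf(b):
            cherries += 1
        elif (is_leaf(a) and is_cherry(b)) or (is_leaf(b) and is_cherry(a)):
            pitchforks += 1
    return cherries, pitchforks


if __name__ == "__main__":
    n = 2000
    tree = yule_tree(n, random.Random(0))
    c, p = count_cherries_and_pitchforks(tree)
    # Scaled counts, in the spirit of the scaling by the number of leaves.
    print(c / n, p / n)
```

Running the script with several seeds and increasing `n` gives an informal view of how the scaled counts settle down, in line with the strong law of large numbers stated in the abstract.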

    Distributions of 4-subtree patterns for uniform random unrooted phylogenetic trees

    Tree shape statistics based on peripheral structures have been used to study evolutionary mechanisms and inference methods. Partially motivated by a recent study by Pouryahya and Sankoff on modeling the accumulation of subgenomes in the evolution of polyploids, we present the distribution of subtree patterns with four or fewer leaves for the unrooted Proportional to Distinguishable Arrangements (PDA) model. We derive a recursive formula for computing the joint distributions, as well as a Strong Law of Large Numbers and a Central Limit Theorem for the joint distributions. This enables us to confirm several conjectures proposed by Pouryahya and Sankoff, as well as provide some theoretical insights into their observations. Based on their empirical datasets, we demonstrate that a statistical test based on the joint distribution could be more sensitive than tests based on a single subtree pattern for detecting evolutionary forces such as whole genome duplication.

    Higher-order fluctuations in dense random graph models

    10.1214/21-ejp708, Electronic Journal of Probability, 2613

    Distributions of cherries and pitchforks for the Ford model

    We study two fringe subtree counting statistics, the number of cherries and the number of pitchforks, for Ford's α model, a one-parameter family of random phylogenetic tree models that includes the uniform and the Yule models, two tree models commonly used in phylogenetics. Based on a nonuniform version of the extended Pólya urn models in which negative entries are permitted for their replacement matrices, we obtain the strong law of large numbers and the central limit theorem for the joint distribution of these two count statistics for the Ford model. Furthermore, we derive a recursive formula for computing the exact joint distribution of these two statistics. This leads to exact formulas for their means and higher order asymptotic expansions of their second moments, which allows us to identify a critical parameter value for the correlation between these two statistics. That is, when n is sufficiently large, they are negatively correlated for 0 ≤ α ≤ 1/2 and positively correlated for 1/2 < α < 1.

    Reconstructing an Epidemic Outbreak Using Steiner Connectivity

    Only a subset of infections is actually observed in an outbreak, for reasons such as asymptomatic cases and under-reporting. Therefore, reconstructing an epidemic cascade given some observed cases is an important step in responding to such an outbreak. A maximum likelihood solution to this problem (referred to as CascadeMLE) can be shown to be a variation of the classical Steiner subgraph problem, which connects a subset of observed infections. In contrast to prior works on epidemic reconstruction, which consider the standard Steiner tree objective, we show that a solution to CascadeMLE, based on the actual MLE objective, has a very different structure. We design a logarithmic approximation algorithm for CascadeMLE, and evaluate it on multiple synthetic and social contact networks, including a contact network constructed for a hospital. Our algorithm performs significantly better than a prior baseline.

    GIST- A Rare Tumor in Paediatric Age Group

    A 13-year-old male child presented to us with abdominal pain radiating towards the back. Contrast-enhanced CT of the abdomen showed volvulus; the child was operated on immediately, and the resected part of the gut was sent for histopathological examination. The histopathology report showed features suggestive of gastrointestinal stromal tumour. The tumour marker CD-117 was tested for confirmation and was found to be positive.