Higher-order fluctuations in dense random graph models
Our main results are quantitative bounds in the multivariate normal
approximation of centred subgraph counts in random graphs generated by a
general graphon and independent vertex labels. The main motivation to
investigate these statistics is the fact that they are key to understanding
fluctuations of regular subgraph counts -- the cornerstone of dense graph limit
theory -- since they act as an orthogonal basis of a corresponding space.
We also identify the resulting limiting Gaussian stochastic measures by means
of the theory of generalised U-statistics and Gaussian Hilbert spaces, which we
think is a suitable framework to describe and understand higher-order
fluctuations in dense random graph models. With this article, we believe we
answer the question "What is the central limit theorem of dense graph limit
theory?".
On asymptotic joint distributions of cherries and pitchforks for random phylogenetic trees
Tree shape statistics provide valuable quantitative insights into the evolutionary mechanisms underpinning phylogenetic trees, a commonly used graph representation of evolutionary relationships among taxonomic units ranging from viruses to species. We study two subtree counting statistics, the number of cherries and the number of pitchforks, for random phylogenetic trees generated by two widely used null tree models: the proportional to distinguishable arrangements (PDA) and the Yule-Harding-Kingman (YHK) models. By developing limit theorems for a version of extended Pólya urn models in which negative entries are permitted in the replacement matrices, we deduce strong laws of large numbers and central limit theorems for the joint distributions of these two counting statistics under the PDA and YHK models. Our results indicate that the limiting behaviour of these two statistics, when appropriately scaled by the number of leaves in the underlying trees, is independent of the initial tree used in the tree generating process.
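To make the two statistics concrete, the following is a minimal simulation sketch and not the urn-based method of the abstract. It assumes the usual description of the YHK model as repeatedly splitting a uniformly chosen leaf, takes a cherry to be an internal node whose two children are both leaves, and takes a pitchfork to be an internal node with one leaf child and one cherry child (so the cherry inside a pitchfork is also counted as a cherry). The function names `yule_tree` and `count_cherries_pitchforks` are hypothetical.

```python
import random


def yule_tree(n, rng):
    """Grow a rooted binary tree with n leaves (assumed YHK growth rule).

    At each step a uniformly chosen leaf is replaced by an internal node
    with two new leaf children. Returns a dict mapping each node to a
    (left, right) tuple, or to None if the node is a leaf.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    children = {0: None}
    leaves = [0]
    next_id = 1
    while len(leaves) < n:
        i = rng.randrange(len(leaves))
        v = leaves[i]
        a, b = next_id, next_id + 1
        next_id += 2
        children[v] = (a, b)
        children[a] = None
        children[b] = None
        # v is no longer a leaf: reuse its slot for a, append b.
        leaves[i] = a
        leaves.append(b)
    return children


def count_cherries_pitchforks(children):
    """Return (number of cherries, number of pitchforks) of the tree."""

    def is_leaf(v):
        return children[v] is None

    def is_cherry(v):
        return not is_leaf(v) and all(is_leaf(c) for c in children[v])

    cherries = 0
    pitchforks = 0
    for v, ch in children.items():
        if ch is None:
            continue
        a, b = ch
        if is_leaf(a) and is_leaf(b):
            cherries += 1
        elif (is_leaf(a) and is_cherry(b)) or (is_leaf(b) and is_cherry(a)):
            pitchforks += 1
    return cherries, pitchforks


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, for reproducibility only
    n, reps = 500, 200
    totals = [0, 0]
    for _ in range(reps):
        c, p = count_cherries_pitchforks(yule_tree(n, rng))
        totals[0] += c
        totals[1] += p
    print("mean cherries / n  :", totals[0] / (reps * n))
    print("mean pitchforks / n:", totals[1] / (reps * n))
```

Scaling the counts by the number of leaves, as in the last two lines, mirrors the normalisation the abstract refers to. Starting the growth from a different initial tree would be one way to probe, empirically, the stated independence from the initial tree.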
Distributions of 4-subtree patterns for uniform random unrooted phylogenetic trees
Tree shape statistics based on peripheral structures have been used to study evolutionary mechanisms and inference methods. Partially motivated by a recent study by Pouryahya and Sankoff on modeling the accumulation of subgenomes in the evolution of polyploids, we present the distribution of subtree patterns with four or fewer leaves for the unrooted Proportional to Distinguishable Arrangements (PDA) model. We derive a recursive formula for computing the joint distributions, as well as a Strong Law of Large Numbers and a Central Limit Theorem for the joint distributions. This enables us to confirm several conjectures proposed by Pouryahya and Sankoff, and to provide some theoretical insights into their observations. Based on their empirical datasets, we demonstrate that a statistical test based on the joint distribution could be more sensitive than tests based on a single subtree pattern in detecting evolutionary forces such as whole genome duplication.
Higher-order fluctuations in dense random graph models
10.1214/21-ejp708, Electronic Journal of Probability, 2613
Distributions of cherries and pitchforks for the Ford model
We study two fringe subtree counting statistics, the number of cherries and the number of pitchforks, for Ford's α model, a one-parameter family of random phylogenetic tree models that includes the uniform and the Yule models, two tree models commonly used in phylogenetics. Based on a nonuniform version of the extended Pólya urn models in which negative entries are permitted in the replacement matrices, we obtain the strong law of large numbers and the central limit theorem for the joint distribution of these two count statistics for the Ford model. Furthermore, we derive a recursive formula for computing the exact joint distribution of these two statistics. This leads to exact formulas for their means and higher-order asymptotic expansions of their second moments, which allows us to identify a critical parameter value for the correlation between these two statistics. That is, when n is sufficiently large, they are negatively correlated for 0≤α≤1/2 and positively correlated for 1/2<α<1.
Reconstructing an Epidemic Outbreak Using Steiner Connectivity
Only a subset of infections is actually observed in an outbreak, for reasons such as asymptomatic cases and under-reporting. Therefore, reconstructing an epidemic cascade given some observed cases is an important step in responding to such an outbreak. A maximum likelihood solution to this problem (referred to as CascadeMLE) can be shown to be a variation of the classical Steiner subgraph problem, which connects a subset of observed infections. In contrast to prior works on epidemic reconstruction, which consider the standard Steiner tree objective, we show that a solution to CascadeMLE, based on the actual MLE objective, has a very different structure. We design a logarithmic approximation algorithm for CascadeMLE, and evaluate it on multiple synthetic and social contact networks, including a contact network constructed for a hospital. Our algorithm performs significantly better than a prior baseline.
GIST- A Rare Tumor in Paediatric Age Group
A 13-year-old male child presented to us with complaints of abdominal pain radiating towards the back. The child underwent contrast-enhanced CT of the abdomen, which showed volvulus. He was operated on immediately, and the resected part of the gut was sent for histopathological examination. The histopathology report showed features suggestive of a gastrointestinal stromal tumour; the tumour marker CD-117 was sent for confirmation and was found to be positive.