2 research outputs found

### Enumeration of coalescent histories for caterpillar species trees and $p$-pseudocaterpillar gene trees

For a fixed set $X$ containing $n$ taxon labels, an ordered pair consisting
of a gene tree topology $G$ and a species tree $S$ bijectively labeled with the
labels of $X$ possesses a set of coalescent histories, that is, mappings from the set
of internal nodes of $G$ to the set of edges of $S$ describing possible lists
of edges in $S$ on which the coalescences in $G$ take place. Enumerations of
coalescent histories for gene trees and species trees have produced suggestive
results regarding the pairs $(G,S)$ that, for a fixed $n$, have the largest
number of coalescent histories. We define a class of 2-cherry binary tree
topologies that we term $p$-pseudocaterpillars, examining coalescent histories
for non-matching pairs $(G,S)$, in the case in which $S$ has a caterpillar
shape and $G$ has a $p$-pseudocaterpillar shape. Using a construction that
associates coalescent histories for $(G,S)$ with a class of "roadblocked"
monotonic paths, we identify the $p$-pseudocaterpillar labeled gene tree
topology that, for a fixed caterpillar labeled species tree topology, gives
rise to the largest number of coalescent histories. The shape that maximizes
the number of coalescent histories places the "second" cherry of the
$p$-pseudocaterpillar equidistantly from the root of the "first" cherry and
from the tree root. A symmetry in the numbers of coalescent histories for
$p$-pseudocaterpillar gene trees and caterpillar species trees is seen to exist
around the maximizing value of the parameter $p$. The results provide insight
into the factors that influence the number of coalescent histories possible for
a given gene tree and species tree.
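
As a rough illustration of the objects being counted, the following Python sketch enumerates coalescent histories by direct recursion when $S$ is a caterpillar. It assumes the usual definition of a coalescent history: each internal node of $G$ is mapped to an edge of $S$ that is ancestral to all of that node's leaves, and a descendant node of $G$ is never mapped above its ancestors. The helper names (`caterpillar`, `leaf_labels`, `count_histories`), the nested-tuple encoding, and the labeling $((\ldots((1,2),3),\ldots),n)$ of the species tree are assumptions made for this example. It is not the roadblocked-path construction of the abstract, and the 2-cherry gene tree used at the end is a hypothetical example not tied to any particular value of $p$.

```python
from functools import lru_cache


def caterpillar(n):
    """Caterpillar topology ((..((1,2),3),..),n) as nested 2-tuples (n >= 2)."""
    tree = (1, 2)
    for label in range(3, n + 1):
        tree = (tree, label)
    return tree


def leaf_labels(tree):
    """Tuple of all leaf labels below a node (a leaf is a plain int)."""
    if isinstance(tree, int):
        return (tree,)
    return leaf_labels(tree[0]) + leaf_labels(tree[1])


def count_histories(gene_tree, n):
    """Count coalescent histories of gene_tree on the caterpillar species tree.

    Species-tree edges are indexed by level 1..n-1, where level i is the edge
    directly above the node that joins taxa 1..i+1 (level n-1 is the root edge).
    """

    @lru_cache(maxsize=None)
    def count(node, cap):
        # Number of valid assignments for the subtree at `node`
        # when `node` itself must sit at level <= cap.
        if isinstance(node, int):
            return 1  # leaves carry no coalescence
        # Lowest admissible level: the edge above the species-tree
        # node that is the common ancestor of all leaves of `node`.
        lowest = max(leaf_labels(node)) - 1
        total = 0
        for level in range(lowest, cap + 1):
            # Children may coalesce at this level or anywhere below it.
            total += count(node[0], level) * count(node[1], level)
        return total

    return count(gene_tree, n - 1)


# Matching caterpillar pairs for n = 2..6.
matching = [count_histories(caterpillar(n), n) for n in range(2, 7)]

# Hypothetical non-matching 2-cherry gene tree on 5 taxa.
two_cherry = count_histories((((1, 2), (3, 4)), 5), 5)
```

For matching caterpillar pairs with $n = 2, \ldots, 6$, this recursion gives 1, 2, 5, 14, 42, the Catalan numbers that are known for that case. Varying the position of the second cherry in the gene tree and comparing counts is one way to explore, for small $n$, the kind of maximization the abstract describes.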

### A compendium of covariances and correlation coefficients of coalescent tree properties

Gene genealogies are frequently studied by measuring properties such as their
height ($H$), length ($L$), sum of external branches ($E$), sum of internal
branches ($I$), and mean of their two basal branches ($B$), as well as the
coalescence times that contribute to the other genealogical features ($T$). These tree
properties and their relationships can provide insight into the effects of
population-genetic processes on genealogies and genetic sequences. Here, under
the coalescent model, we study the 15 correlations among pairs of features of
genealogical trees: $H_n$, $L_n$, $E_n$, $I_n$, $B_n$, and $T_k$ for a sample
of size $n$, with $2 \leq k \leq n$. We report high correlations among $H_n$,
$L_n$, $I_n$, and $B_n$, with all pairwise correlations of these quantities
having values greater than or equal to $\sqrt{6} [6 \zeta(3) + 6 - \pi^2] / (
\pi \sqrt{18 + 9\pi^2 - \pi^4}) \approx 0.84930$ in the limit as $n \rightarrow
\infty$. Although $E_n$ has an expectation of 2 for all $n$ and $H_n$ has
expectation 2 in the limit as $n \rightarrow \infty$, their limiting
correlation is 0. The results contribute toward understanding features of the
shapes of coalescent trees.
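
The closed-form lower bound quoted above can be checked numerically. The short Python sketch below evaluates the expression using only the standard library, and also computes $E[H_n]$ to show its approach to 2. The function names are illustrative. The $\zeta(3)$ approximation (a partial sum plus an integral tail estimate) and the time scaling $E[T_k] = 2/(k(k-1))$ are assumptions of this example; the scaling is the standard coalescent one and is consistent with the stated limit $E[H_n] \to 2$.

```python
import math


def zeta3(terms=200_000):
    """Approximate zeta(3) by a partial sum plus the tail estimate 1/(2 N^2)."""
    partial = math.fsum(1.0 / k**3 for k in range(1, terms + 1))
    return partial + 1.0 / (2 * terms**2)


def limiting_correlation_bound():
    """Evaluate sqrt(6) [6 zeta(3) + 6 - pi^2] / (pi sqrt(18 + 9 pi^2 - pi^4))."""
    pi = math.pi
    numerator = math.sqrt(6) * (6 * zeta3() + 6 - pi**2)
    denominator = pi * math.sqrt(18 + 9 * pi**2 - pi**4)
    return numerator / denominator


def expected_height(n):
    """E[H_n] as the sum of E[T_k] = 2 / (k (k - 1)) for k = 2..n (assumed scaling)."""
    return math.fsum(2.0 / (k * (k - 1)) for k in range(2, n + 1))


bound = limiting_correlation_bound()
heights = {n: expected_height(n) for n in (2, 10, 100, 1000)}
```

Under the assumed scaling the sum telescopes to $E[H_n] = 2(1 - 1/n)$, so `heights` increases toward 2 as $n$ grows, and `bound` agrees with the quoted value 0.84930 to the precision given.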