Hypertree posets and hooked partitions
We adapt here the computation of characters on incidence Hopf algebras
introduced by W. Schmitt in the 1990s to a family mixing bounded and unbounded
posets. We then apply our results to the family of hypertree posets and
partition posets. As a consequence, we obtain some enumerative formulas and a
new proof for the computation of the Moebius numbers of the hypertree posets.
Moreover, we compute the coproduct of the incidence Hopf algebra and recover a
known formula for the number of hypertrees with fixed valency set and edge
sizes set. Comment: 18 pages
A clique-difference encoding scheme for labelled k-path graphs
We present in this paper a codeword for labelled k-path graphs. Structural properties of this codeword are investigated, leading to the solution of two important problems: determining the exact number of labelled k-path graphs with n vertices, and locating a Hamiltonian path in a given k-path graph in time O(n). The corresponding encoding scheme is also presented, providing linear-time algorithms for encoding and decoding.
An edge-weighted hook formula for labelled trees
A number of hook formulas and hook summation formulas have previously
appeared, involving various classes of trees. One of these classes of trees is
rooted trees with labelled vertices, in which the labels increase along every
chain from the root vertex to a leaf. In this paper we give a new hook
summation formula for these (unordered increasing) trees, by introducing a new
set of indeterminates indexed by pairs of vertices, that we call edge weights.
This new result generalizes a previous result by Féray and Goulden, that
arose in the context of representations of the symmetric group via the study of
Kerov's character polynomials. Our proof is by means of a combinatorial
bijection that is a generalization of the Prüfer code for labelled trees. Comment: 25 pages, 9 figures. Author-produced copy of the article to appear in
Journal of Combinatorics, including referee's suggestions
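For readers unfamiliar with the Prüfer code mentioned in the abstract above, the following is a minimal sketch of the classical (ungeneralized) bijection between labelled trees on vertices 1..n and sequences of length n-2. It is not the paper's edge-weighted construction; the function names `prufer_encode` and `prufer_decode` are illustrative choices, not taken from the source.

```python
import heapq


def prufer_encode(n, edges):
    """Return the classical Prüfer sequence of a labelled tree on vertices 1..n."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Min-heap of current leaves, so the smallest leaf is always removed first.
    leaves = [v for v in adj if len(adj[v]) == 1]
    heapq.heapify(leaves)
    seq = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        (parent,) = adj[leaf]          # a leaf has exactly one neighbour
        seq.append(parent)
        adj[parent].discard(leaf)
        adj[leaf].clear()
        if len(adj[parent]) == 1:      # the neighbour has just become a leaf
            heapq.heappush(leaves, parent)
    return seq


def prufer_decode(seq):
    """Rebuild the edge list of the labelled tree encoded by a Prüfer sequence."""
    n = len(seq) + 2
    # Each vertex appears in the sequence (degree - 1) times.
    degree = [1] * (n + 1)
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


tree = [(1, 4), (2, 4), (3, 4), (4, 5), (5, 6)]
code = prufer_encode(6, tree)
print(code)
print(prufer_decode(code))
```

Because every sequence of length n-2 over {1, ..., n} decodes to a distinct tree, the round trip above is also a constructive way to see why there are n^(n-2) labelled trees on n vertices.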
Leaping through tree space: continuous phylogenetic inference for rooted and unrooted trees
Phylogenetics is now fundamental in life sciences, providing insights into
the earliest branches of life and the origins and spread of epidemics. However,
finding suitable phylogenies from the vast space of possible trees remains
challenging. To address this problem, for the first time, we perform both tree
exploration and inference in a continuous space where the computation of
gradients is possible. This continuous relaxation allows for major leaps across
tree space in both rooted and unrooted trees, and is less susceptible to
convergence to local minima. Our approach outperforms the current best methods
for inference on unrooted trees and, in simulation, accurately infers the tree
and root in ultrametric cases. The approach is effective in cases of empirical
data with negligible amounts of data, which we demonstrate on the phylogeny of
jawed vertebrates. Indeed, only a few genes with an ultrametric signal were
generally sufficient for resolving the major lineages of vertebrates. With
cubic-time complexity and efficient optimisation via automatic differentiation,
our method presents an effective way forwards for exploring the most difficult,
data-deficient phylogenetic questions. Comment: 13 pages, 4 figures, 14 supplementary pages, 2 supplementary figures
Combinatorial Enumeration of Graphs
In this chapter, I discuss some of the enumerative combinatorics problems that have interested researchers in recent decades. For some of these enumeration problems it is possible to obtain closed-form expressions, while for others only an estimate can be obtained using asymptotic methods. The chapter covers some of the methods used in both cases, as well as some applications of graph enumeration in different fields. An overview of the enumeration of trees is given as an example of a combinatorial problem solved in closed form. Similarly, the enumeration of regular graphs is discussed as an example of a problem for which a closed-form solution is hard to obtain, and to which the asymptotic estimation methods frequently used in analytic combinatorics are applied instead. Finally, an application of enumerative combinatorics is given: obtaining a result on the applicability of node-selection criteria in a virus-spreading control problem.
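As an illustration of what a closed-form answer looks like for tree enumeration, the standard example is Cayley's formula for the number of labelled trees on n vertices. The abstract does not say which tree-counting result the chapter covers, so taking Cayley's formula as the intended example is an assumption; the symbol T_n is introduced here for illustration only.

\[
T_n = n^{\,n-2}, \qquad \text{for example } T_3 = 3,\quad T_4 = 16,\quad T_5 = 125 .
\]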
Evaluating, Accelerating and Extending the Multispecies Coalescent Model of Evolution
So much research builds on evolutionary histories of species and
genes. They are used in genomics to infer synteny, in ecology to
describe and predict biodiversity, and in molecular biology to
transfer knowledge acquired in model organisms to humans and
crops. Beyond downstream applications, expanding our knowledge of
life on Earth is important in its own right. From Naturalis
Historia to On the Origin of Species, the acquisition of this
knowledge has been a part of human development.
Evolutionary histories are commonly represented as trees, where a
common ancestor progressively splits into descendant species or
alleles. Time trees add more information by using height to
represent genetic distance or elapsed time. Species and gene
trees can be inferred from molecular sequences using methods
which are explicitly model-based, or implicitly assume or are
statistically consistent with a particular model of evolution.
One such model, the multispecies coalescent (MSC), is the topic
of my thesis. Under this model, separate trees are inferred for
the species history and for each gene’s history. Gene trees are
embedded within the species tree according to a coalescent
process.
Researchers often avoid the MSC when reconstructing time trees
because of claims that available implementations are too
computationally demanding. Instead, the species history is
inferred using a single tree by concatenating the sequences from
each gene. I began my thesis research by evaluating the effect of
this approximation. In a realistic simulation based on parameters
inferred from empirical data, concatenation was grossly
inaccurate, especially when estimating recent species divergence
times. In a later simulation study I demonstrated that when using
concatenation, credible intervals often excluded the true
values.
To address reluctance towards using the MSC, I developed a faster
implementation of the model. StarBEAST2 is a Markov chain Monte
Carlo (MCMC) method, meaning it characterizes the probability
distribution over trees by randomly walking the parameter space.
I improved computational performance by developing more efficient
proposals used to traverse the space, and reducing the number of
parameters in the model through analytical integration of
population sizes.
Despite its sophistication, the MSC has theoretical limitations.
One is that the substitution rate is assumed to stay constant, or
uncorrelated between lineages of different genes. However,
substitution rates do vary and are associated with species traits
like body size. I addressed this assumption in StarBEAST2 by
extending the MSC to estimate substitution rates for each
species. Another assumption is that genetic material cannot be
transferred horizontally, but a more general model called the
multispecies network coalescent (MSNC) permits introgression of
alleles across species boundaries. My collaborators and I have
developed and evaluated an MCMC implementation of the MSNC.
My final thesis project was to combine the MSC with the
fossilized birth-death (FBD) process, which models how species
are fossilized and sampled through time. To demonstrate the
utility of the FBD-MSC model, I used it to reconstruct the
evolutionary history of Caninae (dogs and foxes) using fossil
data and molecular sequences.
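The abstract above describes MCMC as characterizing a probability distribution by randomly walking the parameter space. The sketch below shows that idea in its simplest form: a random-walk Metropolis sampler on a one-dimensional toy target. It is not StarBEAST2 and does not operate on trees; the function `metropolis`, its parameters, and the standard-normal target are all assumptions chosen for illustration.

```python
import math
import random


def metropolis(log_density, start, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log density."""
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


def log_standard_normal(x):
    """Toy target: standard normal log density, up to an additive constant."""
    return -0.5 * x * x


samples = metropolis(log_standard_normal, start=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean ~ {mean:.2f}, sample variance ~ {variance:.2f}")
```

In a phylogenetic sampler the state would be a tree together with its parameters, and the proposal would be a tree-modifying move rather than a Gaussian step. The efficiency gains described in the abstract come from better proposals and from integrating out parameters, neither of which this toy example attempts.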