32 research outputs found
A repertoire for additive functionals of uniformly distributed m-ary search trees
Using recent results on singularity analysis for Hadamard products of
generating functions, we obtain the limiting distributions for additive
functionals on m-ary search trees on keys with toll sequence (i)
with ( and correspond roughly
to the space requirement and total path length, respectively); (ii) , which corresponds to the so-called shape functional; and (iii)
, which corresponds to the number of leaves.
Comment: 26 pages; v2 expands on the introduction by comparing the results
with other probability models
Scalable String and Suffix Sorting: Algorithms, Techniques, and Tools
This dissertation focuses on two fundamental sorting problems: string sorting
and suffix sorting. The first part considers parallel string sorting on
shared-memory multi-core machines, the second part external memory suffix
sorting using the induced sorting principle, and the third part distributed
external memory suffix sorting with a new distributed algorithmic big data
framework named Thrill.
Comment: 396 pages, dissertation, Karlsruher Instituts für Technologie
(2018). arXiv admin note: text overlap with arXiv:1101.3448 by other authors
Congruence properties of depths in some random trees
Consider a random recursive tree with n vertices. We show that the number of
vertices with even depth is asymptotically normal as n tends to infinity. The
same is true for the number of vertices of depth divisible by m for m=3, 4 or
5; in all four cases the variance grows linearly. On the other hand, for m at
least 7, the number is not asymptotically normal, and the variance grows faster
than linear in n. The case m=6 is intermediate: the number is asymptotically
normal but the variance is of order n log n.
This is a simple and striking example of a type of phase transition that has
been observed by other authors in several cases. We prove, and perhaps explain,
this non-intuitive behaviour using a translation to a generalized Polya urn.
Similar results hold for a random binary search tree; now the number of
vertices of depth divisible by m is asymptotically normal for m at most 8 but
not for m at least 9, and the variance grows linearly in the first case but
faster in the second. (There is no intermediate case.)
In contrast, we show that for conditioned Galton-Watson trees, including
random labelled trees and random binary trees, there is no such phase
transition: the number is asymptotically normal for every m.
Comment: 23 pages
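The quantity studied in this abstract can be explored numerically. The sketch below is only an illustrative simulation, not the authors' method (the abstract says their proof uses a generalized Polya urn): it grows a random recursive tree by attaching each new vertex to a uniformly chosen existing vertex, then counts the vertices whose depth is divisible by m. The function names `depth_counts_mod` and `sample_variance` are hypothetical, and the root is assumed to sit at depth 0.

```python
import random


def depth_counts_mod(n, m, rng):
    """Grow a random recursive tree on n vertices and return the number of
    vertices whose depth is divisible by m (root assumed at depth 0)."""
    depths = [0]   # depth of each vertex, in order of insertion
    count = 1      # the root has depth 0, which is divisible by every m
    for _ in range(1, n):
        # Each new vertex picks its parent uniformly among existing vertices.
        parent = rng.randrange(len(depths))
        d = depths[parent] + 1
        depths.append(d)
        if d % m == 0:
            count += 1
    return count


def sample_variance(n, m, trials, seed=0):
    """Empirical (unbiased) variance of the count over independent trees."""
    rng = random.Random(seed)
    xs = [depth_counts_mod(n, m, rng) for _ in range(trials)]
    mean = sum(xs) / trials
    return sum((x - mean) ** 2 for x in xs) / (trials - 1)
```

Comparing `sample_variance(n, m, trials) / n` across increasing n for a small and a large m gives a rough empirical look at the variance growth described above, although the asymptotic regimes may only become visible for very large n.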
Asymptotic distribution of two-protected nodes in ternary search trees
We study protected nodes in m-ary search trees, by putting them in context
of generalised Pólya urns. We show that the number of two-protected nodes
(the nodes that are neither leaves nor parents of leaves) in a random ternary
search tree is asymptotically normal. The methods apply in principle to m-ary search trees with larger m as well, although the size of the matrices
used in the calculations grows rapidly with m; we conjecture that the method
yields an asymptotically normal distribution for all .
The one-protected nodes, and their complement, i.e., the leaves, are easier
to analyze. By using a simpler Pólya urn (that is similar to the one that has
earlier been used to study the total number of nodes in m-ary search
trees), we prove normal limit laws for the number of one-protected nodes and
the number of leaves for all
Refined asymptotics for the number of leaves of random point quadtrees
In the early 2000s, several phase change results from distributional convergence to distributional non-convergence were obtained for shape parameters of random discrete structures. Recently, for those random structures which admit a natural martingale process, these results have been considerably improved by obtaining refined asymptotics for the limit behavior. In this work, we propose a new approach which is also applicable to random discrete structures which do not admit a natural martingale process. As an example, we obtain refined asymptotics for the number of leaves in random point quadtrees. More applications, for example to shape parameters in generalized m-ary search trees and random gridtrees, will be discussed in the journal version of this extended abstract.
The metaRbolomics Toolbox in Bioconductor and beyond
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics-specific packages primarily available on CRAN, Bioconductor and GitHub.