Branch merging on continuum trees with applications to regenerative tree growth
We introduce a family of branch merging operations on continuum trees and
show that Ford CRTs are distributionally invariant. This operation is new even
in the special case of the Brownian CRT, which we explore in more detail. The
operations are based on spinal decompositions and a regenerativity preserving
merging procedure of -strings of beads, that is, random
intervals equipped with a random discrete measure
arising in the limit of ordered -Chinese restaurant
processes as introduced recently by Pitman and Winkel. Indeed, we iterate the
branch merging operation recursively and give an alternative approach to the
leaf embedding problem on Ford CRTs related to -regenerative tree growth processes.
Comment: 40 pages, 5 figures
On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching
We present parallel algorithms for exact and approximate pattern matching
with suffix arrays, using a CREW-PRAM with processors. Given a static text
of length , we first show how to compute the suffix array interval of a
given pattern of length in
time for . For approximate pattern matching with differences or
mismatches, we show how to compute all occurrences of a given pattern in
time, where is the size of the alphabet
and . The workhorse of our algorithms is a data structure
for merging suffix array intervals quickly: Given the suffix array intervals
for two patterns and , we present a data structure for computing the
interval of in sequential time, or in
parallel time. All our data structures are of size bits (in addition to
the suffix array)
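For readers unfamiliar with the term, the "suffix array interval" of a pattern is the contiguous range of suffix array ranks whose suffixes begin with that pattern. The following is a minimal sequential sketch of that notion only, using plain binary search; it is not the parallel CREW-PRAM algorithm or the interval-merging data structure from the abstract. The function names `build_suffix_array` and `suffix_array_interval` are hypothetical, and the naive construction is for illustration.

```python
def build_suffix_array(text):
    # Naive construction by sorting all suffixes; illustration only,
    # far slower than the linear-time constructions used in practice.
    return sorted(range(len(text)), key=lambda i: text[i:])


def suffix_array_interval(text, sa, pattern):
    """Return the half-open range [start, end) of ranks in `sa` whose
    suffixes start with `pattern`. An empty range means no occurrence."""
    m = len(pattern)

    # Lower bound: first rank whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first rank whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


text = "banana"
sa = build_suffix_array(text)
start, end = suffix_array_interval(text, sa, "ana")
occurrences = sorted(sa[start:end])  # text positions where "ana" begins
```

Each binary search performs a logarithmic number of prefix comparisons, which is the sequential baseline that parallel approaches of the kind described above aim to improve on.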
On the (co)homology of the poset of weighted partitions
We consider the poset of weighted partitions , introduced by
Dotsenko and Khoroshkin in their study of a certain pair of dual operads. The
maximal intervals of provide a generalization of the lattice
of partitions, which we show possesses many of the well-known properties of
. In particular, we prove these intervals are EL-shellable, we show that
the Möbius invariant of each maximal interval is given up to sign by the
number of rooted trees on node set having a fixed number
of descents, we find combinatorial bases for homology and cohomology, and we
give an explicit sign twisted -module isomorphism from
cohomology to the multilinear component of the free Lie algebra with two
compatible brackets. We also show that the characteristic polynomial of
has a nice factorization analogous to that of .
Comment: 50 pages, final version, to appear in Trans. AM
Transport on river networks: A dynamical approach
This study is motivated by problems related to environmental transport on
river networks. We establish statistical properties of a flow along a directed
branching network and suggest its compact parameterization. The downstream
network transport is treated as a particular case of nearest-neighbor
hierarchical aggregation with respect to the metric induced by the branching
structure of the river network. We describe the static geometric structure of a
drainage network by a tree, referred to as the static tree, and introduce an
associated dynamic tree that describes the transport along the static tree. It
is well known that the static branching structure of river networks can be
described by self-similar trees (SSTs); we demonstrate that the corresponding
dynamic trees are also self-similar. We report an unexpected phase transition
in the dynamics of three river networks, one from California and two from
Italy, demonstrate the universal features of this transition, and seek to
interpret it in hydrological terms.
Comment: 38 pages, 15 figures
Decision Stream: Cultivating Deep Decision Trees
Various modifications of decision trees have been extensively used during the
past years due to their high efficiency and interpretability. Tree node
splitting based on relevant feature selection is a key step of decision tree
learning, but it is also its major shortcoming: recursive node
partitioning leads to a geometric reduction of data quantity in the leaf nodes,
which causes excessive model complexity and overfitting. In this paper,
we present a novel architecture, the Decision Stream, that aims to overcome this
problem. Instead of building a tree structure during the learning process, we
propose merging nodes from different branches based on their similarity that is
estimated with two-sample test statistics, which leads to generation of a deep
directed acyclic graph of decision rules that can consist of hundreds of
levels. To evaluate the proposed solution, we test it on several common machine
learning problems: credit scoring, Twitter sentiment analysis, aircraft flight
control, MNIST and CIFAR image classification, and synthetic data classification
and regression. Our experimental results reveal that the proposed approach
significantly outperforms the standard decision tree learning methods on both
regression and classification tasks, yielding a prediction error decrease of up to
35%.
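The node-merging idea in this abstract can be illustrated with a small sketch. It is a rough illustration under stated assumptions, not the authors' algorithm. Each node is reduced to the list of target values that reached it, similarity is measured with Welch's two-sample t statistic (one possible two-sample test; the abstract does not say which statistics are used), and the most similar pair is merged greedily while its statistic stays below a threshold. The names `welch_t` and `merge_similar_nodes` and the default threshold of 2.0 are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch two-sample t statistic.
    Each sample needs at least two values so its variance is defined."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = abs(mean(a) - mean(b))
    if se == 0:
        # Both samples are constant: identical means are maximally similar.
        return 0.0 if diff == 0 else float("inf")
    return diff / se


def merge_similar_nodes(nodes, threshold=2.0):
    """Greedily merge the most similar pair of nodes until every
    remaining pair differs by more than `threshold` (an assumed cutoff)."""
    nodes = [list(n) for n in nodes]
    while len(nodes) > 1:
        # Find the pair of nodes with the smallest test statistic.
        t, i, j = min(
            (welch_t(nodes[i], nodes[j]), i, j)
            for i in range(len(nodes))
            for j in range(i + 1, len(nodes))
        )
        if t > threshold:
            break
        nodes[i] = nodes[i] + nodes[j]  # pool the samples of both nodes
        del nodes[j]
    return nodes


leaves = [
    [1.0, 1.1, 0.9, 1.2],
    [1.05, 0.95, 1.15, 1.0],
    [5.0, 5.2, 4.9, 5.1],
]
merged = merge_similar_nodes(leaves)
```

In this toy input the first two groups are statistically indistinguishable and are pooled, while the third stays separate. Pooling samples in this way is what counteracts the shrinking leaf sizes the abstract identifies as the shortcoming of recursive partitioning.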