Branch merging on continuum trees with applications to regenerative tree growth

Author: Rembart Franz
Publication date: 01/01/2016

We introduce a family of branch merging operations on continuum trees and show that Ford CRTs are distributionally invariant. This operation is new even in the special case of the Brownian CRT, which we explore in more detail. The operations are based on spinal decompositions and a regenerativity preserving merging procedure of $(\alpha, \theta)$-strings of beads, that is, random intervals $[0, L_{\alpha, \theta}]$ equipped with a random discrete measure $dL^{-1}$ arising in the limit of ordered $(\alpha, \theta)$-Chinese restaurant processes as introduced recently by Pitman and Winkel. Indeed, we iterate the branch merging operation recursively and give an alternative approach to the leaf embedding problem on Ford CRTs related to $(\alpha, 2-\alpha)$-regenerative tree growth processes.

Comment: 40 pages, 5 figures
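For readers unfamiliar with the $(\alpha, \theta)$-Chinese restaurant process mentioned in the abstract, the standard seating rule can be sketched as follows. This is only an illustration of the usual unordered process (a customer joins a table of size $n_i$ with weight $n_i - \alpha$, or opens a new table with weight $\theta + k\alpha$ when $k$ tables exist); it does not model the table ordering or the strings of beads studied in the paper, and the function name `crp_table_sizes` is hypothetical.

```python
import random


def crp_table_sizes(n, alpha, theta, seed=0):
    """Seat n customers by the (alpha, theta) seating rule; return table sizes.

    Assumes 0 <= alpha < 1 and theta > -alpha so that all weights are positive.
    """
    rng = random.Random(seed)
    sizes = []
    for seated in range(n):
        if seated == 0:
            sizes.append(1)  # the first customer always opens a table
            continue
        # Existing table i has weight n_i - alpha; a new table has weight
        # theta + k * alpha, where k is the current number of tables.
        weights = [s - alpha for s in sizes] + [theta + len(sizes) * alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(sizes):
            sizes.append(1)
        else:
            sizes[choice] += 1
    return sizes


# Parameters of the form (alpha, 2 - alpha), as in the abstract, with alpha = 0.5.
sizes = crp_table_sizes(200, alpha=0.5, theta=1.5)
print(len(sizes), "tables for", sum(sizes), "customers")
```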

arXiv.org e-Print Archive

Oxford University Research Archive

On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

Authors: Fischer Johannes; Kurpicz Florian; Köppl Dominik
Publication date: 01/01/2016

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with $p$ processors. Given a static text of length $n$, we first show how to compute the suffix array interval of a given pattern of length $m$ in $O(\frac{m}{p}+ \lg p + \lg\lg p\cdot\lg\lg n)$ time for $p \le m$. For approximate pattern matching with $k$ differences or mismatches, we show how to compute all occurrences of a given pattern in $O(\frac{m^k\sigma^k}{p}\max\left(k,\lg\lg n\right)\!+\!(1+\frac{m}{p}) \lg p\cdot \lg\lg n + \text{occ})$ time, where $\sigma$ is the size of the alphabet and $p \le \sigma^k m^k$. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns $P$ and $P'$, we present a data structure for computing the interval of $PP'$ in $O(\lg\lg n)$ sequential time, or in $O(1+\lg_p\lg n)$ parallel time. All our data structures are of size $O(n)$ bits (in addition to the suffix array).
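To make the notion of a suffix array interval concrete, here is a minimal sequential sketch. It uses a naive suffix array construction and plain binary search (roughly $O(m \lg n)$ per query), not the paper's parallel algorithms or its interval-merging data structure; the helper names `build_suffix_array` and `sa_interval` are hypothetical. It also illustrates that the interval of a concatenation $PP'$ lies inside the interval of $P$.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def sa_interval(text, sa, pattern):
    """Return the half-open range [lo, hi) of sa whose suffixes start with pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    # First position whose length-m prefix is >= pattern.
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    # First position whose length-m prefix is > pattern.
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


text = "banana"
sa = build_suffix_array(text)
print(sa_interval(text, sa, "na"))   # interval of P
print(sa_interval(text, sa, "nan"))  # interval of PP', nested inside that of P
```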

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

On the (co)homology of the poset of weighted partitions

Authors: D'León Rafael S. González; Wachs Michelle L.
Publication date: 04/04/2015

We consider the poset of weighted partitions $\Pi_n^w$, introduced by Dotsenko and Khoroshkin in their study of a certain pair of dual operads. The maximal intervals of $\Pi_n^w$ provide a generalization of the lattice $\Pi_n$ of partitions, which we show possesses many of the well-known properties of $\Pi_n$. In particular, we prove these intervals are EL-shellable, we show that the Möbius invariant of each maximal interval is given up to sign by the number of rooted trees on node set $\{1,2,\dots,n\}$ having a fixed number of descents, we find combinatorial bases for homology and cohomology, and we give an explicit sign twisted $\mathfrak{S}_n$-module isomorphism from cohomology to the multilinear component of the free Lie algebra with two compatible brackets. We also show that the characteristic polynomial of $\Pi_n^w$ has a nice factorization analogous to that of $\Pi_n$.

Comment: 50 pages, final version, to appear in Trans. AM
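As a reference point for the comparison with $\Pi_n$, the classical facts for the ordinary partition lattice can be written out as below. These are the standard results for $\Pi_n$ only; the corresponding formulas for $\Pi_n^w$ are the subject of the paper and are not reproduced here.

```latex
% Classical facts for the partition lattice \Pi_n (not for \Pi_n^w).
\[
  \mu(\Pi_n) = (-1)^{n-1}(n-1)!,
  \qquad
  \chi_{\Pi_n}(x) = \prod_{i=1}^{n-1} (x - i).
\]
% Worked example, n = 3: \Pi_3 has a bottom element, three atoms, and a top.
% Each atom has Mobius value -1, so the top has value -(1 - 3) = 2.
\[
  \mu(\Pi_3) = (-1)^{2}\, 2! = 2,
  \qquad
  \chi_{\Pi_3}(x) = x^2 - 3x + 2 = (x-1)(x-2).
\]
```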

arXiv.org e-Print Archive

CiteSeerX

Crossref

University of Miami: Scholarship Miami

Transport on river networks: A dynamical approach

Authors: Foufoula-Georgiou Efi; Ghil Michael; Zaliapin Ilya
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 09/02/2009

This study is motivated by problems related to environmental transport on river networks. We establish statistical properties of a flow along a directed branching network and suggest its compact parameterization. The downstream network transport is treated as a particular case of nearest-neighbor hierarchical aggregation with respect to the metric induced by the branching structure of the river network. We describe the static geometric structure of a drainage network by a tree, referred to as the static tree, and introduce an associated dynamic tree that describes the transport along the static tree. It is well known that the static branching structure of river networks can be described by self-similar trees (SSTs); we demonstrate that the corresponding dynamic trees are also self-similar. We report an unexpected phase transition in the dynamics of three river networks, one from California and two from Italy, demonstrate the universal features of this transition, and seek to interpret it in hydrological terms.

Comment: 38 pages, 15 figures

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Decision Stream: Cultivating Deep Decision Trees

Authors: Ignatov Andrey; Ignatov Dmitry
Publication date: 03/09/2017

Various modifications of decision trees have been used extensively in recent years owing to their high efficiency and interpretability. Tree node splitting based on relevant feature selection is a key step of decision tree learning and, at the same time, its major shortcoming: recursive node partitioning leads to a geometric reduction of the amount of data in the leaf nodes, which causes excessive model complexity and overfitting. In this paper, we present a novel architecture, the Decision Stream, that aims to overcome this problem. Instead of building a tree structure during the learning process, we propose merging nodes from different branches based on their similarity, estimated with two-sample test statistics, which leads to the generation of a deep directed acyclic graph of decision rules that can consist of hundreds of levels. To evaluate the proposed solution, we test it on several common machine learning problems: credit scoring, Twitter sentiment analysis, aircraft flight control, MNIST and CIFAR image classification, and synthetic data classification and regression. Our experimental results reveal that the proposed approach significantly outperforms standard decision tree learning methods on both regression and classification tasks, yielding a prediction error decrease of up to 35%.
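The idea of merging nodes whose samples look statistically alike can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: it uses Welch's t-statistic as the two-sample test, a fixed threshold, and a greedy single pass, and the names `welch_t` and `merge_similar` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic; each sample needs at least two values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if se == 0:
        # Degenerate case: both samples are constant.
        return 0.0 if diff == 0 else float("inf")
    return diff / se


def merge_similar(nodes, threshold=2.0):
    """Greedily merge nodes (lists of target values) whose |t| is below threshold."""
    merged = []
    for sample in nodes:
        for group in merged:
            if abs(welch_t(group, sample)) < threshold:
                group.extend(sample)  # statistically similar: share one node
                break
        else:
            merged.append(list(sample))  # no similar group found: keep separate
    return merged


leaves = [
    [1.0, 1.2, 0.9, 1.1],
    [1.05, 0.95, 1.15, 1.0],
    [5.0, 5.2, 4.9, 5.1],
]
print([len(g) for g in merge_similar(leaves)])
```

In this toy input the first two leaves have nearly identical target distributions and end up in one merged node, while the third stays separate; merging of this kind is what turns the tree into a directed acyclic graph.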

arXiv.org e-Print Archive

Crossref