On the Complexity of Searching in Trees: Average-case Minimization
We focus on the average-case analysis: A function w : V -> Z+ is given which
defines the likelihood for a node to be the one marked, and we want the
strategy that minimizes the expected number of queries. Prior to this paper,
very little was known about this natural question and the complexity of the
problem had remained so far an open question.
We close this question and prove that the above tree search problem is
NP-complete even for the class of trees with diameter at most 4. This results
in a complete characterization of the complexity of the problem with respect to
the diameter size. In fact, for diameter not larger than 3 the problem can be
shown to be polynomially solvable using a dynamic programming approach.
In addition we prove that the problem is NP-complete even for the class of
trees of maximum degree at most 16. To the best of our knowledge, the only
known result in this direction is that the tree search problem is solvable in
O(|V| log|V|) time for trees with degree at most 2 (paths).
We match the above complexity results with a tight algorithmic analysis. We
first show that a natural greedy algorithm attains a 2-approximation.
Furthermore, for the bounded degree instances, we show that any optimal
strategy (i.e., one that minimizes the expected number of queries) performs at
most O(\Delta(T) (log |V| + log w(T))) queries in the worst case, where w(T) is
the sum of the likelihoods of the nodes of T and \Delta(T) is the maximum
degree of T. We combine this result with a non-trivial exponential time
algorithm to provide an FPTAS for trees with bounded degree
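The objective above can be made concrete on the simplest instance, a path. The following is a minimal sketch, assuming an edge-query model (a query on an edge reveals which side of that edge contains the marked node), which the abstract as listed here does not spell out. The function name `min_weighted_queries_on_path` is hypothetical, and this cubic-time interval dynamic program is a simple stand-in, not the O(|V| log|V|) algorithm cited above.

```python
from functools import lru_cache


def min_weighted_queries_on_path(weights):
    """Minimum over strategies of sum_v w(v) * (queries used when v is marked).

    Assumption: nodes 0..n-1 form a path and each query picks an edge (k, k+1),
    splitting the remaining candidates into a left part and a right part.
    """
    n = len(weights)
    if n == 0:
        return 0

    # prefix[i] = w(0) + ... + w(i-1), so interval sums take O(1) time.
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Nodes i..j (inclusive) are still candidates for the marked node.
        if i == j:
            return 0  # a single candidate needs no further query
        total = prefix[j + 1] - prefix[i]
        # Every candidate in i..j pays for the next query, whichever edge it is.
        return total + min(best(i, k) + best(k + 1, j) for k in range(i, j))

    return best(0, n - 1)
```

Dividing the result by w(T), the sum of the weights, gives the expected number of queries when the weights are read as unnormalized likelihoods.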
Principal components analysis in the space of phylogenetic trees
Phylogenetic analysis of DNA or other data commonly gives rise to a
collection or sample of inferred evolutionary trees. Principal Components
Analysis (PCA) cannot be applied directly to collections of trees since the
space of evolutionary trees on a fixed set of taxa is not a vector space. This
paper describes a novel geometrical approach to PCA in tree-space that
constructs the first principal path in an analogous way to standard linear
Euclidean PCA. Given a data set of phylogenetic trees, a geodesic principal
path is sought that maximizes the variance of the data under a form of
projection onto the path. Due to the high dimensionality of tree-space and the
nonlinear nature of this problem, the computational complexity is potentially
very high, so approximate optimization algorithms are used to search for the
optimal path. Principal paths identified in this way reveal and quantify the
main sources of variation in the original collection of trees in terms of both
topology and branch lengths. The approach is illustrated by application to
simulated sets of trees and to a set of gene trees from metazoan (animal)
species. Comment: Published at http://dx.doi.org/10.1214/11-AOS915 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
The permutation-path coloring problem on trees
In this paper we first show that the permutation-path coloring problem is NP-hard even for very restrictive instances like involutions, which are permutations that contain only cycles of length at most two, on both binary trees and on trees having only two vertices with degree greater than two, and for circular permutations, which are permutations that contain exactly one cycle, on trees with maximum degree greater than or equal to 4. We calculate a lower bound on the average complexity of the permutation-path coloring problem on arbitrary networks. Then we give combinatorial and asymptotic results for the permutation-path coloring problem on linear networks in order to show that the average number of colors needed to color any permutation on a linear network on n vertices is n/4+o(n). We extend these results and obtain an upper bound on the average complexity of the permutation-path coloring problem on arbitrary trees, obtaining exact results in the case of generalized star trees. Finally we explain how to extend these results for the involutions-path coloring problem on arbitrary trees
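The linear-network case can be explored numerically. The sketch below rests on assumptions the abstract does not state: vertices 0..n-1 lie on a line, vertex i routes a path to perm[i], and two paths conflict only when they use the same edge in the same direction. Under that model the paths in one direction are intervals, so the number of colors needed equals the maximum number of paths crossing any edge. The function names are hypothetical.

```python
import random


def colors_on_linear_network(perm):
    """Colors needed to route permutation `perm` on a line (assumed model above)."""
    n = len(perm)
    load = [0] * max(n - 1, 0)  # load[k] = rightward paths using edge (k, k+1)
    for src, dst in enumerate(perm):
        for k in range(src, dst):  # empty range when the path goes left or stays
            load[k] += 1
    # For a permutation, the leftward load on each edge equals the rightward
    # load, and leftward paths can reuse the same colors, so this suffices.
    return max(load, default=0)


def average_colors(n, trials, seed=0):
    """Monte Carlo estimate of the average over uniformly random permutations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += colors_on_linear_network(perm)
    return total / trials
```

Calling `average_colors(n, trials)` for growing n and dividing by n gives a rough empirical check against the n/4+o(n) behaviour quoted above; the estimate is only as good as the number of trials.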
A tight upper bound for the path length of AVL trees
We prove that the internal path length of an AVL tree of size N is bounded from above by 1.4404 N (log_2 N - log_2 log_2 N) + O(N) and show that this bound is achieved by an infinite family of AVL trees, each tree of which is not of maximal height. These results carry over to the comparison cost of brother trees
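To get a feel for the quantities involved, the sketch below computes the size and internal path length of Fibonacci trees (the sparsest AVL trees of a given height) and evaluates the leading term of the bound. It assumes that size counts nodes and that the root has depth 0, neither of which the abstract states. Fibonacci trees have maximal height, so by the statement above they are not the extremal family; the comparison is illustrative only and ignores the O(N) term. Both function names are hypothetical.

```python
import math


def fibonacci_tree_stats(height):
    """Return (size, internal path length) of the Fibonacci tree of given height.

    Height 0 is the empty tree; height 1 is a single node at depth 0.
    """
    if height == 0:
        return (0, 0)
    size_prev, ipl_prev = 0, 0  # height 0
    size_cur, ipl_cur = 1, 0    # height 1
    for _ in range(height - 1):
        # A new root sits above subtrees of the two previous heights, so every
        # node in those subtrees moves one level deeper.
        size_next = size_cur + size_prev + 1
        ipl_next = ipl_cur + ipl_prev + size_cur + size_prev
        size_prev, ipl_prev = size_cur, ipl_cur
        size_cur, ipl_cur = size_next, ipl_next
    return (size_cur, ipl_cur)


def bound_leading_term(n):
    """Leading term 1.4404 N (log_2 N - log_2 log_2 N); requires n >= 2."""
    return 1.4404 * n * (math.log2(n) - math.log2(math.log2(n)))
```

For example, `fibonacci_tree_stats(h)` can be tabulated next to `bound_leading_term(size)` for increasing h to see how far a maximal-height AVL tree stays from the leading term.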
c-trie++: A Dynamic Trie Tailored for Fast Prefix Searches
Given a dynamic set of strings whose characters are drawn from a finite
alphabet, a keyword dictionary is a data structure built on that set that
provides locate, prefix search, and update operations on it. Under the
assumption that multiple characters fit into a single machine word, we
propose a keyword dictionary that represents the set in … bits of space,
supporting all operations in … expected time on an input string in the
word RAM model. This data structure is backed by an exhaustive practical
evaluation, highlighting its practical usefulness, especially for prefix
searches, one of the most elementary keyword dictionary operations
Dynamic Trees with Almost-Optimal Access Cost
An optimal binary search tree for an access sequence on n elements is a static tree that minimizes the total search cost. Constructing perfectly optimal binary search trees is expensive, so the most efficient algorithms construct almost optimal search trees. There is a long literature on constructing almost optimal search trees dynamically, i.e., when the access pattern is not known in advance. All of these trees, e.g., splay trees and treaps, provide a multiplicative approximation to the optimal search cost.
In this paper we show how to maintain an almost optimal weighted binary search tree under access operations and insertions of new elements where the approximation is an additive constant. More technically, we maintain a tree in which the depth of the leaf holding an element e_i does not exceed min(log(W/w_i),log n)+O(1) where w_i is the number of times e_i was accessed and W is the total length of the access sequence.
Our techniques can also be used to encode a sequence of m symbols with a dynamic alphabetic code in O(m) time so that the encoding length is bounded by m(H+O(1)), where H is the entropy of the sequence. This is the first efficient algorithm for adaptive alphabetic coding that runs in constant time per symbol
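The depth bound and the entropy bound above can be evaluated for a concrete access pattern. This is a minimal sketch that only computes the quantities in the formulas; it does not implement the data structure. It assumes base-2 logarithms and strictly positive access counts, and it leaves out the additive O(1) term. The function names are hypothetical.

```python
import math


def depth_bounds(access_counts):
    """Per-element value of min(log(W/w_i), log n), without the O(1) term."""
    total = sum(access_counts)  # W: total length of the access sequence
    n = len(access_counts)      # n: number of elements
    return [min(math.log2(total / w), math.log2(n)) for w in access_counts]


def empirical_entropy(access_counts):
    """Entropy H of the empirical access distribution, in bits per access."""
    total = sum(access_counts)
    return sum((w / total) * math.log2(total / w) for w in access_counts)
```

Since the sum of w_i * log(W/w_i) over all elements equals W * H by definition, a per-element depth of log(W/w_i) + O(1) gives a total cost of W(H + O(1)), which has the same form as the encoding-length bound m(H + O(1)).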