K-tree: Large Scale Document Clustering
We introduce K-tree in an information retrieval context. It is an efficient
approximation of the k-means clustering algorithm. Unlike k-means it forms a
hierarchy of clusters. It has been extended to address issues with sparse
representations. We compare performance and quality to CLUTO using document
collections. The K-tree has a low time complexity that is suitable for large
document collections. This tree structure allows for efficient disk based
implementations where space requirements exceed that of main memory.
Comment: 2 pages, SIGIR 200
Fast Algorithms for Constructing Maximum Entropy Summary Trees
Karloff and Shirley recently proposed summary trees as a new way to
visualize large rooted trees (Eurovis 2013) and gave algorithms for generating
a maximum-entropy k-node summary tree of an input n-node rooted tree. However,
the algorithm generating optimal summary trees was only pseudo-polynomial (and
worked only for integral weights); the authors left open the existence of a
polynomial-time algorithm. In addition, the authors provided an additive
approximation algorithm and a greedy heuristic, both working on real weights.
This paper shows how to construct maximum entropy k-node summary trees in time
O(k^2 n + n log n) for real weights (indeed, as small as the time bound for the
greedy heuristic given previously); how to speed up the approximation algorithm
so that it runs in time O(n + (k^4/eps) log(k/eps)); and how to speed up the
greedy algorithm so as to run in time O(kn + n log n). Altogether, these
results make summary trees a much more practical tool than before.
Comment: 17 pages, 4 figures. Extended version of paper appearing in ICALP 201
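As a rough illustration of the objective named in the abstract above, the sketch below computes the Shannon entropy of a summary's node weights. It assumes the entropy is taken over the distribution obtained by normalizing the k node weights; the function name `summary_entropy` is hypothetical and not taken from the paper.

```python
import math


def summary_entropy(weights):
    """Shannon entropy (in bits) of the normalized node weights.

    Assumption: each summary node carries a non-negative weight, and the
    quantity being maximized is the entropy of weights / sum(weights).
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Zero-weight nodes contribute nothing to the entropy.
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


# Four equally weighted nodes: the uniform distribution on 4 outcomes.
balanced = summary_entropy([1, 1, 1, 1])
# The same total weight concentrated mostly in one node.
skewed = summary_entropy([97, 1, 1, 1])
```

Under this assumption, a summary whose k nodes have nearly equal weight scores higher than one dominated by a single heavy node, which is why `balanced` exceeds `skewed`.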
Approximating Directed Steiner Problems via Tree Embedding
In the k-edge connected directed Steiner tree (k-DST) problem, we are given a
directed graph G on n vertices with edge-costs, a root vertex r, a set of h
terminals T and an integer k. The goal is to find a min-cost subgraph H of G
that connects r to each terminal t by k edge-disjoint r,t-paths. This problem
includes as special cases the well-known directed Steiner tree (DST) problem
(the case k = 1) and the group Steiner tree (GST) problem. Despite having been
studied and mentioned many times in the literature, e.g., by Feldman et al.
[SODA'09, JCSS'12], by Cheriyan et al. [SODA'12, TALG'14] and by Laekhanukit
[SODA'14], there was no known non-trivial approximation algorithm for k-DST for
k >= 2 even in the special case that an input graph is directed acyclic and has
a constant number of layers. If an input graph is not acyclic, the complexity
status of k-DST is not known even for the very restricted special case where k = 2 and
|T| = 2.
In this paper, we make progress toward developing a non-trivial
approximation algorithm for k-DST. We present an O(D k^{D-1} log
n)-approximation algorithm for k-DST on directed acyclic graphs (DAGs) with D
layers, which can be extended to a special case of k-DST on "general graphs"
when an instance has a D-shallow optimal solution, i.e., there exist k
edge-disjoint r,t-paths, each of length at most D, for every terminal t. For
the case k = 1 (DST), our algorithm yields an approximation ratio of O(D log h),
thus implying an O(log^3 h)-approximation algorithm for DST that runs in
quasi-polynomial-time (due to the height-reduction of Zelikovsky
[Algorithmica'97]). Consequently, as our algorithm works for general graphs, we
obtain an O(D k^{D-1} log n)-approximation algorithm for a D-shallow instance
of the k-edge-connected directed Steiner subgraph problem, where we wish to
connect every pair of terminals by k edge-disjoint paths.
A sufficiently fast algorithm for finding close to optimal clique trees
We offer an algorithm that finds a clique tree such that the size of the largest clique is at most (2α+1)k, where k is the size of the largest clique in a clique tree in which this size is minimized and α is the approximation ratio of an α-approximation algorithm for the 3-way vertex cut problem. When α = 4/3, our algorithm's complexity is O(2^{4.67k} n · poly(n)) and it errs by a factor of 3.67, where poly(n) is the running time of linear programming. This algorithm is extended to find clique trees in which the state space of the largest clique is bounded. When k = O(log n), our algorithm yields a polynomial inference algorithm for Bayesian networks.
On Finding the Adams Consensus Tree
This paper presents a fast algorithm for finding the Adams consensus tree of a set of conflicting phylogenetic trees with identical leaf labels, for the first time improving the time complexity of a widely used algorithm invented by Adams in 1972 [1]. Our algorithm applies the centroid path decomposition technique [9] in a new way to traverse the input trees' centroid paths in unison, and runs in O(k n log n) time, where k is the number of input trees and n is the size of the leaf label set. (In comparison, the old algorithm from 1972 has a worst-case running time of O(k n^2).) For the special case of k = 2, an even faster algorithm running in O(n · log n / log log n) time is provided, which relies on an extension of the wavelet tree-based technique by Bose et al. [6] for orthogonal range counting on a grid. Our extended wavelet tree data structure also supports truncated range maximum queries efficiently and may be of independent interest to algorithm designers.
Stackelberg Network Pricing Games
We study a multi-player one-round game termed Stackelberg Network Pricing
Game, in which a leader can set prices for a subset of priceable edges in a
graph. The other edges have a fixed cost. Based on the leader's decision one or
more followers optimize a polynomial-time solvable combinatorial minimization
problem and choose a minimum cost solution satisfying their requirements based
on the fixed costs and the leader's prices. The leader receives as revenue the
total amount of prices paid by the followers for priceable edges in their
solutions, and the problem is to find revenue maximizing prices. Our model
extends several known pricing problems, including single-minded and unit-demand
pricing, as well as Stackelberg pricing for certain follower problems like
shortest path or minimum spanning tree. Our first main result is a tight
analysis of a single-price algorithm for the single follower game, which
provides a (1+ε) log m-approximation for any ε > 0. This can
be extended to provide a (1+ε)(log k + log m)-approximation for the
general problem and k followers. The latter result is essentially best
possible, as the problem is shown to be hard to approximate within
O(log^ε k + log^ε m). If followers have demands, the
single-price algorithm provides a (1+ε)m^2-approximation, and the
problem is hard to approximate within O(m^ε) for some
ε > 0. Our second main result is a polynomial time algorithm for
revenue maximization in the special case of Stackelberg bipartite vertex cover,
which is based on non-trivial max-flow and LP-duality techniques. Our results
can be extended to provide constant-factor approximations for any constant
number of followers.
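To make the idea of a single-price strategy concrete, here is a minimal sketch for one follower in a deliberately simplified model. It assumes the follower's feasible solutions are listed explicitly as (fixed cost, number of priceable edges) pairs, that at least one solution uses no priceable edge, and that ties are broken in the leader's favour. The function names are hypothetical, and the exhaustive search over breakpoint prices is an illustration, not the algorithm analyzed in the paper.

```python
from fractions import Fraction
from itertools import combinations


def follower_choice(solutions, price):
    """Cheapest solution for the follower when every priceable edge costs `price`.

    Each solution is a pair (fixed_cost, n_priceable). Ties in total cost are
    broken in the leader's favour, i.e. towards more priceable edges.
    """
    return min(solutions, key=lambda s: (s[0] + price * s[1], -s[1]))


def best_single_price(solutions):
    """Best uniform price and its revenue, found by trying every breakpoint.

    Between two breakpoints the follower's choice is fixed and revenue grows
    with the price, so only prices at which two solutions cost the same need
    to be examined.
    """
    candidates = set()
    for (f_a, n_a), (f_b, n_b) in combinations(solutions, 2):
        if n_a != n_b:
            p = Fraction(f_a - f_b, n_b - n_a)
            if p > 0:
                candidates.add(p)
    best_price, best_revenue = Fraction(0), Fraction(0)
    for p in sorted(candidates):
        _, n_priceable = follower_choice(solutions, p)
        revenue = p * n_priceable
        if revenue > best_revenue:
            best_price, best_revenue = p, revenue
    return best_price, best_revenue


# Hypothetical instance: three feasible solutions for the follower.
example = [(10, 0), (4, 1), (0, 3)]
price, revenue = best_single_price(example)
```

Exact rational arithmetic via `Fraction` is used so that ties at breakpoint prices are detected reliably, which floating point would not guarantee.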