K-tree: Large Scale Document Clustering

Authors: De Vries Christopher M.; Geva Shlomo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk-based implementations where space requirements exceed that of main memory. Comment: 2 pages, SIGIR 200

arXiv.org e-Print Archive

CiteSeerX

Crossref

Queensland University of Technology ePrints Archive

Fast Algorithms for Constructing Maximum Entropy Summary Trees

Authors: J. Naudts; T. Landesberger von; T. Munzner
Publication date: 01/01/2014

Karloff and Shirley recently proposed summary trees as a new way to visualize large rooted trees (Eurovis 2013) and gave algorithms for generating a maximum-entropy k-node summary tree of an input n-node rooted tree. However, the algorithm generating optimal summary trees was only pseudo-polynomial (and worked only for integral weights); the authors left open the existence of a polynomial-time algorithm. In addition, the authors provided an additive approximation algorithm and a greedy heuristic, both working on real weights. This paper shows how to construct maximum entropy k-node summary trees in time O(k^2 n + n log n) for real weights (indeed, as small as the time bound for the greedy heuristic given previously); how to speed up the approximation algorithm so that it runs in time O(n + (k^4/eps) log(k/eps)); and how to speed up the greedy algorithm so as to run in time O(kn + n log n). Altogether, these results make summary trees a much more practical tool than before. Comment: 17 pages, 4 figures. Extended version of paper appearing in ICALP 201
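The abstract above does not spell out the objective being maximized. As a rough illustration only, the sketch below assumes the usual Shannon entropy of the summary tree's node weights after normalizing them to sum to one; the function name `summary_entropy` is hypothetical and is not taken from the paper.

```python
import math


def summary_entropy(weights):
    """Shannon entropy (in bits) of node weights normalized to a distribution.

    Assumption: the entropy of a k-node summary tree is computed from its k
    node weights in this way. Zero weights contribute nothing to the sum.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


# Four equally weighted summary nodes are as balanced as possible.
print(summary_entropy([1, 1, 1, 1]))  # 2.0
# A skewed split carries less information.
print(summary_entropy([3, 1]))
```

Under this assumption, a summary tree scores higher when its k nodes split the total weight more evenly, which is why an optimizer has to choose carefully which subtrees to merge.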

arXiv.org e-Print Archive

CiteSeerX

Crossref

Approximating Directed Steiner Problems via Tree Embedding

Author: Laekhanukit Bundit
Publication date: 01/01/2016

In the k-edge connected directed Steiner tree (k-DST) problem, we are given a directed graph G on n vertices with edge-costs, a root vertex r, a set of h terminals T and an integer k. The goal is to find a min-cost subgraph H of G that connects r to each terminal t by k edge-disjoint r,t-paths. This problem includes as special cases the well-known directed Steiner tree (DST) problem (the case k = 1) and the group Steiner tree (GST) problem. Despite having been studied and mentioned many times in the literature, e.g., by Feldman et al. [SODA'09, JCSS'12], by Cheriyan et al. [SODA'12, TALG'14] and by Laekhanukit [SODA'14], there was no known non-trivial approximation algorithm for k-DST for k >= 2, even in the special case that an input graph is directed acyclic and has a constant number of layers. If an input graph is not acyclic, the complexity status of k-DST is not known even for the very strict special case of k = 2 and |T| = 2. In this paper, we make progress toward developing a non-trivial approximation algorithm for k-DST. We present an O(D k^{D-1} log n)-approximation algorithm for k-DST on directed acyclic graphs (DAGs) with D layers, which can be extended to a special case of k-DST on "general graphs" when an instance has a D-shallow optimal solution, i.e., there exist k edge-disjoint r,t-paths, each of length at most D, for every terminal t. For the case k = 1 (DST), our algorithm yields an approximation ratio of O(D log h), thus implying an O(log^3 h)-approximation algorithm for DST that runs in quasi-polynomial time (due to the height-reduction of Zelikovsky [Algorithmica'97]). Consequently, as our algorithm works for general graphs, we obtain an O(D k^{D-1} log n)-approximation algorithm for a D-shallow instance of the k-edge-connected directed Steiner subgraph problem, where we wish to connect every pair of terminals by k edge-disjoint paths.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

A sufficiently fast algorithm for finding close to optimal clique trees

Authors: Becker Ann; Geiger Dan
Publication venue: Elsevier Science B.V.
Publication date: 31/01/2001

We offer an algorithm that finds a clique tree such that the size of the largest clique is at most (2α+1)k, where k is the size of the largest clique in a clique tree in which this size is minimized and α is the approximation ratio of an α-approximation algorithm for the 3-way vertex cut problem. When α=4/3, our algorithm's complexity is O(2^{4.67k} n · poly(n)) and it errs by a factor of 3.67, where poly(n) is the running time of linear programming. This algorithm is extended to find clique trees in which the state space of the largest clique is bounded. When k=O(log n), our algorithm yields a polynomial inference algorithm for Bayesian networks.
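The error factor of 3.67 quoted above follows directly from the stated bound (2α+1)k once α=4/3 is substituted. The short check below uses only quantities from the abstract:

```latex
\[
  2\alpha + 1 \;=\; 2\cdot\frac{4}{3} + 1 \;=\; \frac{11}{3} \;\approx\; 3.67,
  \qquad\text{so the largest clique has size at most } \tfrac{11}{3}\,k .
\]
```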

Elsevier - Publisher Connector

On Finding the Adams Consensus Tree

Authors: Jansson Jesper; Li Zhaoxian; Sung Wing-Kin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd International Symposium on Theoretical Aspects of Computer Science (STACS 2015)
Publication date: 01/01/2015

This paper presents a fast algorithm for finding the Adams consensus tree of a set of conflicting phylogenetic trees with identical leaf labels, for the first time improving the time complexity of a widely used algorithm invented by Adams in 1972 [1]. Our algorithm applies the centroid path decomposition technique [9] in a new way to traverse the input trees' centroid paths in unison, and runs in O(k n log n) time, where k is the number of input trees and n is the size of the leaf label set. (In comparison, the old algorithm from 1972 has a worst-case running time of O(k n^2).) For the special case of k = 2, an even faster algorithm running in O(n · (log n / log log n)) time is provided, which relies on an extension of the wavelet tree-based technique by Bose et al. [6] for orthogonal range counting on a grid. Our extended wavelet tree data structure also supports truncated range maximum queries efficiently and may be of independent interest to algorithm designers.

Dagstuhl Research Online Publication Server

Stackelberg Network Pricing Games

Authors: Briest Patrick; Hoefer Martin; Krysta Piotr
Publication date: 01/01/2007

We study a multi-player one-round game termed Stackelberg Network Pricing Game, in which a leader can set prices for a subset of $m$ priceable edges in a graph. The other edges have a fixed cost. Based on the leader's decision, one or more followers optimize a polynomial-time solvable combinatorial minimization problem and choose a minimum cost solution satisfying their requirements based on the fixed costs and the leader's prices. The leader receives as revenue the total amount of prices paid by the followers for priceable edges in their solutions, and the problem is to find revenue maximizing prices. Our model extends several known pricing problems, including single-minded and unit-demand pricing, as well as Stackelberg pricing for certain follower problems like shortest path or minimum spanning tree. Our first main result is a tight analysis of a single-price algorithm for the single follower game, which provides a $(1+\epsilon) \log m$-approximation for any $\epsilon > 0$. This can be extended to provide a $(1+\epsilon)(\log k + \log m)$-approximation for the general problem and $k$ followers. The latter result is essentially best possible, as the problem is shown to be hard to approximate within $\mathcal{O}(\log^\epsilon k + \log^\epsilon m)$. If followers have demands, the single-price algorithm provides a $(1+\epsilon)m^2$-approximation, and the problem is hard to approximate within $\mathcal{O}(m^\epsilon)$ for some $\epsilon > 0$. Our second main result is a polynomial time algorithm for revenue maximization in the special case of Stackelberg bipartite vertex cover, which is based on non-trivial max-flow and LP-duality techniques. Our results can be extended to provide constant-factor approximations for any constant number of followers.
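To make the single-price idea concrete, here is a minimal sketch for one follower who solves a shortest-path problem. It is an illustration under stated assumptions, not the authors' algorithm: the function names are hypothetical, the candidate prices are supplied by the caller rather than derived as in the paper, prices are assumed strictly positive, edges are directed, and ties in the follower's choice are assumed to be broken in the leader's favor.

```python
import heapq


def follower_best_response(n, fixed_edges, priceable_edges, price, source, target):
    """Cheapest source-target path when every priceable edge costs `price`.

    Returns (path_cost, priceable_edges_used), or None if target is unreachable.
    Assumption: among equally cheap paths the follower uses as many priceable
    edges as possible (ties broken in the leader's favor); requires price > 0.
    """
    adj = [[] for _ in range(n)]
    for u, v, cost in fixed_edges:
        adj[u].append((v, cost, 0))
    for u, v in priceable_edges:
        adj[u].append((v, price, 1))

    # Dijkstra on lexicographic keys (cost, -priceable_used).
    best = {source: (0, 0)}
    heap = [(0, 0, source)]
    while heap:
        cost, neg_used, u = heapq.heappop(heap)
        if (cost, neg_used) > best[u]:
            continue  # stale heap entry
        if u == target:
            return cost, -neg_used
        for v, w, is_priceable in adj[u]:
            cand = (cost + w, neg_used - is_priceable)
            if v not in best or cand < best[v]:
                best[v] = cand
                heapq.heappush(heap, (cand[0], cand[1], v))
    return None


def single_price_revenue(n, fixed_edges, priceable_edges, candidate_prices, source, target):
    """Try one uniform price at a time and keep the most profitable one."""
    best_price, best_revenue = None, 0
    for p in candidate_prices:
        response = follower_best_response(n, fixed_edges, priceable_edges, p, source, target)
        if response is None:
            continue
        _, used = response
        if p * used > best_revenue:
            best_price, best_revenue = p, p * used
    return best_price, best_revenue


# Toy instance: a fixed edge 0->2 of cost 10 competes with a two-edge priceable path.
print(single_price_revenue(3, [(0, 2, 10)], [(0, 1), (1, 2)], range(1, 7), 0, 2))  # (5, 10)
```

In the toy instance the follower pays for both priceable edges as long as twice the price does not exceed the fixed cost of 10, so a uniform price of 5 extracts the full revenue of 10. The sketch does not reproduce the approximation analysis stated in the abstract.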

arXiv.org e-Print Archive

CiteSeerX

HAL Descartes

Dagstuhl Research Online Publication Server

Publikationsserver der RWTH Aachen University

Hal-Diderot