Fast Hierarchical Clustering and Other Applications of Dynamic Closest Pairs
We develop data structures for dynamic closest pair problems with arbitrary
distance functions, that do not necessarily come from any geometric structure
on the objects. Based on a technique previously used by the author for
Euclidean closest pairs, we show how to insert and delete objects from an
n-object set, maintaining the closest pair, in O(n log^2 n) time per update and
O(n) space. With quadratic space, we can instead use a quadtree-like structure
to achieve an optimal time bound, O(n) per update. We apply these data
structures to hierarchical clustering, greedy matching, and TSP heuristics, and
discuss other potential applications in machine learning, Groebner bases, and
local improvement algorithms for partition and placement problems. Experiments
show our new methods to be faster in practice than previously used heuristics.
Comment: 20 pages, 9 figures. A preliminary version of this paper appeared at
the 9th ACM-SIAM Symp. on Discrete Algorithms, San Francisco, 1998, pp.
619-628. For source code and experimental results, see
http://www.ics.uci.edu/~eppstein/projects/pairs
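The link between closest pairs and hierarchical clustering can be illustrated with a brute-force baseline. The sketch below is not the data structure from the paper: it recomputes the closest pair from scratch at every step, and the names `closest_pair`, `agglomerate`, `dist`, and `merge` are hypothetical. It only shows where the insertions, deletions, and closest-pair queries occur, for an arbitrary distance function supplied by the caller.

```python
from itertools import combinations


def closest_pair(items, dist):
    """Return the pair (a, b) of items minimizing dist(a, b), by brute force."""
    return min(combinations(items, 2), key=lambda pair: dist(pair[0], pair[1]))


def agglomerate(objects, dist, merge):
    """Agglomerative clustering driven by closest-pair queries.

    dist(a, b) is an arbitrary (not necessarily geometric) distance function
    and merge(a, b) builds the cluster that replaces a and b. Both are
    assumptions of this sketch. Returns the list of merged pairs in order.
    """
    clusters = list(objects)
    merges = []
    while len(clusters) > 1:
        a, b = closest_pair(clusters, dist)   # closest-pair query
        clusters.remove(a)                    # two deletions ...
        clusters.remove(b)
        clusters.append(merge(a, b))          # ... and one insertion
        merges.append((a, b))
    return merges
```

Each round performs two deletions, one insertion, and one closest-pair query, which is the access pattern a dynamic closest-pair structure is meant to speed up.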
Approximating the Held-Karp Bound for Metric TSP in Nearly Linear Time
We give a nearly linear time randomized approximation scheme for the
Held-Karp bound [Held and Karp, 1970] for metric TSP. Formally, given an
undirected edge-weighted graph $G$ on $m$ edges and $\epsilon > 0$, the
algorithm outputs in nearly linear time, with high probability, a
$(1+\epsilon)$-approximation to the Held-Karp bound on the metric TSP instance
induced by the shortest path metric on $G$. The algorithm can also be used to
output a corresponding solution to the Subtour Elimination LP. We substantially
improve upon the running time achieved previously
by Garg and Khandekar. The LP solution can be used to obtain a fast randomized
approximation for metric TSP, which improves
upon the running time of previous implementations of Christofides' algorithm.
Space- and Time-Efficient Algorithm for Maintaining Dense Subgraphs on One-Pass Dynamic Streams
While in many graph mining applications it is crucial to handle a stream of
updates efficiently in terms of both time and space, not much was known
about how to achieve this. In this paper we study this issue for a
problem that lies at the core of many graph mining applications, the
densest subgraph problem. We develop an algorithm that achieves time- and
space-efficiency for this problem simultaneously. To the best of our
knowledge, it is one of the first of its kind for graph problems.
In a graph $G = (V, E)$, the "density" of a subgraph induced by a subset of
nodes $S \subseteq V$ is defined as $|E(S)|/|S|$, where $E(S)$ is the set of
edges in $E$ with both endpoints in $S$. In the densest subgraph problem, the
goal is to find a subset of nodes that maximizes the density of the
corresponding induced subgraph. For any $\epsilon > 0$, we present a dynamic
algorithm that, with high probability, maintains a -approximation
to the densest subgraph problem under a sequence of edge insertions and
deletions in a graph with $n$ nodes. It uses space, and has an
amortized update time of and a query time of . Here,
$\tilde{O}$ hides an $O(\operatorname{poly}\log_{1+\epsilon} n)$ term. The approximation ratio
can be improved to at the cost of increasing the query time to
. It can be extended to a -approximation
sublinear-time algorithm and a distributed-streaming algorithm. Our algorithm
is the first streaming algorithm that can maintain the densest subgraph in
one pass. The previously best algorithm in this setting required
passes [Bahmani, Kumar and Vassilvitskii, VLDB'12]. The space required by our
algorithm is tight up to a polylogarithmic factor.
Comment: A preliminary version of this paper appeared in STOC 201
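The density objective described in this abstract can be made concrete on a small static graph. The sketch below is not the streaming algorithm from the paper. It uses the classic greedy peeling heuristic (repeatedly delete a minimum-degree node and remember the densest intermediate subgraph), which is known to give a 2-approximation in the static setting. The function names are hypothetical, the graph is assumed to be simple and undirected, and the minimum-degree scan is deliberately naive.

```python
from collections import defaultdict


def density(edges, nodes):
    """Edges with both endpoints in `nodes`, divided by the number of nodes."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return inside / len(nodes)


def greedy_peel(edges):
    """Static greedy peeling: returns (best_node_set, best_density)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    remaining = set(adj)
    if not remaining:
        return set(), 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) // 2   # current edge count
    best_nodes, best_density = set(remaining), m / len(remaining)
    while len(remaining) > 1:
        u = min(remaining, key=lambda x: len(adj[x]))  # a minimum-degree node
        m -= len(adj[u])
        for v in adj[u]:
            adj[v].discard(u)
        del adj[u]
        remaining.remove(u)
        d = m / len(remaining)
        if d > best_density:
            best_nodes, best_density = set(remaining), d
    return best_nodes, best_density
```

On a 4-clique with a two-edge path attached, peeling removes the path first and reports the clique, whose density is 6/4.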
Data Structures for Halfplane Proximity Queries and Incremental Voronoi Diagrams
We consider preprocessing a set $S$ of $n$ points in convex position in the
plane into a data structure supporting queries of the following form: given a
point $q$ and a directed line $\ell$ in the plane, report the point of $S$ that
is farthest from (or, alternatively, nearest to) the point $q$ among all points
to the left of line $\ell$. We present two data structures for this problem.
The first data structure uses space and preprocessing
time, and answers queries in time, for any . The second data structure uses space and
polynomial preprocessing time, and answers queries in time. These
are the first solutions to the problem with query time and
space.
The second data structure uses a new representation of nearest- and
farthest-point Voronoi diagrams of points in convex position. This
representation supports the insertion of new points in clockwise order using
only amortized pointer changes, in addition to -time
point-location queries, even though every such update may make
combinatorial changes to the Voronoi diagram. This data structure is the first
demonstration that deterministically and incrementally constructed Voronoi
diagrams can be maintained in amortized pointer changes per operation
while keeping -time point-location queries.
Comment: 17 pages, 6 figures. Various small improvements. To appear in
Algorithmic
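The query semantics in this abstract can be pinned down with a linear-scan baseline. This is only a reference implementation of the query definition, not either of the data structures from the paper. The function names are hypothetical, the directed line is assumed to be given by two points `a` and `b` (directed from `a` to `b`), and points exactly on the line are treated as not being to its left.

```python
from math import hypot


def left_of(a, b, p):
    """True if p lies strictly to the left of the directed line from a to b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return cross > 0


def halfplane_query(points, q, a, b, farthest=True):
    """Among the points left of line a->b, return the one farthest from q.

    With farthest=False the nearest such point is returned instead.
    Returns None when no point lies to the left of the line.
    """
    candidates = [p for p in points if left_of(a, b, p)]
    if not candidates:
        return None
    pick = max if farthest else min
    return pick(candidates, key=lambda p: hypot(p[0] - q[0], p[1] - q[1]))
```

Each query scans every point, so this baseline takes linear time per query, which is the cost a preprocessed structure is meant to avoid.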
Linear-Time Algorithms for Computing Maximum-Density Sequence Segments with Bioinformatics Applications
We study an abstract optimization problem arising from biomolecular sequence
analysis. For a sequence A of pairs (a_i,w_i) for i = 1,..,n and w_i>0, a
segment A(i,j) is a consecutive subsequence of A starting with index i and
ending with index j. The width of A(i,j) is w(i,j) = sum_{i <= k <= j} w_k, and
the density is (sum_{i<= k <= j} a_k)/ w(i,j). The maximum-density segment
problem takes A and two values L and U as input and asks for a segment of A
with the largest possible density among those of width at least L and at most
U. When U is unbounded, we provide a relatively simple, O(n)-time algorithm,
improving upon the O(n \log L)-time algorithm by Lin, Jiang and Chao. When both
L and U are specified, there are no previous nontrivial results. We solve the
problem in O(n) time if w_i=1 for all i, and more generally in
O(n+n\log(U-L+1)) time when w_i>=1 for all i.
Comment: 23 pages, 13 figures. A significant portion of these results appeared
under the title, "Fast Algorithms for Finding Maximum-Density Segments of a
Sequence with Applications to Bioinformatics," in Proceedings of the Second
Workshop on Algorithms in Bioinformatics (WABI), volume 2452 of Lecture Notes
in Computer Science (Springer-Verlag, Berlin), R. Guigo and D. Gusfield
editors, 2002, pp. 157--17
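The problem statement in this abstract translates directly into a brute-force baseline. The sketch below is not the linear-time algorithm from the paper. It checks every segment using prefix sums, so it takes quadratic time in the worst case, and is only meant to make the definitions of width and density concrete. The function name is hypothetical, and indices in the result are 0-based, whereas the abstract numbers elements from 1.

```python
from itertools import accumulate


def max_density_segment(a, w, L, U=float("inf")):
    """Return (density, i, j) for the densest segment with L <= width <= U.

    a[k] and w[k] > 0 are the value and width of element k. Returns None
    if no segment satisfies the width constraints.
    """
    n = len(a)
    pa = [0, *accumulate(a)]   # pa[k] = a[0] + ... + a[k-1]
    pw = [0, *accumulate(w)]   # pw[k] = w[0] + ... + w[k-1]
    best = None
    for i in range(n):
        for j in range(i, n):
            width = pw[j + 1] - pw[i]
            if width > U:
                break          # widths only grow with j because every w[k] > 0
            if width >= L:
                d = (pa[j + 1] - pa[i]) / width
                if best is None or d > best[0]:
                    best = (d, i, j)
    return best
```

For example, with values `[1, 5, 3, -2, 4]`, unit widths, `L = 2`, and `U = 3`, the densest feasible segment covers the values 5 and 3.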