Finding $k$ Simple Shortest Paths and Cycles
The problem of finding multiple simple shortest paths in a weighted directed
graph has many applications, and is considerably more difficult than
the corresponding problem when cycles are allowed in the paths. Even for a
single source-sink pair, it is known that two simple shortest paths cannot be
found in time polynomially smaller than $n^3$ (where $n=|V|$) unless the
All-Pairs Shortest Paths problem can be solved in a similar time bound. The
latter is a well-known open problem in algorithm design. We consider the
all-pairs version of the problem, and we give a new algorithm to find $k$
simple shortest paths for all pairs of vertices. For $k=2$, our algorithm runs
in $O(mn + n^2 \log n)$ time (where $m=|E|$), which is almost the same bound as
for the single pair case, and for $k=3$ we improve earlier bounds. Our approach
is based on forming suitable path extensions to find simple shortest paths;
this method is different from the `detour finding' technique used in most of
the prior work on simple shortest paths, replacement paths, and distance
sensitivity oracles.
Enumerating simple cycles is a well-studied classical problem. We present new
algorithms for generating simple cycles and simple paths in the graph in
non-decreasing order of their weights; the algorithm for generating simple
paths is much faster, and uses another variant of path extensions. We also give
hardness results for sparse graphs, relative to the complexity of computing a
minimum weight cycle in a graph, for several variants of problems related to
finding simple paths and cycles.
Comment: The current version includes new results for undirected graphs. In
Section 4, the notion of an (m,n) reduction is generalized to an f(m,n)
reduction.
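As a point of reference for the abstract above, here is a minimal baseline sketch that generates simple source-to-target paths in non-decreasing order of weight. It is a plain best-first search over partial simple paths, not the path-extension algorithm of the paper, and it can take exponential time. The function name, the adjacency-dict input format, and the assumption of non-negative edge weights are all illustrative choices.

```python
import heapq
from itertools import count


def k_simple_shortest_paths(graph, source, target, k):
    """Return up to k simple source-target paths, lightest first.

    graph: dict mapping node -> list of (neighbor, weight) pairs.
    Assumes non-negative weights (an assumption of this sketch).
    """
    tie = count()  # tie-breaker so the heap never compares paths directly
    heap = [(0, next(tie), [source])]
    found = []
    while heap and len(found) < k:
        weight, _, path = heapq.heappop(heap)
        last = path[-1]
        if last == target:
            # With non-negative weights, no unexplored path can be lighter.
            found.append((weight, path))
            continue
        for nxt, w in graph.get(last, []):
            if nxt not in path:  # keep the path simple
                heapq.heappush(heap, (weight + w, next(tie), path + [nxt]))
    return found


example = {
    "s": [("a", 1), ("b", 4)],
    "a": [("b", 1), ("t", 5)],
    "b": [("t", 1), ("a", 1)],
    "t": [],
}
paths = k_simple_shortest_paths(example, "s", "t", 3)
```

Because every partial path in the heap is a lower bound on the weight of its extensions, complete paths leave the heap in non-decreasing weight order.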
Simplifying and Unifying Replacement Paths Algorithms in Weighted Directed Graphs
In the replacement paths (RP) problem we are given a graph G and a shortest path P between two nodes s and t. The goal is to find, for every edge e ∈ P, a shortest path from s to t that avoids e. The first result of this paper is a simple reduction from the RP problem to the problem of computing shortest cycles for all nodes on a shortest path.
Using this simple reduction we unify and greatly simplify two state-of-the-art solutions for two different well-studied variants of the RP problem.
In the first variant (algebraic) we show that by using at most n queries to the Yuster-Zwick distance oracle [FOCS 2005], one can solve the RP problem for a given directed graph with integer edge weights in the range [-M,M] in Õ(M n^ω) time. This improves the running time of the state-of-the-art algorithm of Vassilevska Williams [SODA 2011] by a factor of log^6 n.
In the second variant (planar) we show that by using the algorithm of Klein for the multiple-source shortest paths problem (MSSP) [SODA 2005] one can solve the RP problem for directed planar graphs with non-negative edge weights in O(n log n) time. This matches the state-of-the-art algorithm of Wulff-Nilsen [SODA 2010], but with an arguably much simpler algorithm and analysis.
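To make the RP problem statement concrete, the following is a naive baseline sketch: for each edge e on the given shortest path P, it removes e and reruns Dijkstra from s to t. This is not the reduction to shortest cycles described above. The function names, the adjacency-dict input format, and the assumptions of non-negative weights and no parallel edges are illustrative.

```python
import heapq


def shortest_distance(graph, source, target, banned_edge=None):
    """Dijkstra distance from source to target, skipping banned_edge.

    graph: dict mapping node -> list of (neighbor, weight) pairs.
    Assumes non-negative weights and no parallel edges (sketch assumptions).
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == target:
            return d
        for v, w in graph.get(u, []):
            if (u, v) == banned_edge:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def replacement_path_lengths(graph, path):
    """Map each edge of path to the s-t distance once that edge is removed."""
    s, t = path[0], path[-1]
    return {
        (u, v): shortest_distance(graph, s, t, banned_edge=(u, v))
        for u, v in zip(path, path[1:])
    }


example = {
    "s": [("a", 1), ("b", 3)],
    "a": [("t", 1), ("b", 1)],
    "b": [("t", 3)],
    "t": [],
}
lengths = replacement_path_lengths(example, ["s", "a", "t"])
```

This baseline performs one shortest-path computation per edge of P, which is the cost that faster RP algorithms aim to avoid.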
Replacement Paths via Row Minima of Concise Matrices
Matrix $M$ is $k$-concise if the finite entries of each column of $M$
consist of $k$ or fewer intervals of identical numbers. We give an $O(n+m)$-time
algorithm to compute the row minima of any $O(1)$-concise $n \times m$ matrix.
Our algorithm yields the first $O(n+m)$-time reductions from the
replacement-paths problem on an $n$-node $m$-edge undirected graph
(respectively, directed acyclic graph) to the single-source shortest-paths
problem on an $O(n)$-node $O(m)$-edge undirected graph (respectively, directed
acyclic graph). That is, we prove that the replacement-paths problem is no
harder than the single-source shortest-paths problem on undirected graphs and
directed acyclic graphs. Moreover, our linear-time reductions lead to the first
$O(n+m)$-time algorithms for the replacement-paths problem on the following
classes of $n$-node $m$-edge graphs: (1) undirected graphs in the word-RAM model
of computation, (2) undirected planar graphs, (3) undirected minor-closed
graphs, and (4) directed acyclic graphs.
Comment: 23 pages, 1 table, 9 figures, accepted to SIAM Journal on Discrete
Mathematics
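The concise-matrix definition above can be illustrated with a small baseline sketch. Each column is stored as a list of intervals of identical finite values, and row minima are computed by sweeping the rows with a heap. This is a simple heap-based sweep with a logarithmic overhead, not the linear-time algorithm of the paper. The interval representation `(first_row, last_row, value)` and the function name are assumptions made for illustration.

```python
import heapq


def row_minima(num_rows, columns):
    """Row minima of a matrix given in concise (interval) form.

    columns: list of columns; each column is a list of
    (first_row, last_row, value) triples with inclusive row bounds.
    Entries not covered by any interval are treated as infinite.
    """
    starts = [[] for _ in range(num_rows)]
    for column in columns:
        for first, last, value in column:
            starts[first].append((value, last))

    heap = []  # (value, last_row) of intervals that have started
    minima = []
    for row in range(num_rows):
        for item in starts[row]:
            heapq.heappush(heap, item)
        # Lazily discard intervals that ended before this row.
        while heap and heap[0][1] < row:
            heapq.heappop(heap)
        minima.append(heap[0][0] if heap else float("inf"))
    return minima


# A 5 x 2 matrix: column 0 is 2-concise, column 1 is 1-concise.
concise_columns = [
    [(0, 1, 5), (3, 3, 2)],
    [(1, 2, 3)],
]
result = row_minima(5, concise_columns)
```

Each interval is pushed and popped at most once, so the sweep costs O((n + I) log I) for n rows and I intervals in total.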
Sparse Fault-Tolerant BFS Trees
This paper addresses the problem of designing a sparse fault-tolerant
BFS tree, or FT-BFS tree for short, namely, a sparse subgraph $T$ of the
given network $G$ such that subsequent to the failure of a single edge or
vertex, the surviving part $T'$ of $T$ still contains a BFS spanning tree for
(the surviving part of) $G$. Our main results are as follows. We present an
algorithm that for every $n$-vertex graph $G$ and source node $s$ constructs a
(single edge failure) FT-BFS tree rooted at $s$ with $O(n \cdot
\min\{\mathrm{Depth}(s), \sqrt{n}\})$ edges, where $\mathrm{Depth}(s)$ is the depth of the BFS
tree rooted at $s$. This result is complemented by a matching lower bound,
showing that there exist $n$-vertex graphs with a source node $s$ for which any
edge (or vertex) FT-BFS tree rooted at $s$ has $\Omega(n^{3/2})$ edges. We then
consider fault-tolerant multi-source BFS trees, or FT-MBFS trees
for short, aiming to provide (following a failure) a BFS tree rooted at each
source $s \in S$ for some subset of sources $S \subseteq V$. Again, tight bounds
are provided, showing that there exists a poly-time algorithm that for every
$n$-vertex graph and source set $S \subseteq V$ of size $\sigma$ constructs a
(single failure) FT-MBFS tree $T^*(S)$ from each source $s_i \in S$, with
$O(\sqrt{\sigma} \cdot n^{3/2})$ edges, and on the other hand there exist
$n$-vertex graphs with source sets $S \subseteq V$ of cardinality $\sigma$, on
which any FT-MBFS tree from $S$ has $\Omega(\sqrt{\sigma} \cdot n^{3/2})$ edges.
Finally, we propose an $O(\log n)$ approximation algorithm for constructing
FT-BFS and FT-MBFS structures. The latter is complemented by a hardness result
stating that there exists no $\Omega(\log n)$ approximation algorithm for these
problems under standard complexity assumptions.
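The FT-BFS definition above can be checked on small inputs with the following sketch for single edge failures in an undirected, unweighted graph. It builds a structure by taking a BFS tree and adding, for each tree edge, a BFS tree of the graph without that edge, then verifies that all distances from the source are preserved under every single edge failure. This naive union carries no sparsity guarantee and is not the construction from the paper. All function names and the adjacency-dict format are assumptions.

```python
from collections import deque


def bfs_tree(adj, source, failed=None):
    """BFS from source, ignoring the failed edge; returns (dist, tree_edges)."""
    dist = {source: 0}
    edges = set()
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if failed is not None and frozenset((u, v)) == failed:
                continue
            if v not in dist:
                dist[v] = dist[u] + 1
                edges.add(frozenset((u, v)))
                queue.append(v)
    return dist, edges


def naive_ft_bfs(adj, source):
    """Union of a BFS tree and one replacement BFS tree per tree edge."""
    _, base = bfs_tree(adj, source)
    structure = set(base)
    for edge in base:
        # Failures of non-tree edges leave the base tree intact.
        _, tree = bfs_tree(adj, source, failed=edge)
        structure |= tree
    return structure


def is_ft_bfs(adj, source, structure):
    """Check that structure preserves BFS distances under any one edge failure."""
    sub = {u: [v for v in adj[u] if frozenset((u, v)) in structure] for u in adj}
    all_edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    return all(
        bfs_tree(adj, source, e)[0] == bfs_tree(sub, source, e)[0]
        for e in all_edges
    )


# A 4-cycle: 0 - 1 - 2 - 3 - 0.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
structure = naive_ft_bfs(cycle, 0)
plain_tree = bfs_tree(cycle, 0)[1]
```

On the 4-cycle, the plain BFS tree alone fails the check because losing a tree edge disconnects a vertex, while the union structure keeps all four edges.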
Unsupervised String Transformation Learning for Entity Consolidation
Data integration has been a long-standing challenge in data management with
many applications. A key step in data integration is entity consolidation. It
takes a collection of clusters of duplicate records as input and produces a
single "golden record" for each cluster, which contains the canonical value for
each attribute. Truth discovery and data fusion methods, as well as Master Data
Management (MDM) systems, can be used for entity consolidation. However, to
achieve better results, the variant values (i.e., values that are logically the
same with different formats) in the clusters need to be consolidated before
applying these methods.
For this purpose, we propose a data-driven method to standardize the variant
values based on two observations: (1) the variant values usually can be
transformed to the same representation (e.g., "Mary Lee" and "Lee, Mary") and
(2) the same transformation often appears repeatedly across different clusters
(e.g., transpose the first and last name). Our approach first uses an
unsupervised method to generate groups of value pairs that can be transformed
in the same way (i.e., they share a transformation). Then the groups are
presented to a human for verification and the approved ones are used to
standardize the data. In a real-world dataset with 17,497 records, our method
achieved 75% recall and 99.5% precision in standardizing variant values by
asking a human 100 yes/no questions, which completely outperformed a
state-of-the-art data wrangling tool.
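The idea of grouping value pairs that share a transformation can be sketched as follows. This toy version only handles token reorderings such as "Mary Lee" to "Lee, Mary": it rewrites the target value as a template over the positions of the source tokens and groups pairs by that template. It is an illustration under simplifying assumptions, not the unsupervised method proposed in the paper, and the function names are hypothetical.

```python
import re
from collections import defaultdict


def transformation_signature(source, target):
    """Describe target as a template over the token positions of source.

    Tokens of target that also occur in source are replaced by "{i}",
    where i is the index of the first matching source token.
    """
    tokens = re.findall(r"\w+", source)

    def to_slot(match):
        word = match.group(0)
        return "{%d}" % tokens.index(word) if word in tokens else word

    return re.sub(r"\w+", to_slot, target)


def group_by_transformation(pairs):
    """Group (source, target) value pairs that share the same signature."""
    groups = defaultdict(list)
    for source, target in pairs:
        groups[transformation_signature(source, target)].append((source, target))
    return dict(groups)


pairs = [
    ("Mary Lee", "Lee, Mary"),
    ("John Smith", "Smith, John"),
    ("Bob Ray", "Bob Ray"),
]
groups = group_by_transformation(pairs)
```

Each resulting group could then be shown to a reviewer as a single yes/no question, in the spirit of the verification step described in the abstract.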