Deterministic and Probabilistic Binary Search in Graphs
We consider the following natural generalization of Binary Search: in a given
undirected, positively weighted graph, one vertex is a target. The algorithm's
task is to identify the target by adaptively querying vertices. In response to
querying a node $q$, the algorithm learns either that $q$ is the target, or is
given an edge out of $q$ that lies on a shortest path from $q$ to the target.
We study this problem in a general noisy model in which each query
independently receives a correct answer with probability $p > \frac{1}{2}$ (a
known constant), and an (adversarial) incorrect one with probability $1-p$.
Our main positive result is that when $p = 1$ (i.e., all answers are
correct), $\log_2 n$ queries are always sufficient. For general $p$, we give an
(almost information-theoretically optimal) algorithm that uses, in expectation,
no more than $(1 - \delta)\frac{\log_2 n}{1 - H(p)} + o(\log n) + O(\log^2(1/\delta))$ queries, and identifies the target correctly with probability at
least $1 - \delta$. Here, $H(p) = -(p \log p + (1-p) \log(1-p))$ denotes the
entropy. The first bound is achieved by the algorithm that iteratively queries
a 1-median of the nodes not ruled out yet; the second bound by careful repeated
invocations of a multiplicative weights algorithm.
Even for $p = 1$, we show several hardness results for the problem of
determining whether a target can be found using $K$ queries. Our upper bound of
$\log_2 n$ implies a quasipolynomial-time algorithm for undirected connected
graphs; we show that this is best-possible under the Strong Exponential Time
Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs
with non-uniform node querying costs, the problem is PSPACE-complete. For a
semi-adaptive version, in which one may query $r$ nodes each in $k$ rounds, we
show membership in $\Sigma_{2k-1}$ in the polynomial hierarchy, and hardness
for $\Sigma_{2k-5}$.
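The noiseless strategy described in this abstract (repeatedly query a 1-median of the vertices not yet ruled out) can be sketched as follows. This is a minimal illustration, not the authors' code: the names `dijkstra`, `make_oracle`, and `find_target` are hypothetical, the graph is assumed to be connected, undirected, and given as a dict of dicts of positive edge weights, and node labels are assumed to be mutually comparable so they can sit in the heap.

```python
import heapq

EPS = 1e-9  # tolerance for comparing path lengths


def dijkstra(graph, source):
    """Shortest-path distances from source in a positively weighted graph."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def make_oracle(graph, target):
    """Noiseless oracle: None if q is the target, otherwise a neighbor of q
    that lies on a shortest path from q to the target."""
    to_target = dijkstra(graph, target)  # undirected, so d(v, t) = d(t, v)

    def oracle(q):
        if q == target:
            return None
        for v, w in graph[q].items():
            if abs(w + to_target[v] - to_target[q]) <= EPS:
                return v

    return oracle


def find_target(graph, oracle):
    """Query a 1-median of the remaining candidates until one is left.
    Returns (target, number_of_queries)."""
    dist = {v: dijkstra(graph, v) for v in graph}
    candidates = set(graph)
    queries = 0
    while len(candidates) > 1:
        # 1-median: vertex minimizing the total distance to all candidates.
        q = min(graph, key=lambda v: sum(dist[v][c] for c in candidates))
        queries += 1
        answer = oracle(q)
        if answer is None:
            return q, queries
        # Keep only candidates reachable from q by a shortest path
        # that starts with the edge (q, answer).
        candidates = {
            c for c in candidates
            if abs(graph[q][answer] + dist[answer][c] - dist[q][c]) <= EPS
        }
    return candidates.pop(), queries
```

On an unweighted path with 8 vertices, for example, this sketch identifies any target with at most 3 queries, in line with the $\log_2 n$ bound stated above. It precomputes all-pairs distances for clarity rather than efficiency.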
Correlation Clustering with Same-Cluster Queries Bounded by Optimal Cost
Several clustering frameworks with interactive (semi-supervised) queries have been studied in the past. Recently, clustering with same-cluster queries has become popular. An algorithm in this setting has access to an oracle with full knowledge of an optimal clustering, and the algorithm can ask the oracle queries of the form, "Does the optimal clustering put vertices u and v in the same cluster?" Due to its simplicity, this querying model can easily be implemented in real crowd-sourcing platforms and has attracted a lot of recent work.
In this paper, we study the popular correlation clustering problem (Bansal et al., 2002) under the same-cluster querying framework. Given a complete graph G=(V,E) with positive and negative edge labels, the correlation clustering objective aims to compute a graph clustering that minimizes the total number of disagreements, that is, the negative intra-cluster edges and positive inter-cluster edges. In a recent work, Ailon et al. (2018b) provided an algorithm that approximates the correlation clustering objective within a factor of (1+epsilon) using O((k^{14} log{n} log{k})/epsilon^6) queries when the number of clusters, k, is fixed. For many applications, k is not fixed and can grow with |V|. Moreover, the k^{14} dependence of the query complexity renders the algorithm impractical even for datasets with small values of k.
In this paper, we take a different approach. Let C_{OPT} be the number of disagreements made by the optimal clustering. We present algorithms for correlation clustering whose error and query bounds are parameterized by C_{OPT} rather than by the number of clusters. Indeed, a good clustering must have small C_{OPT}. Specifically, we present an efficient algorithm that recovers an exact optimal clustering using at most 2C_{OPT} queries and an efficient algorithm that outputs a 2-approximation using at most C_{OPT} queries. In addition, we show that, under a plausible complexity assumption, no polynomial-time algorithm using o(C_{OPT}) queries achieves an approximation ratio better than 1+alpha for an absolute constant alpha > 0. Therefore, our first algorithm achieves the optimal query bound within a factor of 2.
We extensively evaluate our methods on several synthetic and real-world datasets using real crowd-sourced oracles. Moreover, we compare our approach against known correlation clustering algorithms that do not perform querying. In all cases, our algorithms exhibit superior performance.
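To make the objective in this abstract concrete, the following sketch counts the disagreements of a given clustering on a complete signed graph. It illustrates only the cost function (the quantity whose optimum is C_{OPT}), not the paper's query algorithms. The function name `count_disagreements` and the input encoding (a set of positive pairs, with every other pair treated as negative) are assumptions made for the example.

```python
from itertools import combinations


def count_disagreements(vertices, positive_edges, cluster_of):
    """Number of negative intra-cluster pairs plus positive inter-cluster pairs.

    vertices:       iterable of vertex labels
    positive_edges: set of frozensets {u, v}; all other pairs are negative
    cluster_of:     dict mapping each vertex to its cluster id
    """
    disagreements = 0
    for u, v in combinations(vertices, 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        positive = frozenset((u, v)) in positive_edges
        # A pair disagrees when its label contradicts the clustering.
        if same_cluster != positive:
            disagreements += 1
    return disagreements
```

For instance, with vertices 0..3 and positive pairs {0,1}, {1,2}, {2,3}, the clustering {0,1} | {2,3} has one disagreement (the positive pair {1,2} is split), while putting all four vertices in one cluster has three.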
The Query-commit Problem
In the query-commit problem we are given a graph where edges have distinct
probabilities of existing. It is possible to query the edges of the graph, and
if the queried edge exists then its endpoints are irrevocably matched. The goal
is to find a querying strategy which maximizes the expected size of the
matching obtained. This stochastic matching setup is motivated by applications
in kidney exchanges and online dating.
In this paper we address the query-commit problem from both theoretical and
experimental perspectives. First, we show that a simple class of edges can be
queried without compromising the optimality of the strategy. This property is
then used to obtain in polynomial time an optimal querying strategy when the
input graph is sparse. Next we turn our attention to the kidney exchange
application, focusing on instances modeled over real data from existing
exchange programs. We prove that, as the number of nodes grows, almost every
instance admits a strategy which matches almost all nodes. This result supports
the intuition that more exchanges are possible on a larger pool of
patient/donors and gives theoretical justification for unifying the existing
exchange programs. Finally, we evaluate experimentally different querying
strategies over kidney exchange instances. We show that even very simple
heuristics perform fairly well, being within 1.5% of an optimal clairvoyant
strategy that knows in advance the edges in the graph. In such a
time-sensitive application, this result motivates the use of committing
strategies.
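The query-commit process itself is easy to simulate. The sketch below runs one simple committing strategy: query edges in decreasing order of existence probability and skip any edge with an already matched endpoint. This ordering is a hypothetical heuristic chosen for illustration; the abstract does not say which heuristics were evaluated, and the name `query_commit_greedy` is likewise an assumption.

```python
import random


def query_commit_greedy(edges, rng):
    """Simulate one run of a greedy query-commit strategy.

    edges: list of (u, v, p), where p is the probability that edge (u, v) exists
    rng:   a random.Random instance used to sample edge existence
    Returns the list of matched pairs.
    """
    matched = set()
    matching = []
    # Hypothetical heuristic: try the most likely edges first.
    for u, v, p in sorted(edges, key=lambda e: -e[2]):
        if u in matched or v in matched:
            continue  # an endpoint is already irrevocably matched
        if rng.random() < p:  # the query reveals that the edge exists
            matched.update((u, v))
            matching.append((u, v))
    return matching
```

Averaging `len(query_commit_greedy(edges, rng))` over many runs gives an estimate of the expected matching size for this particular strategy on a given instance.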
Lower Bounds in the Preprocessing and Query Phases of Routing Algorithms
In the last decade, there has been a substantial amount of research in
finding routing algorithms designed specifically to run on real-world graphs.
In 2010, Abraham et al. showed upper bounds on the query time in terms of a
graph's highway dimension and diameter for the current fastest routing
algorithms, including contraction hierarchies, transit node routing, and hub
labeling. In this paper, we show corresponding lower bounds for the same three
algorithms. We also show how to improve a result by Milosavljevic which lower
bounds the number of shortcuts added in the preprocessing stage for contraction
hierarchies. We relax the assumption of an optimal contraction order (which is
NP-hard to compute), allowing the result to be applicable to real-world
instances. Finally, we give a proof that optimal preprocessing for hub labeling
is NP-hard. Hardness of optimal preprocessing is known for most routing
algorithms, and was suspected to be true for hub labeling.
Improved Parallel Algorithms for Spanners and Hopsets
We use exponential start time clustering to design faster and more
work-efficient parallel graph algorithms involving distances. Previous
algorithms usually rely on graph decomposition routines with strict
restrictions on the diameters of the decomposed pieces. We weaken these bounds
in favor of stronger local probabilistic guarantees. This allows more direct
analyses of the overall process, giving:
* Linear work parallel algorithms that construct spanners with $O(k)$ stretch and size $O(n^{1+1/k})$ in unweighted graphs, and size $O(n^{1+1/k} \log k)$ in weighted graphs.
* Hopsets that lead to the first parallel algorithm for approximating shortest paths in undirected graphs with $O(m\,\mathrm{polylog}\,n)$ work.
Complexity of coalition structure generation
We revisit the coalition structure generation problem in which the goal is to
partition the players into exhaustive and disjoint coalitions so as to maximize
the social welfare. One of our key results is a general polynomial-time
algorithm to solve the problem for all coalitional games provided that player
types are known and the number of player types is bounded by a constant. As a
corollary, we obtain a polynomial-time algorithm to compute an optimal
partition for weighted voting games with a constant number of weight values and
for coalitional skill games with a constant number of skills. We also consider
well-studied and well-motivated coalitional games defined compactly on
combinatorial domains. For these games, we characterize the complexity of
computing an optimal coalition structure by presenting polynomial-time
algorithms, approximation algorithms, or NP-hardness and inapproximability
lower bounds.
Comment: 17 pages