Deterministic and Probabilistic Binary Search in Graphs
We consider the following natural generalization of Binary Search: in a given
undirected, positively weighted graph, one vertex is a target. The algorithm's
task is to identify the target by adaptively querying vertices. In response to
querying a node q, the algorithm learns either that q is the target, or is
given an edge out of q that lies on a shortest path from q to the target.
We study this problem in a general noisy model in which each query
independently receives a correct answer with probability p > 1/2 (a
known constant), and an (adversarial) incorrect one with probability 1 - p.
Our main positive result is that when p = 1 (i.e., all answers are
correct), log_2 n queries are always sufficient. For general p, we give an
(almost information-theoretically optimal) algorithm that uses, in expectation,
no more than (1 - δ) (log_2 n)/(1 - H(p)) + o(log n) + O(log^2(1/δ)) queries,
and identifies the target correctly with probability at
least 1 - δ. Here, H(p) = -(p log p + (1 - p) log(1 - p)) denotes the
entropy. The first bound is achieved by the algorithm that iteratively queries
a 1-median of the nodes not ruled out yet; the second bound by careful repeated
invocations of a multiplicative weights algorithm.
Even for p = 1, we show several hardness results for the problem of
determining whether a target can be found using K queries. Our upper bound of
log_2 n implies a quasipolynomial-time algorithm for undirected connected
graphs; we show that this is best-possible under the Strong Exponential Time
Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs
with non-uniform node querying costs, the problem is PSPACE-complete. For a
semi-adaptive version, in which one may query r nodes each in k rounds, we
show membership in Σ_{2k-1} in the polynomial hierarchy, and hardness
for Σ_{2k-5}
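The noiseless strategy described in this abstract (repeatedly query a 1-median of the vertices not yet ruled out) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the names `dijkstra`, `make_oracle` and `find_target` are hypothetical, the graph is assumed connected, and integer edge weights are assumed so that distances can be compared exactly.

```python
import heapq


def dijkstra(adj, src):
    """Single-source shortest-path distances in a positively weighted graph."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def make_oracle(adj, dist, target):
    """Hypothetical noiseless oracle: None if q is the target, otherwise a
    neighbour of q that lies on some shortest path from q to the target."""
    def oracle(q):
        if q == target:
            return None
        for u, w in adj[q].items():
            if w + dist[u][target] == dist[q][target]:
                return u
    return oracle


def find_target(adj, dist, oracle):
    """Query a 1-median of the remaining candidates until the target is hit."""
    candidates = set(adj)
    queries = 0
    while True:
        # 1-median: vertex minimising the total distance to all candidates.
        q = min(adj, key=lambda x: sum(dist[x][v] for v in candidates))
        queries += 1
        u = oracle(q)
        if u is None:
            return q, queries
        w = adj[q][u]
        # Keep only candidates for which edge (q, u) starts a shortest path.
        candidates = {v for v in candidates
                      if v != q and w + dist[u][v] == dist[q][v]}


# Usage on a path with 8 vertices and unit weights.
adj = {i: {} for i in range(8)}
for i in range(7):
    adj[i][i + 1] = 1
    adj[i + 1][i] = 1
dist = {v: dijkstra(adj, v) for v in adj}
found, queries = find_target(adj, dist, make_oracle(adj, dist, 7))
```

The sketch counts the final confirming query as well, so its query count can exceed log_2 n by one.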
S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification
This paper investigates the problem of active learning for binary label
prediction on a graph. We introduce a simple and label-efficient algorithm
called S2 for this task. At each step, S2 selects the vertex to be labeled
based on the structure of the graph and all previously gathered labels.
Specifically, S2 queries for the label of the vertex that bisects the *shortest
shortest* path between any pair of oppositely labeled vertices. We present a
theoretical estimate of the number of queries S2 needs in terms of a novel
parametrization of the complexity of binary functions on graphs. We also
present experimental results demonstrating the performance of S2 on both real
and synthetic data. While other graph-based active learning algorithms have
shown promise in practice, our algorithm is the first with both good
performance and theoretical guarantees. Finally, we demonstrate the
implications of the S2 algorithm to the theory of nonparametric active
learning. In particular, we show that S2 achieves near minimax optimal excess
risk for an important class of nonparametric classification problems.
Comment: A version of this paper appears in the Conference on Learning Theory
(COLT) 201
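The bisection step described in this abstract (query the vertex that bisects the shortest path between oppositely labeled vertices) can be sketched as below. This is a simplified illustration rather than the S2 algorithm as published: it covers only the selection of the next vertex, and it assumes an unweighted graph and labels in {+1, -1}. What to do when no such path exists, or when the shortest path is a single edge, is left to the caller. The names `bfs_path` and `s2_next_query` are hypothetical.

```python
from collections import deque


def bfs_path(adj, src, goals):
    """Shortest path (in hops) from src to the nearest vertex in goals, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u in goals:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


def s2_next_query(adj, labels):
    """Return the midpoint of the shortest path joining a +1 vertex to a -1
    vertex, or None if no such path has an unlabeled interior vertex."""
    pos = {v for v, y in labels.items() if y == 1}
    neg = {v for v, y in labels.items() if y == -1}
    best = None
    for s in pos:
        path = bfs_path(adj, s, neg)
        if path is not None and (best is None or len(path) < len(best)):
            best = path
    if best is None or len(best) <= 2:
        return None  # no path, or the endpoints are already adjacent
    return best[len(best) // 2]


# Usage on a path 0-1-...-6 with the two end vertices labeled.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
next_vertex = s2_next_query(adj, {0: 1, 6: -1})
```

Each answered query either shortens the shortest oppositely labeled path or exposes an edge whose endpoints disagree, which is the bisection behaviour the abstract refers to.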
Simpler, faster and shorter labels for distances in graphs
We consider how to assign labels to any undirected graph with n nodes such
that, given the labels of two nodes and no other information regarding the
graph, it is possible to determine the distance between the two nodes. The
challenge in such a distance labeling scheme is primarily to minimize the
maximum label length and secondarily to minimize the time needed to answer
distance queries (decoding). Previous schemes have offered different trade-offs
between label lengths and query time. This paper presents a simple algorithm
with shorter labels and shorter query time than any previous solution, thereby
improving the state-of-the-art with respect to both label length and query time
in one single algorithm. Our solution addresses several open problems
concerning label length and decoding time and is the first improvement of label
length for more than three decades.
More specifically, we present a distance labeling scheme with label size (log
3)n/2 + o(n) (logarithms are in base 2) and O(1) decoding time. This outperforms
all existing results with respect to both size and decoding time, including
Winkler's (Combinatorica 1983) decade-old result, which uses labels of size
(log 3)n and O(n/log n) decoding time, and Gavoille et al. (SODA'01), which
uses labels of size 11n + o(n) and O(log log n) decoding time. In addition, our
algorithm is simpler than the previous ones. In the case of integral edge
weights of size at most W, we present almost matching upper and lower bounds
for label sizes. For r-additive approximation schemes, where distances can be
off by an additive constant r, we give both upper and lower bounds. In
particular, we present an upper bound for 1-additive approximation schemes
which, in the unweighted case, has the same size (ignoring second order terms)
as an adjacency scheme: n/2. We also give results for bipartite graphs and for
exact and 1-additive distance oracles
A Linear-Size Logarithmic Stretch Path-Reporting Distance Oracle for General Graphs
In 2001 Thorup and Zwick devised a distance oracle, which given an n-vertex
undirected graph and a parameter k, has size O(k n^{1+1/k}). Upon a query (u,v)
their oracle constructs a (2k-1)-approximate path Π between
u and v. The query time of the Thorup-Zwick oracle is O(k), and it was
subsequently improved to O(1) by Chechik. A major drawback of the oracle of
Thorup and Zwick is that its space is Ω(n log n). Mendel and Naor
devised an oracle with space O(n^{1+1/k}) and stretch O(k), but their
oracle can only report distance estimates and not actual paths. In this paper
we devise a path-reporting distance oracle with size O(n^{1+1/k}), stretch
O(k) and query time O(n^ε), for an arbitrarily small ε > 0.
In particular, our oracle can provide logarithmic stretch using linear size.
Another variant of our oracle has size O(n log log n), polylogarithmic
stretch, and query time O(log log n).
For unweighted graphs we devise a distance oracle with multiplicative stretch
O(1), additive stretch O(β(k)), for a function β(·), space
O(n^{1+1/k} · β), and query time O(n^ε), for an arbitrarily
small constant ε > 0. The tradeoff between multiplicative stretch and
size in these oracles is far below the girth conjecture threshold (which is stretch
2k-1 and size O(n^{1+1/k})). Breaking the girth conjecture tradeoff is
achieved by exhibiting a tradeoff of a different nature between additive stretch
β(k) and size O(n^{1+1/k}). A similar type of tradeoff was exhibited by
a construction of (1+ε,β)-spanners due to Elkin and Peleg.
However, so far (1+ε,β)-spanners had no counterpart in the
distance oracles' world.
An important novel tool that we develop on the way to these results is a
distance-preserving path-reporting oracle
A Framework for Searching in Graphs in the Presence of Errors
We consider a problem of searching for an unknown target vertex t in a (possibly edge-weighted) graph. Each vertex-query points to a vertex v and the response either admits that v is the target or provides any neighbor s of v that lies on a shortest path from v to t. This model has been introduced for trees by Onak and Parys [FOCS 2006] and for general graphs by Emamjomeh-Zadeh et al. [STOC 2016]. In the latter, the authors provide algorithms for the error-less case and for the independent noise model (where each query independently receives an erroneous answer with known probability p<1/2 and a correct one with probability 1-p).
We study this problem in both the adversarial-error and the independent-noise models. First, we show an algorithm that needs at most (log_2 n)/(1 - H(r)) queries in the case of adversarial errors, where the adversary's error rate is bounded by a known constant r < 1/2. Our algorithm is in fact a simplification of previous work, and our refinement lies in invoking an amortization argument. We then show that our algorithm coupled with a Chernoff bound argument leads to a simpler algorithm for the independent noise model and has a query complexity that is both simpler and asymptotically better than that of Emamjomeh-Zadeh et al. [STOC 2016].
Our approach has a wide range of applications. First, it improves and simplifies the Robust Interactive Learning framework proposed by Emamjomeh-Zadeh and Kempe [NIPS 2017]. Secondly, performing analogous analysis for edge-queries (where a query to an edge e returns its endpoint that is closer to the target) we actually recover (as a special case) a noisy binary search algorithm that is asymptotically optimal, matching the complexity of Feige et al. [SIAM J. Comput. 1994]. Thirdly, we improve and simplify upon an algorithm for searching of unbounded domains due to Aslam and Dhagat [STOC 1991]
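To give a feel for the adversarial-error bound (log_2 n)/(1 - H(r)) quoted in this abstract, the short sketch below evaluates it numerically. It assumes that H denotes the binary entropy function measured in bits, which the abstract does not state explicitly; the function names `binary_entropy` and `adversarial_query_bound` are hypothetical.

```python
import math


def binary_entropy(r):
    """Binary entropy in bits, with the convention 0 * log(0) = 0."""
    if r in (0.0, 1.0):
        return 0.0
    return -(r * math.log2(r) + (1 - r) * math.log2(1 - r))


def adversarial_query_bound(n, r):
    """Evaluate (log_2 n) / (1 - H(r)) for an error rate 0 <= r < 1/2."""
    if not 0 <= r < 0.5:
        raise ValueError("error rate r must satisfy 0 <= r < 1/2")
    return math.log2(n) / (1 - binary_entropy(r))


# With no errors the bound reduces to log_2 n; it grows as r approaches 1/2.
no_errors = adversarial_query_bound(1024, 0.0)
quarter_errors = adversarial_query_bound(1024, 0.25)
```

The denominator 1 - H(r) tends to zero as r approaches 1/2, so the bound grows without limit in that regime.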
Reasoning & Querying – State of the Art
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet where keyword search is used in many applications, e.g. search engines, has familiarized casual users with using keyword queries to retrieve information on the internet. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming at enabling simple querying of semi-structured data, which is relevant e.g. in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF
Four Lessons in Versatility or How Query Languages Adapt to the Web
Exposing not only human-centered information, but machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled new kinds of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposition has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, again others JSON or OWL, for their data, even in similar domains. This fracturing stifles innovation as application builders have to cope not only with one Web stack (e.g., XML technology) but with several ones, each of considerable complexity. With Xcerpt we have developed a rule- and pattern-based query language that aims to shield application builders from much of this complexity: In a single query language XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply to querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards a more convenient, yet highly efficient data access in a “Web of Data”