Deterministic and Probabilistic Binary Search in Graphs
We consider the following natural generalization of Binary Search: in a given
undirected, positively weighted graph, one vertex is a target. The algorithm's
task is to identify the target by adaptively querying vertices. In response to
querying a node $q$, the algorithm learns either that $q$ is the target, or is
given an edge out of $q$ that lies on a shortest path from $q$ to the target.
We study this problem in a general noisy model in which each query
independently receives a correct answer with probability $p > \frac{1}{2}$ (a
known constant), and an (adversarial) incorrect one with probability $1-p$.
Our main positive result is that when $p = 1$ (i.e., all answers are
correct), $\log_2 n$ queries are always sufficient. For general $p$, we give an
(almost information-theoretically optimal) algorithm that uses, in expectation,
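no more than $(1 - \delta)\frac{\log_2 n}{1 - H(p)} + o(\log n) + O(\log^2 (1/\delta))$
queries, and identifies the target correctly with probability at
least $1-\delta$. Here, $H(p) = -(p \log p + (1-p) \log(1-p))$ denotes the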
entropy. The first bound is achieved by the algorithm that iteratively queries
a 1-median of the nodes not ruled out yet; the second bound by careful repeated
invocations of a multiplicative weights algorithm.
Even for $p = 1$, we show several hardness results for the problem of
determining whether a target can be found using $K$ queries. Our upper bound of
$\log_2 n$ implies a quasipolynomial-time algorithm for undirected connected
graphs; we show that this is best-possible under the Strong Exponential Time
Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs
with non-uniform node querying costs, the problem is PSPACE-complete. For a
semi-adaptive version, in which one may query $r$ nodes each in $k$ rounds, we
show membership in $\Sigma_{2k-1}$ in the polynomial hierarchy, and hardness
for $\Sigma_{2k-5}$.
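As an illustration of the noiseless case (all answers correct), the strategy of repeatedly querying a 1-median of the nodes not yet ruled out might be sketched as follows. This is a minimal sketch, not the authors' implementation: the names `dijkstra` and `find_target`, the adjacency-dict graph format, the simulated oracle, and the restriction to integer edge weights (so that distance comparisons are exact) are all assumptions made for the example.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source; graph[u] = {v: positive int weight}."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def find_target(graph, target):
    """Locate `target` by querying a 1-median of the remaining candidates.

    Assumes an undirected, connected graph with positive integer weights.
    Returns (node_found, number_of_queries).
    """
    dist = {v: dijkstra(graph, v) for v in graph}  # all-pairs distances

    def oracle(q):
        # Simulated noiseless oracle (hypothetical helper): None means
        # "q is the target"; otherwise a neighbour of q such that the edge
        # (q, neighbour) lies on a shortest path from q to the target.
        if q == target:
            return None
        for v, w in graph[q].items():
            if w + dist[v][target] == dist[q][target]:
                return v

    candidates = set(graph)
    queries = 0
    while len(candidates) > 1:
        # 1-median: the node minimising total distance to all candidates.
        q = min(graph, key=lambda v: sum(dist[v][u] for u in candidates))
        queries += 1
        v = oracle(q)
        if v is None:
            return q, queries
        # Keep only candidates consistent with the answer, i.e. those that
        # have a shortest path from q starting with the edge (q, v).
        w = graph[q][v]
        candidates = {u for u in candidates if w + dist[v][u] == dist[q][u]}
    return next(iter(candidates)), queries

# Usage: a path on 8 nodes with unit weights.
path = {i: {j: 1 for j in (i - 1, i + 1) if 0 <= j < 8} for i in range(8)}
found, used = find_target(path, 7)
```

Because a 1-median leaves at most half of the candidates consistent with any answer, the loop stops after at most $\log_2 n$ queries; the all-pairs distance table is computed up front only to keep the sketch short.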
Prioritizing Populations for Conservation Using Phylogenetic Networks
In the face of inevitable future losses to biodiversity, ranking species by conservation priority seems more than prudent. Setting conservation priorities within species (i.e., at the population level) may be critical as species ranges become fragmented and connectivity declines. However, existing approaches to prioritization (e.g., scoring organisms by their expected genetic contribution) are based on phylogenetic trees, which may be poor representations of differentiation below the species level. In this paper we extend evolutionary isolation indices used in conservation planning from phylogenetic trees to phylogenetic networks. Such networks better represent population differentiation, and our extension allows populations to be ranked in order of their expected contribution to the set. We illustrate the approach using data from two imperiled species: the spotted owl Strix occidentalis in North America and the mountain pygmy-possum Burramys parvus in Australia. Using previously published mitochondrial and microsatellite data, we construct phylogenetic networks and score each population by its relative genetic distinctiveness. In both cases, our phylogenetic networks capture the geographic structure of each species: geographically peripheral populations harbor less-redundant genetic information, increasing their conservation rankings. We note that our approach can be used with all conservation-relevant distances (e.g., those based on whole-genome, ecological, or adaptive variation) and suggest it be added to the assortment of tools available to wildlife managers for allocating effort among threatened populations
Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs
Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. This problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that can answer complex questions directly from textual sources on-the-fly, by computing similarity joins over partial results from different documents. Our method is completely unsupervised, avoiding training-data bottlenecks and being able to cope with rapidly evolving ad hoc topics and formulation style in user questions. QUEST builds a noisy quasi KG with node and edge weights, consisting of dynamically retrieved entity names and relational phrases. It augments this graph with types and semantic alignments, and computes the best answers by an algorithm for Group Steiner Trees. We evaluate QUEST on benchmarks of complex questions, and show that it substantially outperforms state-of-the-art baselines
An O(n^3 √(log n)) algorithm for the optimal stable marriage problem
We give an O(n^3 √(log n)) time algorithm for the optimal stable marriage problem. This algorithm finds a stable marriage that minimizes an objective function defined over all stable marriages in a given problem instance. Irving, Leather, and Gusfield have previously provided a solution to this problem that runs in O(n^4) time [ILG87]. In addition, Feder has claimed that an O(n^3 log n) time algorithm exists [F89]. Our result is an asymptotic improvement over both cases. As part of our solution, we solve a special blue-red matching problem, and illustrate a technique for simulating Hopcroft and Karp's maximum-matching algorithm [HK73] on the transitive closure of a graph.
Term-Specific Eigenvector-Centrality in Multi-Relation Networks
Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem. This article investigates how eigenvector-centrality can be used for approximate matching in multi-relation graphs, that is, graphs where connections of many different types may exist. Based on an extension of the PageRank matrix, eigenvectors representing the distribution of a term after propagating term weights between related data items are computed. The result is an index which takes the document structure into account and can be used with standard document retrieval techniques. As the scheme takes the shape of an index transformation, all necessary calculations are performed during index time.
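To make the notion of eigenvector centrality concrete, here is a minimal power-iteration sketch of standard single-relation PageRank. It is not the article's multi-relation, term-specific extension; the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the adjacency-dict input format (every link target must also appear as a key) are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate the PageRank vector of a directed graph by power iteration.

    links[v] is the list of nodes that v links to.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # start from the uniform distribution
    for _ in range(iterations):
        # Teleportation term: every node receives (1 - damping) / n.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = links[v]
            if out:
                share = damping * rank[v] / len(out)
                for u in out:
                    new_rank[u] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for u in nodes:
                    new_rank[u] += damping * rank[v] / n
        rank = new_rank
    return rank

# Usage: "c" is linked from both "a" and "b", so it ranks highest.
scores = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

A term-specific variant would, roughly speaking, start from or teleport to a term's own weight distribution instead of the uniform one, but the details of that construction are not given in the abstract.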