
22,820 research outputs found

Deterministic and Probabilistic Binary Search in Graphs

Author: Aslam J. A.
Burnashev M. V.
Dhagat A.
Karp R. M.
Laber E. S.
Mozes S.
Nowak R.
Pedrotti A.
Ulam S. M.
Publication date: 28/07/2017

We consider the following natural generalization of Binary Search: in a given undirected, positively weighted graph, one vertex is a target. The algorithm's task is to identify the target by adaptively querying vertices. In response to querying a node $q$, the algorithm learns either that $q$ is the target, or is given an edge out of $q$ that lies on a shortest path from $q$ to the target. We study this problem in a general noisy model in which each query independently receives a correct answer with probability $p > \frac{1}{2}$ (a known constant), and an (adversarial) incorrect one with probability $1-p$. Our main positive result is that when $p = 1$ (i.e., all answers are correct), $\log_2 n$ queries are always sufficient. For general $p$, we give an (almost information-theoretically optimal) algorithm that uses, in expectation, no more than $(1 - \delta)\frac{\log_2 n}{1 - H(p)} + o(\log n) + O(\log^2 (1/\delta))$ queries, and identifies the target correctly with probability at least $1-\delta$. Here, $H(p) = -(p \log p + (1-p) \log(1-p))$ denotes the entropy. The first bound is achieved by the algorithm that iteratively queries a 1-median of the nodes not ruled out yet; the second bound by careful repeated invocations of a multiplicative weights algorithm. Even for $p = 1$, we show several hardness results for the problem of determining whether a target can be found using $K$ queries. Our upper bound of $\log_2 n$ implies a quasipolynomial-time algorithm for undirected connected graphs; we show that this is best-possible under the Strong Exponential Time Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs with non-uniform node querying costs, the problem is PSPACE-complete. For a semi-adaptive version, in which one may query $r$ nodes each in $k$ rounds, we show membership in $\Sigma_{2k-1}$ in the polynomial hierarchy, and hardness for $\Sigma_{2k-5}$.
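The noiseless strategy mentioned in the abstract (repeatedly query a 1-median of the vertices not yet ruled out) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' code: the graph is assumed connected and stored as a dict of dicts with integer weights (so exact distance comparisons are safe), and the names `dijkstra`, `make_oracle`, and `find_target` are hypothetical helpers introduced only for this example.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source; graph maps node -> {neighbor: weight}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def make_oracle(graph, dist, target):
    """Simulated noiseless responder (p = 1); a hypothetical test helper."""
    def oracle(q):
        if q == target:
            return None  # q is the target
        for v, w in graph[q].items():
            # Edge (q, v) lies on a shortest path from q to the target.
            if w + dist[v][target] == dist[q][target]:
                return v
        raise ValueError("graph is assumed to be connected")
    return oracle


def find_target(graph, oracle):
    """Query a 1-median of the remaining candidates until one vertex is left."""
    dist = {v: dijkstra(graph, v) for v in graph}
    candidates = set(graph)
    queries = 0
    while len(candidates) > 1:
        # 1-median: vertex minimizing the total distance to all candidates.
        q = min(graph, key=lambda v: sum(dist[v][u] for u in candidates))
        queries += 1
        v = oracle(q)
        if v is None:
            return q, queries
        w = graph[q][v]
        # Keep only candidates consistent with the answer, i.e. those
        # reachable from q by a shortest path that starts with edge (q, v).
        candidates = {u for u in candidates if dist[q][u] == w + dist[v][u]}
    return next(iter(candidates)), queries
```

On a path graph the sketch degenerates to ordinary binary search: each answer discards the candidates on the wrong side of the queried vertex.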

arXiv.org e-Print Archive

Crossref

Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases

Author: Chen Lei
Wang Guoren
Wang Haixun
Yuan Ye
Publication date: 01/01/2012

Many studies have sought efficient solutions for subgraph similarity search over certain (deterministic) graphs due to its wide application in many fields, including bioinformatics, social network analysis, and Resource Description Framework (RDF) data management. All these works assume that the underlying data are certain. However, in reality, graphs are often noisy and uncertain due to various factors, such as errors in data extraction, inconsistencies in data integration, and privacy-preserving purposes. Therefore, in this paper, we study subgraph similarity search on large probabilistic graph databases. Unlike previous works, which assume that edges in an uncertain graph are independent of each other, we study uncertain graphs in which edges' occurrences are correlated. We formally prove that subgraph similarity search over probabilistic graphs is #P-complete; thus, we employ a filter-and-verify framework to speed up the search. In the filtering phase, we develop tight lower and upper bounds of subgraph similarity probability based on a probabilistic matrix index, PMI. PMI is composed of discriminative subgraph features associated with tight lower and upper bounds of subgraph isomorphism probability. Based on PMI, we can filter out a large number of probabilistic graphs and maximize the pruning capability. During the verification phase, we develop an efficient sampling algorithm to validate the remaining candidates. The efficiency of our proposed solutions has been verified through extensive experiments. Comment: VLDB201
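The idea of verifying candidates by sampling can be illustrated with a naive Monte Carlo estimator over possible worlds. This sketch is not the paper's algorithm: it assumes independent edges (the paper explicitly studies correlated edges), it uses "contains a triangle" as a stand-in for the query pattern, and the names `sample_world`, `contains_triangle`, and `estimate_probability` are hypothetical.

```python
import random


def sample_world(edge_probs, rng):
    """Draw one possible world: keep each edge independently with its probability.

    Assumption: edges are independent, which simplifies the correlated
    model described in the abstract.
    """
    return {edge for edge, p in edge_probs.items() if rng.random() < p}


def contains_triangle(edges):
    """Stand-in query pattern: does the sampled graph contain a triangle?

    Assumes an undirected simple graph with no self-loops.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # An edge (u, v) closes a triangle iff u and v share a neighbor.
    return any(adj[u] & adj[v] for u, v in edges)


def estimate_probability(edge_probs, predicate, samples=10000, seed=0):
    """Fraction of sampled worlds that satisfy the predicate."""
    rng = random.Random(seed)
    hits = sum(predicate(sample_world(edge_probs, rng)) for _ in range(samples))
    return hits / samples
```

For a graph consisting of a single triangle whose edges each exist with probability 0.5, the exact answer under independence is 0.5^3 = 0.125, and the estimate should land close to that value.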

arXiv.org e-Print Archive

CiteSeerX

Hong Kong University of Science and Technology Institutional Repository

Liveness of Randomised Parameterised Systems under Arbitrary Schedulers (Technical Report)

Author: Lin Anthony W.
Ruemmer Philipp
Publication date: 01/01/2016

We consider the problem of verifying liveness for systems with a finite, but unbounded, number of processes, commonly known as parameterised systems. Typical examples of such systems include distributed protocols (e.g. for the dining philosopher problem). Unlike the case of verifying safety, proving liveness is still considered extremely challenging, especially in the presence of randomness in the system. In this paper we consider liveness under arbitrary (including unfair) schedulers, which is often considered a desirable property in the literature of self-stabilising systems. We introduce an automatic method of proving liveness for randomised parameterised systems under arbitrary schedulers. Viewing liveness as a two-player reachability game (between Scheduler and Process), our method is a CEGAR approach that synthesises a progress relation for Process that can be symbolically represented as a finite-state automaton. The method is incremental and exploits both Angluin-style L*-learning and SAT-solvers. Our experiments show that our algorithm is able to prove liveness automatically for well-known randomised distributed protocols, including the Lehmann-Rabin Randomised Dining Philosopher Protocol and randomised self-stabilising protocols (such as the Israeli-Jalfon Protocol). To the best of our knowledge, this is the first fully-automatic method that can prove liveness for randomised protocols. Comment: Full version of CAV'16 paper

arXiv.org e-Print Archive

Crossref

Publikationer från Uppsala Universitet

Oxford University Research Archive

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Integrating and Ranking Uncertain Scientific Data

Author: Detwiler Landon T
Gatterbauer Wolfgang
Louie Brenton
Suciu Dan
Tarczy-Hornoch Peter
Publication date: 01/01/2008

Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve predicting the well-known). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates.

CiteSeerX

Crossref

University of Washington Structural Informatics Group Publications