A Practically Efficient Algorithm for Generating Answers to Keyword Search over Data Graphs
In keyword search over a data graph, an answer is a non-redundant subtree
that contains all the keywords of the query. A naive approach to producing all
the answers by increasing height is to generalize Dijkstra's algorithm to
enumerating all acyclic paths by increasing weight. The idea of freezing is
introduced so that (most) non-shortest paths are generated only if they are
actually needed for producing answers. The resulting algorithm for generating
subtrees, called GTF, is subtle and its proof of correctness is intricate.
Extensive experiments show that GTF outperforms existing systems, even ones
that for efficiency's sake are incomplete (i.e., cannot produce all the
answers). In particular, GTF is scalable and performs well even on large data
graphs and when many answers are needed. Comment: Full version of ICDT'16 paper
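
The naive baseline mentioned in this abstract, generalizing Dijkstra's algorithm to enumerate all acyclic paths by increasing weight, can be sketched as follows. This is only an illustration of that baseline, not of GTF or its freezing technique; the function name `acyclic_paths_by_weight` and the adjacency-list input format are assumptions made for the example, and edge weights are assumed to be non-negative.

```python
import heapq
from itertools import count


def acyclic_paths_by_weight(graph, source):
    """Yield (weight, path) for every acyclic path from source, lightest first.

    graph: dict mapping node -> list of (neighbor, non_negative_weight).
    (Hypothetical input format chosen for this sketch.)
    """
    tie = count()  # tie-breaker so the heap never has to compare paths
    heap = [(0, next(tie), (source,))]
    while heap:
        weight, _, path = heapq.heappop(heap)
        yield weight, list(path)
        for nxt, w in graph.get(path[-1], []):
            if nxt not in path:  # extending to a visited node would create a cycle
                heapq.heappush(heap, (weight + w, next(tie), path + (nxt,)))


if __name__ == "__main__":
    demo = {"a": [("b", 1), ("c", 4)], "b": [("c", 1)], "c": []}
    for weight, path in acyclic_paths_by_weight(demo, "a"):
        print(weight, path)
```

Unlike plain Dijkstra, this keeps every path rather than one per node, so the queue can grow exponentially; that cost is what motivates generating non-shortest paths only when they are needed.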
Distributed Maximum Matching in Bounded Degree Graphs
We present deterministic distributed algorithms for computing approximate
maximum cardinality matchings and approximate maximum weight matchings. Our
algorithm for the unweighted case computes a matching whose size is at least
$(1-\varepsilon)$ times the optimal in
$\Delta^{O(1/\varepsilon)} + O\left(\frac{1}{\varepsilon^2}\right) \cdot \log^*(n)$
rounds, where $n$ is the number of vertices in the graph and $\Delta$ is the
maximum degree. Our algorithm for the edge-weighted case computes a matching
whose weight is at least $(1-\varepsilon)$ times the optimal in
$\log(\min\{1/w_{\min}, n/\varepsilon\})^{O(1/\varepsilon)} \cdot (\Delta^{O(1/\varepsilon)} + \log^*(n))$
rounds for edge weights in $[w_{\min}, 1]$.
The best previous algorithms for both the unweighted case and the weighted
case are by Lotker, Patt-Shamir, and Pettie (SPAA 2008). For the unweighted
case they give a randomized $(1-\varepsilon)$-approximation algorithm that runs in
$O(\log(n)/\varepsilon^3)$ rounds. For the weighted case they give a randomized
$(1/2-\varepsilon)$-approximation algorithm that runs in
$O(\log(\varepsilon^{-1}) \cdot \log(n))$ rounds. Hence, our results improve on
the previous ones when the parameters $\Delta$, $\varepsilon$ and $w_{\min}$ are
constants (where we reduce the number of rounds from $O(\log(n))$ to
$O(\log^*(n))$), and more generally when $\Delta$, $1/\varepsilon$ and
$1/w_{\min}$ are sufficiently slowly increasing functions of $n$. Moreover, our
algorithms are deterministic rather than randomized. Comment: arXiv admin note: substantial text overlap with arXiv:1402.379
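
The round bounds above depend on the graph size only through the iterated logarithm log^*(n). The short sketch below computes it, to give a sense of how slowly that term grows compared with log(n). It is not part of the matching algorithms; the helper name `log_star` and the choice of base 2 are assumptions made for the example.

```python
import math


def log_star(n):
    """Iterated base-2 logarithm: how many times log2 is applied until the value is <= 1."""
    steps = 0
    x = n  # kept as given so that very large Python ints work with math.log2
    while x > 1:
        x = math.log2(x)
        steps += 1
    return steps


if __name__ == "__main__":
    for n in (2, 16, 65536, 2 ** 65536):
        print(n.bit_length(), "bits ->", log_star(n))
```

Under this definition, log_star stays at 5 or below for every n up to 2^65536, which is why a bound with constant degree and constant approximation parameter behaves almost like a constant number of rounds in practice.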
TopCom: Index for Shortest Distance Query in Directed Graph
Finding the shortest distance between two vertices in a graph is an important
problem due to its numerous applications in diverse domains, including
geo-spatial databases, social network analysis, and information retrieval.
Classical algorithms (such as Dijkstra's) solve this problem in polynomial time,
but these algorithms cannot provide real-time responses for a large number of
bursty queries on a large graph. So, indexing-based solutions that pre-process
the graph to efficiently answer (exactly or approximately) a large number
of distance queries in real time are becoming increasingly popular. Existing
solutions have varying performance in terms of index size, index building time,
query time, and accuracy. In this work, we propose TopCom, a novel
indexing-based solution for exactly answering distance queries. Our experiments
with two of the existing state-of-the-art methods (IS-Label and TreeMap) show
the superiority of TopCom over these two methods in terms of scalability and
query time. In addition, TopCom's indexing exploits the DAG (directed acyclic
graph) structure in the graph, which makes it significantly faster than the
existing methods if the SCCs (strongly connected components) of the input graph
are relatively small.
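
The abstract refers to the DAG structure obtained from the SCCs of a directed graph. A minimal sketch of that standard preprocessing step, collapsing each SCC into one node with Kosaraju's two-pass algorithm, is given below. It is not TopCom's index construction, which the abstract does not detail; the function name `condense` and the edge-list input are assumptions made for the example.

```python
from collections import defaultdict


def condense(edges):
    """Collapse SCCs of a directed graph given as (u, v) pairs.

    Returns (component_of, dag_edges): a representative node per vertex and
    the set of edges between distinct components (which form a DAG).
    """
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)
        nodes.update((u, v))

    # Pass 1: record vertices in order of DFS completion (iterative DFS).
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all successors explored: the vertex is finished
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finish time.
    component_of = {}
    for root in reversed(order):
        if root in component_of:
            continue
        component_of[root] = root
        stack = [root]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in component_of:
                    component_of[prev] = root
                    stack.append(prev)

    dag_edges = {(component_of[u], component_of[v])
                 for u, v in edges if component_of[u] != component_of[v]}
    return component_of, dag_edges
```

For example, `condense([(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 4)])` groups vertices 1, 2, 3 into one component and 4, 5 into another, leaving a single edge between the two components.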
Sequences of regressions and their independences
Ordered sequences of univariate or multivariate regressions provide
statistical models for analysing data from randomized, possibly sequential
interventions, from cohort or multi-wave panel studies, but also from
cross-sectional or retrospective studies. Conditional independences are
captured by what we name regression graphs, provided the generated distribution
shares some properties with a joint Gaussian distribution. Regression graphs
extend purely directed, acyclic graphs by two types of undirected graph, one
type for components of joint responses and the other for components of the
context vector variable. We review the special features and the history of
regression graphs, derive criteria to read all implied independences of a
regression graph and prove criteria for Markov equivalence, that is, for judging
whether two different graphs imply the same set of independence statements.
Knowledge of Markov equivalence provides alternative interpretations of a given
sequence of regressions, is essential for machine learning strategies and
permits the use of the simple graphical criteria of regression graphs on graphs for
which the corresponding criteria are in general more complex. Under the known
conditions that a Markov equivalent directed acyclic graph exists for any given
regression graph, we give a polynomial-time algorithm to find one such graph. Comment: 43 pages with 17 figures. The manuscript is to appear as an invited
discussion paper in the journal TES
Fault-Tolerant, but Paradoxical Path-Finding in Physical and Conceptual Systems
We report our initial investigations into reliability and path-finding based
models and propose future areas of interest. Inspired by broken sidewalks
during on-campus construction projects, we develop two models for navigating
this "unreliable network." These are based on a concept of "accumulating risk"
backward from the destination, and both operate on directed acyclic graphs with
a probability of failure associated with each edge. The first model serves as an
introduction and has faults that are addressed by the second, more conservative model.
Next, we show a paradox when these models are used to construct polynomials on
conceptual networks, such as design processes and software development life
cycles. When the risk of a network increases uniformly, the most reliable path
changes from wider and longer to shorter and narrower. If we let professional
inexperience, such as that of entry-level cooks and software developers, represent
the probability of edge failure, does this change in path imply that the novice
should follow instructions with fewer "back-up" plans, yet those with
alternative routes should be followed by the expert? Comment: 8 page
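
To make the setting concrete, the sketch below finds the path with the highest success probability in a directed acyclic graph whose edges each carry a failure probability, resolving values from the destination backward through memoized recursion. It is a simplified stand-in and not either of the authors' two models, which the abstract does not specify; the function name `most_reliable_path`, the input format, and the assumption that edges fail independently are all choices made for the example.

```python
from functools import lru_cache


def most_reliable_path(edges, source, target):
    """Return (success_probability, path) for the most reliable route, or None.

    edges: dict node -> list of (successor, failure_probability) in a DAG.
    Edge failures are assumed independent (an assumption of this sketch).
    """
    @lru_cache(maxsize=None)
    def best(node):
        if node == target:
            return 1.0, (target,)
        options = []
        for nxt, p_fail in edges.get(node, ()):
            tail = best(nxt)  # value already accumulated from the target side
            if tail is not None:
                options.append(((1.0 - p_fail) * tail[0], (node,) + tail[1]))
        return max(options, key=lambda o: o[0]) if options else None

    result = best(source)
    return None if result is None else (result[0], list(result[1]))


if __name__ == "__main__":
    low_risk = {"s": [("a", 0.1), ("t", 0.3)], "a": [("t", 0.1)], "t": []}
    high_risk = {"s": [("a", 0.4), ("t", 0.6)], "a": [("t", 0.4)], "t": []}
    print(most_reliable_path(low_risk, "s", "t"))
    print(most_reliable_path(high_risk, "s", "t"))
```

In this toy input, raising every failure probability by 0.3 moves the best route from the two-edge path to the direct edge, which shows in a much simpler setting how a uniform increase in risk can favor shorter paths.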