119 research outputs found
A Practically Efficient Algorithm for Generating Answers to Keyword Search over Data Graphs
In keyword search over a data graph, an answer is a non-redundant subtree
that contains all the keywords of the query. A naive approach to producing all
the answers by increasing height is to generalize Dijkstra's algorithm to
enumerating all acyclic paths by increasing weight. The idea of freezing is
introduced so that (most) non-shortest paths are generated only if they are
actually needed for producing answers. The resulting algorithm for generating
subtrees, called GTF, is subtle and its proof of correctness is intricate.
Extensive experiments show that GTF outperforms existing systems, even ones
that for efficiency's sake are incomplete (i.e., cannot produce all the
answers). In particular, GTF is scalable and performs well even on large data
graphs and when many answers are needed. Comment: Full version of ICDT'16 paper
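The naive baseline mentioned in this abstract, generalizing Dijkstra's algorithm to enumerate all acyclic paths by increasing weight, can be sketched as follows. This is only an illustration of that baseline, not GTF itself and not the authors' code; the function name `acyclic_paths_by_weight`, the adjacency-list layout, and the sample graph are assumptions made for the example, and edge weights are assumed to be non-negative.

```python
import heapq
from itertools import count


def acyclic_paths_by_weight(graph, source):
    """Yield (weight, path) for every acyclic path from source, lightest first.

    graph maps a node to a list of (neighbor, non_negative_weight) pairs.
    """
    tie = count()  # tie-breaker so the heap never has to compare paths
    heap = [(0, next(tie), (source,))]
    while heap:
        weight, _, path = heapq.heappop(heap)
        yield weight, path
        for neighbor, edge_weight in graph.get(path[-1], []):
            if neighbor not in path:  # extend only if the path stays acyclic
                heapq.heappush(
                    heap, (weight + edge_weight, next(tie), path + (neighbor,))
                )


# Hypothetical sample graph, chosen only for illustration.
graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 1), ("d", 5)],
    "c": [("d", 1)],
    "d": [],
}
paths = list(acyclic_paths_by_weight(graph, "a"))
```

Unlike plain Dijkstra, this sketch never discards a non-shortest path, which is the cost that the freezing idea described above is meant to avoid.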
Automatic Termination Analysis of Programs Containing Arithmetic Predicates
For logic programs with arithmetic predicates, showing termination is not
easy, since the usual order for the integers is not well-founded. A new method,
easily incorporated in the TermiLog system for automatic termination analysis,
is presented for showing termination in this case.
The method consists of the following steps: First, a finite abstract domain
for representing the range of integers is deduced automatically. Based on this
abstraction, abstract interpretation is applied to the program. The result is a
finite number of atoms abstracting answers to queries which are used to extend
the technique of query-mapping pairs. For each query-mapping pair that is
potentially non-terminating, a bounded (integer-valued) termination function is
guessed. If traversing the pair decreases the value of the termination
function, then termination is established. Simple functions often suffice for
each query-mapping pair, and that gives our approach an edge over the classical
approach of using a single termination function for all loops, which must
inevitably be more complicated and harder to guess automatically. It is worth
noting that the termination of McCarthy's 91 function can be shown
automatically using our method.
In summary, the proposed approach is based on combining a finite abstraction
of the integers with the technique of the query-mapping pairs, and is
essentially capable of dividing a termination proof into several cases, such
that a simple termination function suffices for each case. Consequently, the
whole process of proving termination can be done automatically in the framework
of TermiLog and similar systems. Comment: Appeared also in Electronic Notes in Computer Science vol. 3
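To make the notion of a bounded, integer-valued termination function concrete, here is a minimal sketch for a single hypothetical loop, written in Prolog as `count(X) :- X < 100, X1 is X + 1, count(X1).` The clause, the names `guard`, `step`, `measure`, and the sampled check are all assumptions for illustration. The check only tests sample values; it is neither a proof nor the TermiLog procedure.

```python
def guard(x):
    """Hypothetical clause test: X < 100."""
    return x < 100


def step(x):
    """Hypothetical recursive-call argument: X1 is X + 1."""
    return x + 1


def measure(x):
    """Guessed termination function; positive whenever the guard holds."""
    return 100 - x


def decreases_on(samples):
    """Sampled sanity check: the measure is bounded below and strictly decreases."""
    return all(
        measure(x) > 0 and measure(step(x)) < measure(x)
        for x in samples
        if guard(x)
    )


def run(x):
    """Execute the loop and count the recursive calls actually made."""
    steps = 0
    while guard(x):
        x = step(x)
        steps += 1
    return steps


ok = decreases_on(range(-50, 150))
calls_from_95 = run(95)
```

The argument grows without bound in the usual integer order, yet the guard restricts the loop to a range on which `100 - X` is a positive integer that shrinks on every call.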
A General Framework for Automatic Termination Analysis of Logic Programs
This paper describes a general framework for automatic termination analysis
of logic programs, where we understand by "termination" the finiteness of
the LD-tree constructed for the program and a given query. A general property
of mappings from a certain subset of the branches of an infinite LD-tree into a
finite set is proved. From this result several termination theorems are
derived, by using different finite sets. The first two are formulated for the
predicate dependency and atom dependency graphs. Then a general result for the
case of the query-mapping pairs relevant to a program is proved (cf.
\cite{Sagiv,Lindenstrauss:Sagiv}). The correctness of the TermiLog system
described in \cite{Lindenstrauss:Sagiv:Serebrenik} follows from it. In this
system it is not possible to prove termination for programs involving
arithmetic predicates, since the usual order for the integers is not
well-founded. A new method, which can be easily incorporated in TermiLog
or similar systems, is presented, which makes it possible to prove termination
for programs involving arithmetic predicates. It is based on combining a finite
abstraction of the integers with the technique of the query-mapping pairs, and
is essentially capable of dividing a termination proof into several cases, such
that a simple termination function suffices for each case. Finally, several
possible extensions are outlined.
An incremental algorithm for computing ranked full disjunctions
The full disjunction is a variation of the join operator that maximally combines tuples from connected relations, while preserving all information in the relations. The full disjunction can be seen as a natural extension of the binary outerjoin operator to an arbitrary number of relations and is a useful operator for information integration. This paper presents the algorithm IncrementalFD for computing the full disjunction of a set of relations. IncrementalFD improves upon previous algorithms for computing the full disjunction in four ways. First, it has a lower total runtime when computing the full result and a lower runtime when computing only k tuples of the result, for any constant k. Second, for a natural class of ranking functions, IncrementalFD can be adapted to return tuples in ranking order. Third, a variation of IncrementalFD can be used to return approximate full disjunctions (which contain maximal approximately join consistent tuples). Fourth, IncrementalFD can be adapted to have a block-based execution, instead of a tuple-based execution.
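As a rough illustration of what a full disjunction contains, the brute-force sketch below enumerates every connected, join-consistent set of tuples (at most one per relation), keeps the maximal ones, and pads missing attributes with `None`. It is exponential and is not IncrementalFD; the function names, the dict-based tuple layout, the sample relations, and the simplification that input tuples carry no nulls are all assumptions for the example.

```python
from itertools import combinations, product


def consistent(t1, t2):
    """Two tuples are join consistent if they agree on every shared attribute."""
    return all(t1[k] == t2[k] for k in t1.keys() & t2.keys())


def connected(tuples):
    """Tuples are linked when they share an attribute; test graph connectivity."""
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in range(len(tuples)):
            if j not in seen and tuples[i].keys() & tuples[j].keys():
                seen.add(j)
                stack.append(j)
    return len(seen) == len(tuples)


def full_disjunction(relations):
    """Return (attribute list, rows) of the full disjunction by brute force."""
    names = sorted(relations)
    candidates = []
    for size in range(1, len(names) + 1):
        for chosen in combinations(names, size):
            for rows in product(*(range(len(relations[n])) for n in chosen)):
                members = frozenset(zip(chosen, rows))
                tuples = [relations[n][i] for n, i in members]
                pairwise_ok = all(consistent(a, b) for a, b in combinations(tuples, 2))
                if pairwise_ok and connected(tuples):
                    candidates.append(members)
    # Keep only sets that are not properly contained in another candidate.
    maximal = [m for m in candidates if not any(m < other for other in candidates)]
    attrs = sorted({k for rel in relations.values() for t in rel for k in t})
    result = []
    for members in maximal:
        merged = {}
        for n, i in members:
            merged.update(relations[n][i])
        result.append(tuple(merged.get(a) for a in attrs))  # None pads missing values
    return attrs, result


# Hypothetical sample relations, chosen only for illustration.
relations = {
    "R": [{"a": 1, "b": 2}, {"a": 3, "b": 4}],
    "S": [{"b": 2, "c": 5}],
    "T": [{"c": 5, "d": 6}, {"c": 7, "d": 8}],
}
attrs, rows = full_disjunction(relations)
```

In this sample, the first `R` tuple joins through `S` to the first `T` tuple, while the second `R` tuple and the second `T` tuple survive on their own, padded with nulls, so no input information is lost.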
Validating constraints with partial information: research overview
We are interested in the problem of validating the consistency of integrity constraints when
data is modified. In particular, we consider how constraints can be checked with only "partial
information". Partial information may include: (1) the constraint specifications only, (2) the
constraint specifications and the modified data, or (3) the constraint specifications, the modified
data, and portions of the existing data. Methods for constraint checking with partial information
can be much more efficient than traditional constraint checking methods (e.g., because work is
done at compile time, or because less data is accessed). Partial information methods also
enable constraint checking in scenarios where traditional constraint checking methods fail (e.g.,
in distributed environments where not all data is accessible). We explain how existing methods
and results for query containment and for independence can be applied to problems (1) and (2)
above, and we give an overview of our research into problem (3).
Representing and Integrating Multiple Calendars
Whenever humans refer to time, they do so with respect to a
specific underlying calendar. So do most software applications.
However, most theoretical
models of time refer to time with respect to the integers (or reals).
Thus, there is a mismatch between the theory and the application of
temporal reasoning.
To lessen this gap, we propose a formal, theoretical definition of a
calendar and show how one may specify dates, time points, time
intervals, as well as sets of time points, in terms of constraints
with respect to a given calendar. Furthermore, when multiple
applications using different calendars wish to work together, there is
a need to integrate those calendars together into a single, unified
calendar. We show how this can be done.
(Also cross-referenced as UMIACS-TR-97-12)
Semantic Query Optimization in Datalog Programs (Extended Abstract)
Alon Y. Levy (AT&T Bell Laboratories), Yehoshua Sagiv (Hebrew University, Jerusalem). Semantic query optimization refers to the process of using integrity constraints (ic's) in order to optimize the evaluation of queries. The process is well understood in the case of unions of select-project-join queries (i.e., nonrecursive datalog). For arbitrary datalog programs, however, the issue has largely remained an unsolved problem. This paper studies this problem and shows when semantic query optimization can be completely done in recursive rules provided that order constraints and negated EDB subgoals appear only in the recursive rules, but not in the ic's. If either order constraints or negated EDB subgoals are introduced in ic's, then the problem of semantic query optimization becomes undecidable. Since semantic query optimization is closely related to the containment problem of a datalog program in a union of conjunctive queries, our res..