Simulation Subsumption or Déjà Vu on the Web
Simulation unification is a special kind of unification adapted to retrieving semi-structured data on the Web. This article introduces simulation subsumption, or containment, that is, query subsumption under simulation unification. Simulation subsumption is crucial for query optimization in general, for optimizing pattern-based search engines in particular, and for the termination of recursive rule-based web languages such as the XML and RDF query language Xcerpt. This paper first motivates and formalizes simulation subsumption. Then, it establishes decidability of simulation subsumption for advanced query patterns featuring descendant constructs, regular expressions, negative subterms (or subterm exclusions), and multiple variable occurrences. Finally, we show that subsumption between two query terms can be decided in O(n!n), where n is the sum of the sizes of both query terms.
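As a rough intuition for the simulation relation underlying this work, a minimal rooted simulation between unordered labeled trees can be sketched as follows. This toy deliberately ignores descendant constructs, regular expressions, negation, and variables, all of which the paper's decidability result actually covers; the tree encoding is an invented convenience.

```python
# Trees are (label, [children]) tuples. A query tree simulates into a
# data tree if the root labels agree and every query child simulates
# into some data child (children are unordered, matching need not be 1-1).

def simulates(query, data):
    """True iff `query` maps into `data` root-to-root."""
    qlabel, qkids = query
    dlabel, dkids = data
    if qlabel != dlabel:
        return False
    return all(any(simulates(qc, dc) for dc in dkids) for qc in qkids)

q = ("book", [("title", [])])
t = ("book", [("title", []), ("author", [])])
print(simulates(q, t))   # the less specific pattern matches -> True
print(simulates(t, q))   # but not vice versa -> False
```

Subsumption then asks a harder, second-order question: does every data tree simulated by one query term also get simulated by the other? That is the relation the paper shows decidable.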
Triggered Clause Pushing for IC3
We propose an improvement of the famous IC3 algorithm for model checking
safety properties of finite state systems. We collect models computed by the
SAT-solver during the clause propagation phase of the algorithm and use them as
witnesses for why the respective clauses could not be pushed forward. It only
makes sense to recheck a particular clause for pushing when its witnessing
model falsifies a newly added clause. Since this trigger test is both
computationally cheap and sufficiently precise, we can afford to keep clauses
pushed as far as possible at all times. Experiments indicate that this strategy
considerably improves IC3's performance.
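The trigger test described above can be sketched in a few lines. The data structures and names below are illustrative assumptions, not taken from any actual IC3 implementation: literals are nonzero integers, a clause is a frozenset of literals, and a model is a set of literals.

```python
def falsifies(model, clause):
    """A model falsifies a clause iff it satisfies none of its literals."""
    return not any(lit in model for lit in clause)

class PushQueue:
    """Tracks, per clause, the witness model that blocked its last push."""
    def __init__(self):
        self.witness = {}   # clause -> SAT model returned when pushing failed

    def record_failed_push(self, clause, model):
        self.witness[clause] = model

    def triggered(self, new_clause):
        """Clauses worth rechecking: their witness falsifies new_clause."""
        return [c for c, m in self.witness.items()
                if falsifies(m, new_clause)]

q = PushQueue()
# Clause (x1 or not x2) could not be pushed; the solver returned this model.
q.record_failed_push(frozenset({1, -2}), model={-1, 2, 3})
# A newly learned clause (not x2 or not x3) is falsified by that witness,
# so the stored clause becomes a candidate for re-pushing.
print(q.triggered(frozenset({-2, -3})))
```

The point of the test is exactly its cheapness: one membership scan per stored witness, rather than a new SAT call per clause.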
Concept-based Interactive Query Expansion Support Tool (CIQUEST)
This report describes a three-year project (2000-03) undertaken in the Information Studies
Department at The University of Sheffield and funded by Resource, The Council for
Museums, Archives and Libraries. The overall aim of the research was to provide user
support for query formulation and reformulation in searching large-scale textual resources
including those of the World Wide Web. More specifically the objectives were: to investigate
and evaluate methods for the automatic generation and organisation of concepts derived from
retrieved document sets, based on statistical methods for term weighting; and to conduct
user-based evaluations on the understanding, presentation and retrieval effectiveness of
concept structures in selecting candidate terms for interactive query expansion.
The TREC test collection formed the basis for the seven evaluative experiments conducted in
the course of the project. These formed four distinct phases in the project plan. In the first
phase, a series of experiments was conducted to investigate further techniques for concept
derivation and hierarchical organisation and structure. The second phase was concerned with
user-based validation of the concept structures. Results of phases 1 and 2 informed the
design of the test system, and the user interface was developed in phase 3. The final phase
entailed a user-based summative evaluation of the CiQuest system.
The main findings demonstrate that concept hierarchies can effectively be generated from
sets of retrieved documents and displayed to searchers in a meaningful way. The approach
provides the searcher with an overview of the contents of the retrieved documents, which in
turn facilitates the viewing of documents and selection of the most relevant ones. Concept
hierarchies are a good source of terms for query expansion and can improve precision. The
extraction of descriptive phrases as an alternative source of terms was also effective. With
respect to presentation, cascading menus were easy to browse for selecting terms and for
viewing documents. In conclusion, the project dissemination programme and future work are
outlined.
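The statistical core of the concept-derivation step, weighting terms from a retrieved document set to propose expansion candidates, can be sketched with plain tf-idf. The project's actual weighting scheme may well differ; the function name and toy corpus below are invented for illustration.

```python
from collections import Counter
import math

def expansion_candidates(docs, k=3):
    """docs: list of token lists (one per retrieved document).
    Returns the k highest tf-idf terms as expansion candidates."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))   # document frequency
    tf = Counter(t for d in docs for t in d)        # collection frequency
    # terms occurring in every document carry no discriminating weight
    scored = {t: tf[t] * math.log(n / df[t]) for t in tf if df[t] < n}
    return [t for t, _ in sorted(scored.items(),
                                 key=lambda kv: kv[1], reverse=True)[:k]]

docs = [["museum", "archive", "digital"],
        ["archive", "library", "digital", "digital"],
        ["museum", "library", "catalogue"]]
print(expansion_candidates(docs))
```

In the CIQUEST setting, terms like these would then be organised into concept hierarchies before being shown to the searcher, rather than offered as a flat list.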
The PITA System: Tabling and Answer Subsumption for Reasoning under Uncertainty
Many real world domains require the representation of a measure of
uncertainty. The most common such representation is probability, and the
combination of probability with logic programs has given rise to the field of
Probabilistic Logic Programming (PLP), leading to languages such as the
Independent Choice Logic, Logic Programs with Annotated Disjunctions (LPADs),
Problog, PRISM and others. These languages share a similar distribution
semantics, and methods have been devised to translate programs between these
languages. The complexity of computing the probability of queries to these
general PLP programs is very high due to the need to combine the probabilities
of explanations that may not be exclusive. As one alternative, the PRISM system
reduces the complexity of query answering by restricting the form of programs
it can evaluate. As an entirely different alternative, Possibilistic Logic
Programs adopt a simpler metric of uncertainty than probability. Each of these
approaches -- general PLP, restricted PLP, and Possibilistic Logic Programming
-- can be useful in different domains depending on the form of uncertainty to
be represented, on the form of programs needed to model problems, and on the
scale of the problems to be solved. In this paper, we show how the PITA system,
which originally supported the general PLP language of LPADs, can also
efficiently support restricted PLP and Possibilistic Logic Programs. PITA
relies on tabling with answer subsumption and consists of a transformation
along with an API for library functions that interface with answer subsumption.
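The mechanism the abstract leans on, tabling with answer subsumption, can be caricatured as follows: a table keeps one value per answer and joins any newly derived value into the stored one under a lattice order, so subsumed answers never extend the table. The class and names below are invented for illustration (PITA itself is implemented in Prolog, not Python).

```python
class AnswerTable:
    def __init__(self, join):
        self.join = join    # lattice join, e.g. max for possibilistic values
        self.table = {}     # answer -> current tabled value

    def add(self, answer, value):
        """Join `value` into the table. Returns True iff the tabled value
        changed (so evaluation must continue); subsumed answers return False."""
        old = self.table.get(answer)
        new = value if old is None else self.join(old, value)
        if new == old:
            return False
        self.table[answer] = new
        return True

# Possibilistic Logic Programs combine alternative derivations with max.
poss = AnswerTable(join=max)
print(poss.add("path(a,b)", 0.7))   # True: new answer
print(poss.add("path(a,b)", 0.5))   # False: subsumed by 0.7
print(poss.add("path(a,b)", 0.9))   # True: strictly better derivation
```

Swapping the join operation is what lets one engine serve several uncertainty formalisms: max for possibilistic values, or a probabilistic disjunction over explanation representations for general PLP.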
A Database Interface for Complex Objects
We describe a formal design for a logical query language using psi-terms as data structures to interact effectively and efficiently with a relational database. The structure of psi-terms provides an adequate representation for so-called complex objects. They generalize conventional terms used in logic programming: they are typed attributed structures, ordered by a subtype ordering. Unification of psi-terms is an effective means for integrating multiple inheritance and partial information into a deduction process. We define a compact database representation for psi-terms, representing part of the subtyping relation in the database as well. We describe a retrieval algorithm based on an abstract interpretation of the psi-term unification process and prove its formal correctness. This algorithm is efficient in that it incrementally retrieves only additional facts that are actually needed by a query, and never retrieves the same fact twice.
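The core of psi-term unification, taking a greatest lower bound of the two types and merging features recursively, can be sketched as below. The subtype lattice and feature names are invented examples, and real psi-terms additionally support coreference (shared and cyclic substructures), which this toy omits.

```python
SUBTYPE = {                      # child -> direct supertypes (toy lattice)
    "student": {"person"},
    "employee": {"person"},
    "workstudy": {"student", "employee"},
}

def is_subtype(a, b):
    if a == b:
        return True
    return any(is_subtype(p, b) for p in SUBTYPE.get(a, ()))

def glb(a, b):
    """A greatest lower bound of two types, or None on failure.
    Toy assumption: at most one common lower bound exists."""
    if is_subtype(a, b):
        return a
    if is_subtype(b, a):
        return b
    lower = [t for t in SUBTYPE if is_subtype(t, a) and is_subtype(t, b)]
    return lower[0] if lower else None

def unify(t1, t2):
    """Unify two psi-terms of shape (type, {feature: psi-term})."""
    ty = glb(t1[0], t2[0])
    if ty is None:
        return None
    feats = dict(t1[1])
    for f, sub in t2[1].items():
        feats[f] = unify(feats[f], sub) if f in feats else sub
        if feats[f] is None:
            return None
    return (ty, feats)

a = ("student",  {"advisor": ("person", {})})
b = ("employee", {"dept": ("person", {})})
print(unify(a, b))   # types meet at "workstudy"; features are merged
```

The multiple-inheritance effect is visible in the example: neither input type subsumes the other, yet unification succeeds by descending to their common subtype.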
LiteMat: a scalable, cost-efficient inference encoding scheme for large RDF graphs
The number of linked data sources and the size of the linked open data graph
keep growing every day. As a consequence, semantic RDF services are more and
more confronted with various "big data" problems. Query processing in the
presence of inferences is one of them. For instance, to complete the answer set of
SPARQL queries, RDF database systems evaluate semantic RDFS relationships
(subPropertyOf, subClassOf) through time-consuming query rewriting algorithms
or space-consuming data materialization solutions. To reduce the memory
footprint and ease the exchange of large datasets, these systems generally
apply a dictionary approach for compressing triple data sizes by replacing
resource identifiers (IRIs), blank nodes and literals with integer values. In
this article, we present a structured resource identification scheme using a
clever encoding of concepts and property hierarchies for efficiently evaluating
the main common RDFS entailment rules while minimizing triple materialization
and query rewriting. We will show how this encoding can be computed by a
scalable parallel algorithm and directly be implemented over the Apache Spark
framework. The efficiency of our encoding scheme is emphasized by an evaluation
conducted over both synthetic and real-world datasets.
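The kind of hierarchy-aware identifier scheme the abstract alludes to can be sketched as a prefix encoding: each class ID extends its superclass's bit string, so an rdfs:subClassOf check becomes a prefix test, with no query rewriting and no materialized triples. The layout below is a toy assumption for a tree-shaped hierarchy, not LiteMat's actual encoding.

```python
HIERARCHY = {                    # class -> direct subclasses (toy ontology)
    "Agent": ["Person", "Organization"],
    "Person": ["Student"],
    "Organization": [],
    "Student": [],
}

def encode(root):
    """Assign each class a bit-string ID extending its parent's ID,
    using just enough local bits to number the parent's subclasses."""
    ids = {root: "1"}
    def visit(cls):
        width = max(1, len(HIERARCHY[cls]).bit_length())
        for i, sub in enumerate(HIERARCHY[cls]):
            ids[sub] = ids[cls] + format(i, f"0{width}b")
            visit(sub)
    visit(root)
    return ids

ids = encode("Agent")

def is_subclass_of(c, d):
    """subClassOf holds (reflexively) iff d's encoding prefixes c's."""
    return ids[c].startswith(ids[d])

print(is_subclass_of("Student", "Agent"))          # True
print(is_subclass_of("Student", "Organization"))   # False
```

With such IDs a SPARQL engine can answer a `?x rdf:type :Agent` query by a single range or prefix scan over encoded integers, which is the cost saving over rewriting the query into a union over all subclasses.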