Reasoning about Explanations for Negative Query Answers in DL-Lite
In order to meet usability requirements, most logic-based applications
provide explanation facilities for reasoning services. This holds also for
Description Logics, where research has focused on the explanation of both TBox
reasoning and, more recently, query answering. Besides explaining the presence
of a tuple in a query answer, it is important to explain also why a given tuple
is missing. We address the latter problem for instance and conjunctive query
answering over DL-Lite ontologies by adopting abductive reasoning; that is, we
look for additions to the ABox that force a given tuple to be in the result. As
reasoning tasks we consider existence and recognition of an explanation, and
relevance and necessity of a given assertion for an explanation. We
characterize the computational complexity of these problems for arbitrary,
subset-minimal, and cardinality-minimal explanations.
Explain3D: Explaining Disagreements in Disjoint Datasets
Data plays an important role in applications, analytic processes, and many
aspects of human activity. As data grows in size and complexity, we are met
with an imperative need for tools that promote understanding and explanations
over data-related operations. Data management research on explanations has
focused on the assumption that data resides in a single dataset, under one
common schema. But the reality of today's data is that it is frequently
un-integrated, coming from different sources with different schemas. When
different datasets provide different answers to semantically similar questions,
understanding the reasons for the discrepancies is challenging and cannot be
handled by the existing single-dataset solutions.
In this paper, we propose Explain3D, a framework for explaining the
disagreements across disjoint datasets (3D). Explain3D focuses on identifying
the reasons for the differences in the results of two semantically similar
queries operating on two datasets with potentially different schemas. Our
framework leverages the queries to perform a semantic mapping across the
relevant parts of their provenance; discrepancies in this mapping point to
causes of the queries' differences. Exploiting the queries gives Explain3D an
edge over traditional schema matching and record linkage techniques, which are
query-agnostic. Our work makes the following contributions: (1) We formalize
the problem of deriving optimal explanations for the differences of the results
of semantically similar queries over disjoint datasets. (2) We design a 3-stage
framework for solving the optimal explanation problem. (3) We develop a
smart-partitioning optimizer that improves the efficiency of the framework by
orders of magnitude. (4) We experiment with real-world and synthetic data to
demonstrate that Explain3D can derive precise explanations efficiently.
Explaining Aggregates for Exploratory Analytics
Analysts wishing to explore multivariate data spaces
typically pose queries involving selection operators, i.e., range
or radius queries, which define data subspaces of possible
interest and then use aggregation functions, the results of which
determine their exploratory analytics interests. However, such
aggregate query (AQ) results are simple scalars and as such,
convey limited information about the queried subspaces for
exploratory analysis. We address this shortcoming, aiding analysts
to explore and understand data subspaces by contributing a novel
explanation mechanism coined XAXA: eXplaining Aggregates for
eXploratory Analytics. XAXA’s novel AQ explanations are represented
using functions obtained by a three-fold joint optimization
problem. Explanations assume the form of a set of parametric
piecewise-linear functions acquired through a statistical learning
model. A key feature of the proposed solution is that model
training is performed by only monitoring AQs and their answers
on-line. In XAXA, explanations for future AQs can be computed
without any database (DB) access and can be used to further
explore the queried data subspaces, without issuing any more
queries to the DB. We evaluate the explanation accuracy and
efficiency of XAXA through theoretically grounded metrics over
real-world and synthetic datasets and query workloads.
SWAN: An expert system with natural language interface for tactical air capability assessment
SWAN is an expert system and natural language interface for assessing the war fighting capability of Air Force units in Europe. The expert system is an object-oriented, knowledge-based simulation with an alternate worlds facility for performing what-if excursions. Responses from the system take the form of generated text, tables, or graphs. The natural language interface is an expert system in its own right, with a knowledge base and rules which understand how to access external databases, models, or expert systems. The distinguishing feature of the Air Force expert system is its use of meta-knowledge to generate explanations in the frame- and procedure-based environment.
Finding Top-k Dominance on Incomplete Big Data Using Map-Reduce Framework
Incomplete data is one major kind of multi-dimensional dataset that has randomly distributed missing nodes in its dimensions. It is very difficult to retrieve information from this type of dataset when it becomes huge, and finding top-k dominant (TKD) values in it is a challenging procedure. Some algorithms exist to enhance this process, but most are efficient only when dealing with small incomplete datasets. One of the algorithms that makes the application of the TKD query possible is the Bitmap Index Guided (BIG) algorithm. This algorithm strongly improves performance on incomplete data, but it was not designed for, and is not capable of, finding top-k dominant values in incomplete big data. Several other algorithms have been proposed to answer the TKD query, such as the Skyband Based and Upper Bound Based algorithms, but their performance is also questionable. Previously developed algorithms were among the first attempts to apply the TKD query to incomplete data; however, all of them had weak performance or were not compatible with incomplete data. This thesis proposes the MapReduced Enhanced Bitmap Index Guided Algorithm (MRBIG) to deal with the aforementioned issues. MRBIG uses the MapReduce framework to enhance the performance of top-k dominance queries on huge incomplete datasets. The proposed approach applies MapReduce parallel computing across multiple computing nodes: the framework divides the tasks among several computing nodes that work independently and simultaneously to find the result. This method achieved processing times up to two times faster in finding the TKD query result than previously presented algorithms.
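To make the TKD query concrete, here is a minimal Python sketch of a naive, partitioned scoring scheme. It is not MRBIG or BIG, and the abstract does not define dominance, so the sketch assumes a commonly used definition for incomplete data: an object dominates another if, on the dimensions observed in both, it is never worse and is strictly better at least once (smaller values are assumed to be better). The names `dominates`, `map_scores`, and `top_k_dominating` are hypothetical, and the "map" step runs sequentially here as a stand-in for independent computing nodes.

```python
from heapq import nlargest
from typing import List, Optional, Sequence, Tuple

Point = Sequence[Optional[float]]  # None marks a missing value


def dominates(a: Point, b: Point) -> bool:
    """Assumed definition: compare only dimensions observed in both objects;
    a must never be worse and must be strictly better at least once
    (smaller is better)."""
    strictly_better = False
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip dimensions missing in either object
        if x > y:
            return False
        if x < y:
            strictly_better = True
    return strictly_better


def map_scores(chunk: List[int], data: List[Point]) -> List[Tuple[int, int]]:
    """Map step: for each object id in the chunk, count how many objects
    in the whole dataset it dominates. Returns (score, id) pairs."""
    return [
        (sum(dominates(data[i], data[j]) for j in range(len(data)) if j != i), i)
        for i in chunk
    ]


def top_k_dominating(data: List[Point], k: int, n_chunks: int = 2) -> List[Tuple[int, int]]:
    """Split object ids into chunks, score each chunk independently,
    then merge the partial results and keep the k highest scores."""
    ids = list(range(len(data)))
    chunks = [ids[c::n_chunks] for c in range(n_chunks)]
    # In a real MapReduce job each chunk would be scored on a separate node.
    partial = [map_scores(chunk, data) for chunk in chunks]
    merged = [pair for part in partial for pair in part]
    # Reduce step: highest score first, ties broken by smaller id.
    return nlargest(k, merged, key=lambda p: (p[0], -p[1]))


data = [
    (1, None, 2),
    (2, 3, None),
    (None, 1, 3),
    (3, 4, 4),
]
print(top_k_dominating(data, k=2))
```

Each chunk needs read access to the full dataset in this sketch, which costs quadratic work overall; index-based methods such as the bitmap approach named in the abstract aim to avoid that cost.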
viSQLizer: Using visualization for learning SQL
Structured Query Language (SQL) is used for interaction between database technology and its users. In higher education, students often struggle with understanding the underlying logic of SQL, and thus have trouble understanding how and why a result table is created from a query. A prototype of a visual learning tool for SQL, viSQLizer, has been developed to determine if visualizations could help students create a mental model and thus enhance their understanding of the underlying logic of SQL. Through the use of animations and decomposition, our results indicate that visualizations might give students a better understanding of the underlying logic, and that students gain the same learning outcome through visualizations as when using an online tutorial with explanatory text and exercises. Feedback from both professors and students, gathered in interviews and experiments, indicates that the tool could be used by professors as a visualization tool in lectures, and by students as a practical tool; not as a replacement of, but as an addition to, traditional teaching methods.