9,127 research outputs found
Provenance Circuits for Trees and Treelike Instances (Extended Version)
Query evaluation in monadic second-order logic (MSO) is tractable on trees
and treelike instances, even though it is hard for arbitrary instances. This
tractability result has been extended to several tasks related to query
evaluation, such as counting query results [3] or performing query evaluation
on probabilistic trees [10]. These are two examples of the more general problem
of computing augmented query output, which is referred to as provenance. This
article presents a provenance framework for trees and treelike instances, by
describing a linear-time construction of a circuit provenance representation
for MSO queries. We show how this provenance can be connected to the usual
definitions of semiring provenance on relational instances [20], even though we
compute it in an unusual way, using tree automata; we do so via intrinsic
definitions of provenance for general semirings, independent of the operational
details of query evaluation. We show applications of this provenance to capture
existing counting and probabilistic results on trees and treelike instances,
and give novel consequences for probability evaluation.
Comment: 48 pages. Presented at ICALP'1
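The idea of a single circuit serving several tasks (Boolean evaluation, counting) can be illustrated with a small sketch. It is not the paper's tree-automaton construction: the class names `Var`, `Gate`, `Semiring` and the function `evaluate` are hypothetical, and the circuit is built by hand. It only shows how one circuit might be evaluated under different semirings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class Var:
    """Input gate: stands for one fact of the instance (hypothetical name)."""
    name: str


@dataclass(frozen=True)
class Gate:
    """Internal gate; op is either "or" (semiring plus) or "and" (semiring times)."""
    op: str
    inputs: Tuple["Node", ...]


Node = Union[Var, Gate]


@dataclass(frozen=True)
class Semiring:
    zero: object
    one: object
    plus: Callable[[object, object], object]
    times: Callable[[object, object], object]


def evaluate(node: Node, semiring: Semiring, valuation: Dict[str, object]) -> object:
    """Evaluate the circuit bottom-up, interpreting gates in the given semiring."""
    if isinstance(node, Var):
        return valuation[node.name]
    if node.op == "or":
        acc = semiring.zero
        for child in node.inputs:
            acc = semiring.plus(acc, evaluate(child, semiring, valuation))
        return acc
    if node.op == "and":
        acc = semiring.one
        for child in node.inputs:
            acc = semiring.times(acc, evaluate(child, semiring, valuation))
        return acc
    raise ValueError(f"unknown gate type: {node.op}")


# Two standard semirings: Boolean (or, and) and counting (+, *).
BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
COUNTING = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

# Hand-built example circuit: (a AND b) OR c.
circuit = Gate("or", (Gate("and", (Var("a"), Var("b"))), Var("c")))

truth = evaluate(circuit, BOOLEAN, {"a": True, "b": False, "c": True})
count = evaluate(circuit, COUNTING, {"a": 2, "b": 3, "c": 1})
```

Evaluating the same circuit with probabilities in place of counts would only be sound under additional structural assumptions on the circuit (for example, independent inputs under "and" gates and mutually exclusive inputs under "or" gates), which this sketch does not check.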
Structurally Tractable Uncertain Data
Many data management applications must deal with data which is uncertain,
incomplete, or noisy. However, on existing uncertain data representations, we
cannot tractably perform the important query evaluation tasks of determining
query possibility, certainty, or probability: these problems are hard on
arbitrary uncertain input instances. We thus ask whether we could restrict the
structure of uncertain data so as to guarantee the tractability of exact query
evaluation. We present our tractability results for tree and tree-like
uncertain data, and a vision for probabilistic rule reasoning. We also study
uncertainty about order, proposing a suitable representation, and study
uncertain data conditioned by additional observations.
Comment: 11 pages, 1 figure, 1 table. To appear in SIGMOD/PODS PhD Symposium 201
UPGMpp: a Software Library for Contextual Object Recognition
Object recognition is a cornerstone of the scene understanding problem.
Recent works in the field boost their performance by adding contextual
information to the traditional use of the objects’ geometry and/or
appearance. These contextual cues are usually modeled through Conditional
Random Fields (CRFs), a particular type of undirected Probabilistic
Graphical Model (PGM), and are exploited by means of probabilistic
inference methods. In this work we present the Undirected Probabilistic
Graphical Models in C++ library (UPGMpp), an open source solution for
representing, training, and performing inference over undirected PGMs in
general, and CRFs in particular. The UPGMpp library provides a reliable and
comprehensive workbench for recognition systems exploiting contextual
information, including a variety of inference methods based on local
search, graph cuts, and message passing approaches. This paper illustrates
the virtues of the library, namely that it is efficient, comprehensive,
versatile, and easy to use, by presenting a use-case applied to the object
recognition problem in home scenes from the challenging NYU2 dataset.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. Spanish grant program FPU-MICINN 2010
and the Spanish projects “TAROTH: New developments toward a robot at
home” (Ref. DPI2011-25483) and “PROMOVE: Advances in mobile robotics
for promoting independent life of elders” (Ref. DPI2014-55826-R
Action Selection for Interaction Management: Opportunities and Lessons for Automated Planning
The central problem in automated planning---action selection---is also a
primary topic in the dialogue systems research community; however, the
nature of research in that community is significantly different from that
of planning, with a focus on end-to-end systems and user evaluations. In
particular, numerous toolkits are available for developing speech-based
dialogue systems that include not only a method for representing states and
actions, but also a mechanism for reasoning and selecting the actions,
often combined with a technical framework designed to simplify the task of
creating end-to-end systems. We contrast this situation with that of
automated planning, and argue that the dialogue systems community could
benefit from some of the directions adopted by the planning community, and
that there also exist opportunities and lessons for automated planning.
Quasi-SLCA based Keyword Query Processing over Probabilistic XML Data
The probabilistic threshold query is one of the most common queries in
uncertain databases, where a result satisfying the query must also have a
probability that meets the threshold requirement. In this paper, we investigate
probabilistic threshold keyword queries (PrTKQ) over XML data, which have not
been studied before. We first introduce the notion of quasi-SLCA and use it to
represent results for a PrTKQ under possible world semantics. Then we design
a probabilistic inverted (PI) index that can be used to quickly return the
qualified answers and filter out the unqualified ones based on our proposed
lower/upper bounds. After that, we propose two efficient and comparable
algorithms: Baseline Algorithm and PI index-based Algorithm. To speed up the
algorithms, we also use a probability density function. An empirical study
using real and synthetic data sets has verified the effectiveness and the
efficiency of our approaches.
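The pruning role of lower/upper bounds in a threshold query can be sketched as follows. This is a generic filter under stated assumptions, not the paper's PI index or its actual bounds: `Candidate`, `threshold_filter`, and the `exact_probability` callback are hypothetical names, and the bounds are assumed to be valid (lower <= true probability <= upper).

```python
from typing import Callable, Iterable, List, NamedTuple


class Candidate(NamedTuple):
    """A candidate answer with assumed-valid bounds on its probability."""
    node_id: str
    lower: float
    upper: float


def threshold_filter(
    candidates: Iterable[Candidate],
    threshold: float,
    exact_probability: Callable[[str], float],
) -> List[str]:
    """Return ids of candidates whose probability meets the threshold.

    The (possibly expensive) exact computation is called only when the
    bounds alone cannot decide the candidate.
    """
    answers = []
    for cand in candidates:
        if cand.upper < threshold:
            # Even the best case misses the threshold: prune.
            continue
        if cand.lower >= threshold:
            # Even the worst case meets the threshold: accept directly.
            answers.append(cand.node_id)
            continue
        # Bounds are inconclusive: fall back to the exact probability.
        if exact_probability(cand.node_id) >= threshold:
            answers.append(cand.node_id)
    return answers


# Toy data (made up for illustration only).
exact = {"n1": 0.75, "n2": 0.125, "n3": 0.5, "n4": 0.25}
calls = []


def lookup(node_id: str) -> float:
    calls.append(node_id)  # record which candidates needed exact evaluation
    return exact[node_id]


cands = [
    Candidate("n1", 0.625, 0.875),  # accepted from the lower bound
    Candidate("n2", 0.0, 0.25),     # pruned from the upper bound
    Candidate("n3", 0.25, 0.75),    # inconclusive, exact value 0.5 qualifies
    Candidate("n4", 0.125, 0.625),  # inconclusive, exact value 0.25 fails
]
result = threshold_filter(cands, 0.5, lookup)
```

In this toy run only the two inconclusive candidates trigger the exact computation, which is the kind of saving the bounds are meant to provide.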
Information Integration - the process of integration, evolution and versioning
At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those information sources. Gathering this information is a tedious and time-consuming job. Automating this process would assist users in their task. Integrating the information sources provides a global information source in which all the needed information is present. All of these information sources also change over time. With each change of an information source, the schema of this source can change as well. The data contained in the information source, however, cannot be changed every time, due to the huge amount of data that would have to be converted in order to conform to the most recent schema.
In this report we describe the current approaches to information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues when integrating XML data sources.
A document management methodology based on similarity contents
The advent of the WWW and distributed information systems has made it possible to share documents between different users and organisations. However, this has created many problems related to the security, accessibility, rights and, most importantly, consistency of documents. It is important that the people involved in the document management process have access to the most up-to-date version of documents, retrieve the correct documents, and are able to update the document repository in such a way that their documents are known to others. In this paper we propose a method for organising, storing and retrieving documents based on content similarity. The method uses techniques based on information retrieval, document indexing, and term extraction and indexing. This methodology is developed for the E-Cognos project, which aims at developing tools for the management and sharing of documents in the construction domain.
Design Environments for Complex Systems
The paper describes an approach for modeling complex systems that hides as many formal details as possible from the user, while still allowing verification and simulation of the model. The interface is based on UML to make the environment available to the largest audience. To carry out analysis, verification and simulation, we automatically extract process algebra specifications from UML models. The results of the analysis are then reflected back in the UML model by annotating diagrams. The formal model includes stochastic information to handle quantitative parameters. We present here the stochastic π-calculus and we discuss the implementation of its probabilistic support, which allows simulation of processes. We exploit the benefits of our approach in two application domains: global computing and systems biology.