The accessibility dimension for structured document retrieval
Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, and carried out experiments on these collections. The analysis of the experiments provides estimates of the acc values.
Bisimilarity is not Borel
We prove that the relation of bisimilarity between countable labelled
transition systems is $\Sigma_1^1$-complete (hence not Borel), by reducing the
set of non-wellorders over the natural numbers continuously to it.
This has an impact on the theory of probabilistic and nondeterministic
processes over uncountable spaces, since logical characterizations of
bisimilarity (as, for instance, those based on the unique structure theorem for
analytic spaces) require a countable logic whose formulas have measurable
semantics. Our reduction shows that such a logic does not exist in the case of
image-infinite processes.
Comment: 20 pages, 1 figure; proof of Sigma_1^1 completeness added with
extended comments. I acknowledge careful reading by the referees. Major
changes in Introduction, Conclusion, and motivation for NLMP. Proof for Lemma
22 added, simpler proofs for Lemma 17 and Theorem 30. Added references. Part
of this work was presented at Dagstuhl Seminar 12411 on Coalgebraic Logic.
Semantics, Modelling, and the Problem of Representation of Meaning -- a Brief Survey of Recent Literature
Over the past 50 years many have debated what representation should be used
to capture the meaning of natural language utterances. Recently, research has
placed new demands on such representations. Here I survey some of the
interesting representations suggested to meet these new needs.
Comment: 15 pages, no figure
An Answer Explanation Model for Probabilistic Database Queries
In recent years, huge amounts of uncertain data have become available from a diverse range of applications, such as sensors, machine learning or mining approaches, and information extraction and integration, and we have seen a revival of interest in probabilistic databases. Queries over these databases result in probabilistic answers. As the process of arriving at these answers is based on the underlying stored uncertain data, we argue that, from the standpoint of an end user, it is helpful for such a system to explain how it arrives at an answer and on which uncertainty assumptions the derived answer is based. In this way, users can draw on their own knowledge to decide how much confidence to place in the probabilistic answer.
The aim of this paper is to design such an answer explanation model for probabilistic database queries. We report our design principles and show the methods to compute the answer explanations. One of the main contributions of our model is that it fills the gap between giving only the answer probability and giving the full derivation. Furthermore, we show how to balance verifiability and influence of explanation components through the concept of verifiable views. The behavior of the model and its computational efficiency are demonstrated through an extensive performance study.