Finding Top-k Dominance on Incomplete Big Data Using Map-Reduce Framework
Incomplete data is a major kind of multi-dimensional dataset that has randomly distributed missing nodes in its dimensions. Retrieving information from such a dataset becomes very difficult as it grows large, and finding its top-k dominant values is particularly challenging. Some algorithms exist to speed up this process, but most are efficient only on small incomplete datasets. One of the algorithms that makes the top-k dominance (TKD) query applicable is the Bitmap Index Guided (BIG) algorithm. It strongly improves performance on incomplete data, but it was not designed for, and is not capable of, finding top-k dominant values in incomplete big data. Several other algorithms have been proposed for the TKD query, such as the Skyband Based and Upper Bound Based algorithms, but their performance is also questionable. These earlier algorithms were among the first attempts to apply the TKD query to incomplete data; however, all of them either performed weakly or were not compatible with incomplete data. This thesis proposes the MapReduced Enhanced Bitmap Index Guided Algorithm (MRBIG) to deal with these issues. MRBIG uses the MapReduce framework to improve the performance of top-k dominance queries on huge incomplete datasets. The framework divides the work among multiple computing nodes, which work independently and simultaneously to find the result. This method has achieved up to two times faster processing in finding the TKD query result compared with previously presented algorithms.
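The abstract above does not spell out the dominance rule or the MRBIG internals, so the following is only a minimal sketch of a top-k dominance query split into map and reduce steps, not the thesis's algorithm. It assumes a commonly used rule for incomplete data: two rows are compared only on dimensions observed in both, smaller values are better, and a row's score is the number of rows it dominates. The names `dominates`, `map_partition`, `reduce_scores`, and `top_k_dominating` are hypothetical, and the partitions are processed in a plain loop rather than on separate computing nodes.

```python
from collections import Counter
from typing import Optional, Sequence

Row = Sequence[Optional[float]]  # None marks a missing value (assumed encoding)


def dominates(a: Row, b: Row) -> bool:
    """Assumed rule: compare only dimensions observed in both rows; smaller is better."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return (
        bool(shared)
        and all(x <= y for x, y in shared)
        and any(x < y for x, y in shared)
    )


def map_partition(data: Sequence[Row], partition: Sequence[int]) -> Counter:
    """Map step: for each row, count how many rows of one partition it dominates."""
    partial = Counter()
    for i, a in enumerate(data):
        partial[i] = sum(1 for j in partition if j != i and dominates(a, data[j]))
    return partial


def reduce_scores(partials) -> Counter:
    """Reduce step: add up the partial dominance counts per row."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts
    return total


def top_k_dominating(data: Sequence[Row], k: int, n_partitions: int = 2) -> list:
    """Return the indices of the k rows with the highest dominance scores."""
    indices = list(range(len(data)))
    partitions = [indices[p::n_partitions] for p in range(n_partitions)]
    scores = reduce_scores(map_partition(data, part) for part in partitions)
    # Ties are broken by row index so the result is deterministic.
    return sorted(indices, key=lambda i: (-scores[i], i))[:k]


# Hypothetical toy dataset with missing values.
data = [
    (1, 2, None),
    (2, None, 3),
    (None, 3, 4),
    (3, 4, 5),
]
print(top_k_dominating(data, k=2))  # prints [0, 1]
```

Each map call depends only on its own partition, which is what would let the partitions be scored on separate nodes; the sketch makes no claim about how MRBIG uses bitmap indexes to avoid the pairwise comparisons shown here.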
Symbolic Logic meets Machine Learning: A Brief Survey in Infinite Domains
The tension between deduction and induction is perhaps the most fundamental
issue in areas such as philosophy, cognition and artificial intelligence (AI).
The deduction camp concerns itself with questions about the expressiveness of
formal languages for capturing knowledge about the world, together with proof
systems for reasoning from such knowledge bases. The learning camp attempts to
generalize from examples about partial descriptions about the world. In AI,
historically, these camps have loosely divided the development of the field,
but advances in cross-over areas such as statistical relational learning,
neuro-symbolic systems, and high-level control have illustrated that the
dichotomy is not very constructive, and perhaps even ill-formed. In this
article, we survey work that provides further evidence for the connections
between logic and learning. Our narrative is structured in terms of three
strands: logic versus learning, machine learning for logic, and logic for
machine learning, but naturally, there is considerable overlap. We place an
emphasis on the following "sore" point: there is a common misconception that
logic is for discrete properties, whereas probability theory and machine
learning, more generally, are for continuous properties. We report on results
that challenge this view on the limitations of logic, and expose the role that
logic can play for learning in infinite domains.
08421 Abstracts Collection -- Uncertainty Management in Information Systems
From October 12 to 17, 2008, the Dagstuhl Seminar 08421 "Uncertainty Management in Information Systems" was held in Schloss Dagstuhl - Leibniz Center for Informatics. The abstracts of the plenary and session talks given during the seminar, as well as those of the demos shown, are collected in this paper.
Computing Possible and Certain Answers over Order-Incomplete Data
This paper studies the complexity of query evaluation for databases whose
relations are partially ordered; the problem commonly arises when combining or
transforming ordered data from multiple sources. We focus on queries in a
useful fragment of SQL, namely positive relational algebra with aggregates,
whose bag semantics we extend to the partially ordered setting. Our semantics
leads to the study of two main computational problems: the possibility and
certainty of query answers. We show that these problems are respectively
NP-complete and coNP-complete, but identify tractable cases depending on the
query operators or input partial orders. We further introduce a duplicate
elimination operator and study its effect on the complexity results.
Comment: 55 pages, 56 references. Extended journal version of
arXiv:1707.07222. Up to the stylesheet, page/environment numbering, and
possible minor publisher-induced changes, this is the exact content of the
journal paper that will appear in Theoretical Computer Science.
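To make the possibility and certainty problems from the abstract above concrete, here is a minimal brute-force sketch. It assumes the simplest setting, where the query just returns the relation itself: a candidate total order is a possible answer if some linear extension of the partial order produces it, and a certain answer if every linear extension does. The function names are hypothetical, and the exhaustive enumeration is for illustration only; it is not the paper's method and takes factorial time.

```python
from itertools import permutations


def linear_extensions(items, precedes):
    """Yield every total order of `items` consistent with the pairs in `precedes`."""
    for perm in permutations(items):
        position = {x: i for i, x in enumerate(perm)}
        if all(position[a] < position[b] for a, b in precedes):
            yield perm


def is_possible(candidate, items, precedes):
    """True if at least one linear extension equals the candidate order."""
    return any(tuple(candidate) == world for world in linear_extensions(items, precedes))


def is_certain(candidate, items, precedes):
    """True if the order is consistent and every linear extension equals the candidate."""
    worlds = list(linear_extensions(items, precedes))
    return bool(worlds) and all(tuple(candidate) == world for world in worlds)


# Hypothetical example: only "a before b" is known, so "c" can go anywhere.
items = ["a", "b", "c"]
precedes = {("a", "b")}
print(is_possible(["a", "c", "b"], items, precedes))  # prints True
print(is_certain(["a", "c", "b"], items, precedes))   # prints False
```

With the extra constraint `("b", "c")` the order `a, b, c` becomes the only linear extension, so it is both possible and certain.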