3,992 research outputs found

    Aspects of dealing with imperfect data in temporal databases

    In reality, some objects or concepts have properties with a time-variant or time-related nature. Modelling these kinds of objects or concepts in a (relational) database schema is possible, but time-variant and time-related attributes have an impact on the consistency of the entire database. Therefore, temporal database models have been proposed to deal with this. Time itself can be a source of imprecision, vagueness and uncertainty, since existing time measuring devices are inherently imperfect. Moreover, human beings manage time using temporal indications and temporal notions, which may contain imprecision, vagueness and uncertainty. However, the imperfection in human-used temporal indications is supported by human interpretation, whereas information systems need extraordinary support for this. Several proposals for dealing with such imperfections when modelling temporal aspects exist. Some of these proposals consider the basis of the system to be the conversion of the specificity of temporal notions between used temporal expressions. Other proposals consider the temporal indications in the used temporal expressions to be the source of imperfection. In this chapter, an overview is given of the basic concepts and issues related to the modelling of time, as such or in (relational) database models, and of the imperfections that may arise during or as a result of this modelling. In addition, a novel and currently researched technique for handling some of these imperfections is presented.

    Survey over Existing Query and Transformation Languages

    A widely acknowledged obstacle to realizing the vision of the Semantic Web is the inability of many current Semantic Web approaches to cope with data available in such diverging representation formalisms as XML, RDF, or Topic Maps. A common query language is the first step towards transparent access to data in any of these formats. To further the understanding of the requirements and approaches proposed for query languages in the conventional as well as the Semantic Web, this report surveys a large number of query languages for accessing XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from all these areas. From the detailed survey of these query languages, a common classification scheme is derived that is useful for understanding and differentiating languages within and among all three areas.

    A Relational Model for the Possibilistic Valid-time Approach

    In the real world, it is very common for objects or concepts to have properties with a time-variant or time-related nature. Modelling these kinds of objects or concepts in a (relational) database schema is possible, but time-variant and time-related attributes have an impact on the consistency of the entire database and must be appropriately managed. Therefore, temporal database models have been proposed in the literature to deal with this problem. Time can be affected by imprecision, vagueness and/or uncertainty, since existing time measuring devices are inherently imperfect. Additionally, human beings manage time using temporal indications and temporal notions, which may also be imprecise. However, the imperfection in human-used temporal indications is supported by human interpretation, whereas information systems need appropriate support in order to accomplish this task. Several proposals for dealing with such imperfections when modelling temporal data exist. Some of these proposals transform the temporal data into a compact representation, but there is no formal model for managing and handling uncertainty regarding temporal information. In this work we present a novel model to deal with imprecision in valid-time databases, together with the definition and implementation of the data manipulation language (DML). Junta de Andalucia P07-TIC-03175 BES-2009-013805 TIN2008-0206

    Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses

    A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, whether a total sales amount of 1,000 items indicates a good or bad sales performance is still unclear. From the decision makers' point of view, the semantics that convey the meaning of the data, rather than the raw numbers, are very important. In this paper, we explore the use of fuzzy technology to provide these semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses.
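
    The step from a quantitative summary (a raw total) to a qualitative one (a linguistic label) can be illustrated with a small sketch. It is not the model from the paper: the labels "low", "medium" and "high", their trapezoidal breakpoints, and the function names `trapezoid` and `describe` are all assumptions chosen for illustration.

```python
import math

def trapezoid(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy set.

    The set has support [a, d] and core [b, c]; a == b or c == d
    gives a left or right shoulder.
    """
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical linguistic labels for "total items sold" (breakpoints are assumptions).
SALES_LABELS = {
    "low": (0, 0, 800, 1600),
    "medium": (800, 1600, 2400, 3200),
    "high": (2400, 3200, math.inf, math.inf),
}

def describe(total):
    """Map a numeric total to its membership degree in each linguistic label."""
    return {label: trapezoid(total, *params) for label, params in SALES_LABELS.items()}

degrees = describe(1000)
best = max(degrees, key=degrees.get)
print(degrees, best)
```

    Under these assumed breakpoints, a total of 1,000 items belongs mostly to "low" and partly to "medium", which is the kind of graded reading a crisp threshold cannot express.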

    Treatment of imprecision in data repositories with the aid of KNOLAP

    Traditional data repositories, introduced for the needs of business processing, typically focus on the storage and querying of crisp domains of data. As a result, current commercial data repositories have no facilities for either storing or querying imprecise/approximate data. No significant attempt has been made at a generic and application-independent representation of value imprecision, mainly as a property of axes of analysis and also as part of a dynamic environment, where potential users may wish to define their “own” axes of analysis for querying either precise or imprecise facts. In such cases, measured values and facts are characterised by descriptive values drawn from a number of dimensions, whereas values of a dimension are organised as hierarchical levels. A solution named H-IFS is presented that allows the representation of flexible hierarchies as part of the dimension structures. An extended multidimensional model named IF-Cube is put forward, which allows the representation of imprecision in facts and dimensions and the answering of queries based on imprecise hierarchical preferences. Based on the H-IFS and IF-Cube concepts, a post-relational OLAP environment is delivered, the implementation of which is DBMS independent and whose performance depends solely on the underlying DBMS engine.

    Proceedings of the Third International Workshop on Management of Uncertain Data (MUD2009)


    Learning Tuple Probabilities

    Learning the parameters of complex probabilistic-relational models from labeled training data is a standard technique in machine learning, which has been intensively studied in the subfield of Statistical Relational Learning (SRL), but so far it remains an under-investigated topic in the context of Probabilistic Databases (PDBs). In this paper, we focus on learning the probability values of base tuples in a PDB from labeled lineage formulas. The resulting learning problem can be viewed as the inverse problem to confidence computations in PDBs: given a set of labeled query answers, learn the probability values of the base tuples, such that the marginal probabilities of the query answers again yield the assigned probability labels. We analyze the learning problem from a theoretical perspective, cast it into an optimization problem, and provide an algorithm based on stochastic gradient descent. Finally, we conclude with an experimental evaluation on three real-world datasets and one synthetic dataset, comparing our approach to various techniques from SRL, reasoning in information extraction, and optimization.
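
    The inverse problem described above can be sketched in a few lines. This is a toy illustration, not the paper's algorithm: it assumes read-once lineage formulas over independent base tuples (so marginals are simple products), a squared-error loss, and a sigmoid parametrization to keep probabilities in (0, 1). The names `prob_and_grad`, `learn` and `EXAMPLES` are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def prob_and_grad(formula, p):
    """Return (P(formula), {tuple_id: dP/dp}) for a read-once formula.

    Formulas are nested tuples: ("var", id), ("and", f1, f2, ...), ("or", f1, f2, ...).
    Read-once (each tuple id occurs at most once) is an assumption of this sketch.
    """
    kind = formula[0]
    if kind == "var":
        return p[formula[1]], {formula[1]: 1.0}
    parts = [prob_and_grad(f, p) for f in formula[1:]]
    if kind == "and":
        factors = [q for q, _ in parts]          # P(and) = prod q_i
    elif kind == "or":
        factors = [1.0 - q for q, _ in parts]    # P(or) = 1 - prod (1 - q_i)
    else:
        raise ValueError(f"unknown node type: {kind}")
    grad = {}
    for i, (_, g) in enumerate(parts):
        others = math.prod(f for j, f in enumerate(factors) if j != i)
        for tuple_id, dq in g.items():
            grad[tuple_id] = others * dq         # same sign for "and" and "or"
    total = math.prod(factors)
    return (total if kind == "and" else 1.0 - total), grad

def learn(examples, tuple_ids, steps=20000, lr=0.5, seed=0):
    """SGD on squared error between lineage marginals and their labels."""
    rng = random.Random(seed)
    theta = {t: 0.0 for t in tuple_ids}          # p = sigmoid(theta), starts at 0.5
    for _ in range(steps):
        formula, label = rng.choice(examples)
        p = {t: sigmoid(v) for t, v in theta.items()}
        prob, grad = prob_and_grad(formula, p)
        for t, g in grad.items():
            theta[t] -= lr * 2.0 * (prob - label) * g * p[t] * (1.0 - p[t])
    return {t: sigmoid(v) for t, v in theta.items()}

# Hypothetical labeled query answers: (lineage formula, target marginal probability).
EXAMPLES = [
    (("var", "a"), 0.8),
    (("and", ("var", "a"), ("var", "b")), 0.4),
    (("or", ("var", "b"), ("var", "c")), 0.75),
]

learned = learn(EXAMPLES, ["a", "b", "c"])
print(learned)
```

    The labels in this toy instance are mutually consistent, so the learned tuple probabilities can reproduce all of them. General lineage formulas share tuples across subformulas, and computing their marginals exactly is much harder than the product rule used here.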