3 research outputs found

    Attribute-Level Versioning: A Relational Mechanism for Version Storage and Retrieval

    Get PDF
    Data analysts today have at their disposal a seemingly endless supply of data and repositories hence, datasets from which to draw. New datasets become available daily thus making the choice of which dataset to use difficult. Furthermore, traditional data analysis has been conducted using structured data repositories such as relational database management systems (RDBMS). These systems, by their nature and design, prohibit duplication for indexed collections forcing analysts to choose one value for each of the available attributes for an item in the collection. Often analysts discover two or more datasets with information about the same entity. When combining this data and transforming it into a form that is usable in an RDBMS, analysts are forced to deconflict the collisions and choose a single value for each duplicated attribute containing differing values. This deconfliction is the source of a considerable amount of guesswork and speculation on the part of the analyst in the absence of professional intuition. One must consider what is lost by discarding those alternative values. Are there relationships between the conflicting datasets that have meaning? Is each dataset presenting a different and valid view of the entity or are the alternate values erroneous? If so, which values are erroneous? Is there a historical significance of the variances? The analysis of modern datasets requires the use of specialized algorithms and storage and retrieval mechanisms to identify, deconflict, and assimilate variances of attributes for each entity encountered. These variances, or versions of attribute values, contribute meaning to the evolution and analysis of the entity and its relationship to other entities. A new, distinct storage and retrieval mechanism will enable analysts to efficiently store, analyze, and retrieve the attribute versions without unnecessary complexity or additional alterations of the original or derived dataset schemas. This paper presents technologies and innovations that assist data analysts in discovering meaning within their data and preserving all of the original data for every entity in the RDBMS

    An Extended Algebra for Constraint Databases

    No full text
    Constraint relational databases use constraints to both model and query data. A constraint relation contains a finite set of generalized tuples. Each generalized tuple is represented by a conjunction of constraints on a given logical theory and, depending on the logical theory and the specific conjunction of constraints, it may possibly represent an infinite set of relational tuples. For their characteristics, constraint databases are well suited to model multidimensional and structured data, like spatial and temporal data. The definition of an algebra for constraint relational databases is important in order to make constraint databases a practical technology. In this paper, we extend the previously defined constraint algebra (called generalized relational algebra). First, we show that the relational model is not the only possible semantic reference model for constraint relational databases and we show how constraint relations can be interpreted under the nested relational model. Then..
    corecore