913 research outputs found

    Normalization Techniques For Improving The Performance Of Knowledge Graph Creation Pipelines

    With the rapid growth of data on the web, the demand for discovering information in data and subsequently exploiting knowledge graphs is rising faster than commonly expected. Data integration systems can help meet this demand because they transform data from various sources and of different volumes. To this end, a data integration system uses mapping rules, specified in a language such as RML, to integrate data collected from various data sources into a knowledge graph. However, large data sources may suffer from various data quality issues, redundancy being one of them. In this regard, the Semantic Web community contributes to Knowledge Engineering with techniques to create knowledge graphs efficiently. The thesis reported in this document tackles the creation of knowledge graphs in the presence of data sources with redundant data, and proposes a novel normalization theory to solve this problem. This theory covers not only the characteristics of the data sources but also the mapping rules used to integrate the data sources into a knowledge graph. Based on this theory, three normal forms are proposed, together with an algorithm for transforming mapping rules and data sources into these normal forms. The performance of the proposed approach is evaluated on different testbeds composed of real-world and synthetic data. The observed results suggest that the proposed techniques can dramatically reduce the execution time of knowledge graph creation. Therefore, the normalization theory of this thesis contributes to the repertoire of tools that facilitate the creation of knowledge graphs at scale.

    Ensuring the existence of a BCNF-decomposition that preserves functional dependencies in O(N^2) time

    A simple condition is presented that ensures that a relation scheme R with a set F of functional dependencies has a Boyce-Codd normal form (BCNF)-decomposition that has the lossless-join property and preserves functional dependencies.
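
    The abstract does not state the condition itself, so the following is only a minimal Python sketch of the standard textbook notions it relies on: attribute closure, a BCNF check for a scheme R against its given dependencies F, and the usual test for whether a decomposition preserves F. It does not implement the paper's condition, makes no claim about its O(N^2) bound, and does not test the lossless-join property. The helper `fd` and the one-letter attribute names are assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, List, Tuple

# A functional dependency X -> Y, stored as the pair (X, Y) of attribute sets.
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def fd(lhs: str, rhs: str) -> FD:
    """Hypothetical helper: fd("AB", "C") stands for AB -> C (one letter per attribute)."""
    return frozenset(lhs), frozenset(rhs)


def closure(attrs: Iterable[str], fds: List[FD]) -> FrozenSet[str]:
    """Attribute closure: every attribute determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(schema: Iterable[str], fds: List[FD]) -> List[FD]:
    """Non-trivial FDs in `fds` whose left side is not a superkey of `schema`.

    Checking only the given FDs is enough for the full scheme R;
    it is not enough for sub-schemes of a decomposition.
    """
    all_attrs = frozenset(schema)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not all_attrs <= closure(lhs, fds)
    ]


def preserves_dependencies(fds: List[FD], decomposition: Iterable[str]) -> bool:
    """Standard test: each X -> Y must be derivable using only the projected schemes."""
    parts = [frozenset(p) for p in decomposition]
    for lhs, rhs in fds:
        reachable = set(lhs)
        changed = True
        while changed:
            changed = False
            for part in parts:
                gained = closure(reachable & part, fds) & part
                if not gained <= reachable:
                    reachable |= gained
                    changed = True
        if not rhs <= reachable:
            return False
    return True


# R(A, B, C) with A -> B and B -> C: B -> C violates BCNF,
# and the decomposition {AB, BC} keeps both dependencies.
F1 = [fd("A", "B"), fd("B", "C")]
violations_1 = bcnf_violations("ABC", F1)
preserved_1 = preserves_dependencies(F1, ["AB", "BC"])

# R(A, B, C) with AB -> C and C -> B: splitting on C -> B into {AC, BC}
# loses AB -> C, the classic case where BCNF and preservation conflict.
F2 = [fd("AB", "C"), fd("C", "B")]
violations_2 = bcnf_violations("ABC", F2)
preserved_2 = preserves_dependencies(F2, ["AC", "BC"])
```

    The second example shows why a condition guaranteeing a dependency-preserving BCNF decomposition is useful: without one, a scheme may have no such decomposition at all.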

    On redundancy, anomalies and on the question "what do normal forms really do"

    In this paper we first survey various examples of anomalies given in the literature [1,3,8]. We discuss the formalizations and relate them to each other and to the examples. We give arguments that show that decomposition of a relation scheme can help in getting rid of deletion/insertion anomalies and can fail in getting rid of update anomalies in the decomposed case.
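
    As a rough illustration of the deletion anomaly mentioned above (not the paper's formalization), the Python sketch below uses a hypothetical enrollment relation in which each course is assumed to determine its instructor. All names and rows are invented for the example.

```python
# Hypothetical toy relation (student, course, instructor);
# the dependency course -> instructor is an assumption of this example.
enrollments = {
    ("alice", "db", "smith"),
    ("bob", "db", "smith"),
    ("carol", "ml", "jones"),
}


def known_instructors(rows):
    """Map each course still present in `rows` to its instructor."""
    return {course: instructor for _, course, instructor in rows}


# Single relation: removing carol's only enrollment also removes
# the fact that jones teaches ml (a deletion anomaly).
after_delete = {row for row in enrollments if row[0] != "carol"}
instructors_single = known_instructors(after_delete)

# Decomposed into takes(student, course) and teaches(course, instructor):
# the same deletion touches only `takes`, so the teaching fact survives.
takes = {(student, course) for student, course, _ in enrollments}
teaches = {(course, instructor) for _, course, instructor in enrollments}
takes_after_delete = {row for row in takes if row[0] != "carol"}
instructors_decomposed = dict(teaches)
```

    The sketch covers only the deletion case; it does not address the abstract's point that decomposition can fail to remove update anomalies.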

    Design of petroleum company's metadata and an effective knowledge mapping methodology

    Success of information flow depends on intelligent data storage and its management in a multi-disciplinary environment. Multi-dimensional data entities, data types and ambiguous semantics often pose uncertainty and inconsistency in data retrieval from volumes of petroleum data sources. In our approach, conceptual schemas and sub-schemas have been described based on various operational functions of the petroleum industry. These schemas are integrated, to ensure their consistency and validity, so that the information retrieved from an integrated metadata (in the form of a data warehouse) structure derives its authenticity from its implementation. The data integration process validating the petroleum metadata has been demonstrated for one of the Gulf offshore basins for an effective knowledge mapping and interpreting it successfully for the derivation of useful geological knowledge. Warehoused data are used for mining data patterns, trends and correlations among knowledge-base data attributes that led to interpretation of interesting geological features. These technologies appear to be more amenable for exploration of more petroleum resources in the mature gulf basins.