11 research outputs found

    Implementation of Tuned Schema Merging Approach

    Get PDF
    Schema merging is the process of integrating multiple data sources into a GCS (Global Conceptual Schema). It is pivotal to application domains such as data warehousing and multi-databases. Schema merging requires the identification of corresponding elements, which is done through a schema matching process in which the data sources are compared with one another. For a given set of data sources and the correspondences between them, different possibilities for creating the GCS exist. In applications like multi-databases and data warehousing, new data sources keep joining in, and schema merging approaches usually expand the GCS relations horizontally or vertically as they do. Such expansions create an unbalanced GCS which either produces too many NULL values in response to global queries or requires too many joins, causing poor query performance. In this paper, a novel approach, TuSMe (Tuned Schema Merging), is introduced to overcome this issue by developing a balanced GCS that controls both the vertical and horizontal expansion of GCS relations. The approach employs a weighting mechanism in which weights are assigned to the individual attributes of the GCS. These weights reflect the connectedness of GCS attributes with the attributes of the underlying data sources, and combining them allows the overall strength of the GCS to be scrutinized. A prototype implementation of TuSMe shows significant improvement over other contemporary state-of-the-art approaches.
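The weighting idea can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's actual formulas: here each GCS attribute is weighted by the fraction of source schemas that contain a corresponding attribute, and the overall GCS strength is simply the mean of those weights.

```python
def attribute_weights(gcs_attrs, source_schemas):
    """Weight each GCS attribute by its connectedness to the sources
    (hypothetical measure: fraction of sources containing the attribute)."""
    n = len(source_schemas)
    return {a: sum(a in s for s in source_schemas) / n for a in gcs_attrs}

def gcs_strength(weights):
    """Combine the attribute weights into one score for the whole GCS
    (illustrative aggregation: the mean weight)."""
    return sum(weights.values()) / len(weights)

# Three source schemas joining into one GCS.
sources = [{"id", "name", "price"}, {"id", "name"}, {"id", "price", "stock"}]
gcs = {"id", "name", "price", "stock"}

w = attribute_weights(gcs, sources)
# "id" appears in all three sources, so its weight is 1.0;
# "stock" appears in only one, so its weight is 1/3.
```

Under such a scheme, a lowly weighted attribute signals a part of the GCS that few sources actually populate, which is exactly where NULL-heavy horizontal expansion would occur.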

    Top-k generation of integrated schemas based on directed and weighted correspondences

    Full text link
    Schema integration is the problem of creating a unified target schema based on a set of existing source schemas and a set of correspondences that result from matching the source schemas. Previous methods for schema integration rely on the exploration, implicit or explicit, of the multiple design choices that are possible for the integrated schema. Such exploration relies heavily on user interaction; thus, it is time consuming and labor intensive. Furthermore, previous methods have ignored the additional information that typically results from the schema matching process, that is, the weights and, in some cases, the directions associated with the correspondences. In this paper, we propose a more automatic approach to schema integration that is based on the use of directed and weighted correspondences between the concepts that appear in the source schemas. A key component of our approach is a novel top-k ranking algorithm for the automatic generation of the best candidate schemas. The algorithm gives more weight to schemas that combine the concepts with higher similarity or coverage. Thus, the algorithm makes certain decisions that would otherwise likely be taken by a human expert. We show that the algorithm runs in polynomial time and, moreover, has good performance in practice.
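The flavor of top-k candidate generation can be sketched as follows. This is a deliberately naive illustration, not the paper's algorithm: it enumerates every subset of weighted correspondences as a "candidate schema" (each applied correspondence merges its two concepts), scores a candidate by the total weight of the merges it applies, and keeps the k best. The actual algorithm avoids this exponential enumeration and runs in polynomial time.

```python
from itertools import combinations
import heapq

def top_k_schemas(correspondences, k):
    """correspondences: list of ((concept_a, concept_b), weight).
    Returns the k highest-scoring candidates as (score, merged_pairs)."""
    candidates = []
    n = len(correspondences)
    for r in range(n + 1):
        for subset in combinations(correspondences, r):
            score = sum(w for _, w in subset)      # reward high-similarity merges
            merged = [pair for pair, _ in subset]  # concept pairs collapsed
            candidates.append((score, merged))
    return heapq.nlargest(k, candidates, key=lambda c: c[0])

corrs = [(("Person", "Customer"), 0.9), (("Order", "Purchase"), 0.7)]
best = top_k_schemas(corrs, 2)
# The top candidate applies both merges; the runner-up applies only
# the stronger ("Person", "Customer") correspondence.
```

The ranking step mirrors the paper's intuition that candidates combining concepts with higher similarity should surface first, replacing decisions a human expert would otherwise make.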

    A framework for information integration using ontological foundations

    Get PDF
    With the increasing amount of data, the ability to integrate information has always been a competitive advantage in information management. Semantic heterogeneity reconciliation is an important challenge in many information interoperability applications such as data exchange and data integration. In spite of a large amount of research in this area, the lack of theoretical foundations behind semantic heterogeneity reconciliation techniques has resulted in many ad-hoc approaches. In this thesis, I address this issue by providing ontological foundations for semantic heterogeneity reconciliation in information integration. In particular, I investigate fundamental semantic relations between properties from an ontological point of view and show how one of the basic and natural relations between properties – inferring implicit properties from existing properties – can be used to enhance information integration. These ontological foundations are exploited in four aspects of information integration. First, I propose novel algorithms for semantic enrichment of schema mappings. Second, using correspondences between similar properties at different levels of abstraction, I propose a configurable data integration system in which query rewriting techniques allow a tradeoff between accuracy and completeness in query answering. Third, to preserve semantics in data exchange, I propose an entity-preserving data exchange approach that reflects source entities in the target independently of the classification of entities. Finally, to improve the efficiency of the data exchange approach proposed in this thesis, I propose an extension of the column-store model called the sliced column store. Working prototypes of the techniques proposed in this thesis have been implemented to show their feasibility.
Experiments performed using various datasets show that the techniques proposed in this thesis outperform many existing techniques in terms of their ability to handle semantic heterogeneities and the performance of information exchange.
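The relation the thesis builds on, inferring implicit properties from existing ones, can be reduced to a minimal sketch. The rules and property names below are hypothetical, and the actual approach is ontologically grounded and far richer; this only shows the basic closure mechanism.

```python
def infer_properties(explicit, implications):
    """Close a set of explicit properties under 'P implies Q' rules,
    iterating until no new property can be inferred."""
    inferred = set(explicit)
    changed = True
    while changed:
        changed = False
        for p, q in implications:
            if p in inferred and q not in inferred:
                inferred.add(q)
                changed = True
    return inferred

# Illustrative rules: a birth date implies a derivable age,
# and having an age implies being a person.
rules = [("hasBirthDate", "hasAge"), ("hasAge", "isPerson")]
props = infer_properties({"hasBirthDate"}, rules)
# props now also contains the implicit properties "hasAge" and "isPerson"
```

Enriching schema mappings with such inferred properties lets correspondences match on properties a source never states explicitly.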

    Ein Repository für Modellierungsmethoden

    Get PDF
    In the field of business informatics, models and, above all, the modelling methods used to create them are gaining ever more importance. The Open Models Initiative was therefore founded in 2008; it deals with the development and provision of modelling methods and their applications. To support the developers of modelling methods, a repository concept is now needed, one that offers users both management functionality and analysis capabilities. In contrast to the database-based repository concepts that dominate the literature, the conception of this repository is based on a metamodelling approach, which yields several essential advantages: above all, the easy integration of the modelling environment for the managed modelling methods, and the use of metamodelling concepts both for the repository itself and for its application. The aim of this work is to create a concept for a modelling-method repository that provides all the functionality necessary for managing modelling methods and is adapted to the needs of method developers. The concept then serves as the specification for a subsequent implementation, enabling further use of the repository within the Open Models Initiative.

    Data linkage for querying heterogeneous databases

    Get PDF