10 research outputs found

    Survey: Models and Prototypes of Schema Matching

    Schema matching is a critical problem in many applications that integrate data or information, aim for interoperability, or otherwise deal with schematic heterogeneity. It has evolved from manual, domain-specific matching to newer models and methods that are semi-automatic and more general, and that can more effectively guide the user in generating a mapping between the elements of two schemas or ontologies. This paper summarizes a literature review of schema matching models and prototypes from the last 25 years, describing the progress made as well as the research challenges and opportunities for new models, methods, and/or prototypes.

    Automated Implementation of Complex XML Schema Changes (original title: Automatisierte Umsetzung von komplexen XML-Schemaänderungen)

    This paper investigates how complex changes in the evolution of XML schemas can be supported automatically. To this end, possible changes to XML schemas are described, classified, and assessed by means of evolution operators. In particular, the move of elements within XML schemas is analyzed. The automated generation and execution of transformation rules for migrating instance data, and the assessment of possible information loss during a transformation, are examined both in general and using publication data.
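
    To illustrate the kind of instance migration that a move operator implies, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not the transformation-rule generator from the paper: the function name move_element, its parameters, and the sample publication record are assumptions made for illustration, and the sketch does not assess information loss.

```python
import xml.etree.ElementTree as ET


def move_element(root, source_parent_path, tag, target_parent_path):
    """Move every <tag> child of the source parent under the target parent.

    Paths use ElementTree's limited XPath syntax, relative to root.
    Returns the number of moved elements.
    """
    source_parent = root.find(source_parent_path)
    target_parent = root.find(target_parent_path)
    if source_parent is None or target_parent is None:
        raise ValueError("source or target parent not found")
    moved = 0
    # Copy the match list first so removal does not disturb iteration.
    for child in list(source_parent.findall(tag)):
        source_parent.remove(child)
        target_parent.append(child)
        moved += 1
    return moved


# Hypothetical publication record: the schema change moves <title>
# from <body> to <meta>, so existing instances must be migrated too.
doc = ET.fromstring(
    "<publication>"
    "<meta><year>2010</year></meta>"
    "<body><title>XML Evolution</title></body>"
    "</publication>"
)
moved = move_element(doc, "body", "title", "meta")
```

    A real migration would also have to check the target's cardinality and type constraints, since that is where information loss can arise.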

    On view processing for a native XML DBMS

    Master's, Master of Science

    Evaluation of Clio for the Transformation of Metamodels (original title: Evaluierung von Clio zur Transformation von Metamodellen)

    Clio is a tool for the semi-automatic generation of schema mappings and the subsequent transformation of an instance of a source schema into an instance of a target schema. A metamodel is the model of a model; it describes model elements and their relationships to each other. Ecore is an implementation of the Meta Object Facility, the standardized language of the Object Management Group (OMG) for describing metamodels. This thesis evaluates the application of Clio to Ecore-based metamodels. The goal is to determine whether using Clio to transform these metamodels is possible and worthwhile. To this end, the thesis examines how Clio is operated, with particular attention to the input it requires. Subsequently, a method for reshaping metamodels accordingly is developed. Finally, the reshaped metamodels are transformed with Clio.

    The Basics of Complex Correspondences and Functions and their Implementation and Semi-automatic Detection in COMA++

    This master's thesis explains how a classical schema matcher is extended to express complex correspondences (many-to-many correspondences) and general functions between two schemas, and to discover them automatically as an extension of the conventional discovery of 1:1 correspondences. The latter addresses an area of data integration that has hardly been studied so far, and the approaches presented can enrich many schema matchers. To this end, the first part of the thesis introduces complex correspondences and functions in the context of schema mapping in detail.
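
    As a rough illustration of what a complex correspondence with a function might look like as a data structure, the following Python sketch pairs several source attributes with several target attributes through a transformation function. This is not COMA++'s internal representation: the Correspondence class, its fields, and the name attributes are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Correspondence:
    """A many-to-many correspondence between source and target attributes."""

    sources: Sequence[str]
    targets: Sequence[str]
    # Takes one value per source attribute and returns a tuple
    # with one value per target attribute.
    function: Callable[..., tuple]

    def apply(self, record: dict) -> dict:
        values = self.function(*(record[name] for name in self.sources))
        return dict(zip(self.targets, values))


# 2:1 correspondence: concatenate two source attributes into one target.
full_name = Correspondence(
    sources=("first_name", "last_name"),
    targets=("name",),
    function=lambda first, last: (f"{first} {last}",),
)

# 1:2 correspondence: split one source attribute into two targets.
split_name = Correspondence(
    sources=("name",),
    targets=("first_name", "last_name"),
    function=lambda name: tuple(name.split(" ", 1)),
)

merged = full_name.apply({"first_name": "Ada", "last_name": "Lovelace"})
parts = split_name.apply({"name": "Ada Lovelace"})
```

    A plain 1:1 correspondence is the special case with one source, one target, and an identity function, which is why a structure like this can be seen as an extension of a classical matcher's output.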

    Semantic Enrichment of Ontology Mappings

    Schema and ontology matching play an important part in the field of data integration and semantic web. Given two heterogeneous data sources, meta data matching usually constitutes the first step in the data integration workflow, which refers to the analysis and comparison of two input resources like schemas or ontologies. The result is a list of correspondences between the two schemas or ontologies, which is often called mapping or alignment. Many tools and research approaches have been proposed to automatically determine those correspondences. However, most match tools do not provide any information about the relation type that holds between matching concepts, for the simple but important reason that most common match strategies are too simple and heuristic to allow any sophisticated relation type determination. Knowing the specific type holding between two concepts, e.g., whether they are in an equality, subsumption (is-a) or part-of relation, is very important for advanced data integration tasks, such as ontology merging or ontology evolution. It is also very important for mappings in the biological or biomedical domain, where is-a and part-of relations may exceed the number of equality correspondences by far. Such more expressive mappings allow much better integration results and have scarcely been in the focus of research so far. In this doctoral thesis, the determination of the correspondence types in a given mapping is the focus of interest, which is referred to as semantic mapping enrichment. We introduce and present the mapping enrichment tool STROMA, which obtains a pre-calculated schema or ontology mapping and for each correspondence determines a semantic relation type. In contrast to previous approaches, we will strongly focus on linguistic laws and linguistic insights. By and large, linguistics is the key for precise matching and for the determination of relation types. 
We will introduce various strategies that make use of these linguistic laws and are able to calculate the semantic type between two matching concepts. The observations and insights gained from this research go far beyond the field of mapping enrichment and can be also applied to schema and ontology matching in general. Since generic strategies have certain limits and may not be able to determine the relation type between more complex concepts, like a laptop and a personal computer, background knowledge plays an important role in this research as well. For example, a thesaurus can help to recognize that these two concepts are in an is-a relation. We will show how background knowledge can be effectively used in this instance, how it is possible to draw conclusions even if a concept is not contained in it, how the relation types in complex paths can be resolved and how time complexity can be reduced by a so-called bidirectional search. The developed techniques go far beyond the background knowledge exploitation of previous approaches, and are now part of the semantic repository SemRep, a flexible and extendable system that combines different lexicographic resources. Further on, we will show how additional lexicographic resources can be developed automatically by parsing Wikipedia articles. The proposed Wikipedia relation extraction approach yields some millions of additional relations, which constitute significant additional knowledge for mapping enrichment. The extracted relations were also added to SemRep, which thus became a comprehensive background knowledge resource. To augment the quality of the repository, different techniques were used to discover and delete irrelevant semantic relations. We could show in several experiments that STROMA obtains very good results w.r.t. relation type detection. In a comparative evaluation, it was able to achieve considerably better results than related applications. 
This corroborates the overall usefulness and strengths of the implemented strategies, which were developed with particular emphasis on the principles and laws of linguistics.
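
    The abstract does not spell out the individual strategies, so the following Python sketch only illustrates the general idea with one well-known linguistic heuristic, the compound head rule (in English, a compound such as "laptop computer" usually denotes a kind of its head "computer"), combined with a fallback lookup in background knowledge. The function relation_type, the relation labels, and the tiny thesaurus are assumptions for illustration, not STROMA's or SemRep's actual implementation.

```python
def relation_type(source, target, thesaurus=None):
    """Guess the semantic relation between two concept labels.

    Returns "equal", "is-a", "inverse is-a", or "undecided".
    """
    s = source.lower().split()
    t = target.lower().split()
    if s == t:
        return "equal"
    # Compound head rule: "laptop computer" ends with "computer",
    # so the longer label is assumed to be a kind of the shorter one.
    if len(s) > len(t) and s[-len(t):] == t:
        return "is-a"
    if len(t) > len(s) and t[-len(s):] == s:
        return "inverse is-a"
    # Fall back to background knowledge when the labels alone do not help.
    if thesaurus:
        return thesaurus.get((" ".join(s), " ".join(t)), "undecided")
    return "undecided"


# Hypothetical background knowledge with a single entry.
tiny_thesaurus = {("laptop", "personal computer"): "is-a"}

by_compound = relation_type("Laptop computer", "computer")
without_knowledge = relation_type("laptop", "personal computer")
with_knowledge = relation_type("laptop", "personal computer", tiny_thesaurus)
```

    The laptop and personal computer pair mirrors the example in the abstract: the label-based heuristic leaves it undecided, and only the thesaurus lookup yields an is-a relation.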

    Mapping XML and Relational Schemas with Clio

    Merging and coalescing data from multiple and diverse sources into different data formats continues to be an important problem in modern information systems. Schema Matching, the process of matching elements of a source schema with elements of a target schema, and Schema Mapping, the process of creating a query that maps between two disparate schemas, are at the heart of data integration systems. We demonstrate Clio, a semi-automatic schema mapping tool developed at the IBM Almaden Research Center. In this demonstration we showcase Clio’s mapping engine that allows mapping to and from relational and XML schemas, and takes advantage of data constraints in order to preserve data associations. The semantically correct and complete creation and interpretation of mappings is a highly nontrivial process. Curren

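    Semantic Reconciliation of Data and Services Implemented within a Collaborative Situation (original title: Réconciliation sémantique des données et des services mis en œuvre au sein d'une situation collaborative)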

    Collaboration between organizations is one of the main challenges in today's industrial ecosystem. Such collaborations must be established reactively, in order to seize opportunities, and flexibly, in order to adapt to changes in context. They must therefore be supported by a dedicated information system (IS) that ensures interoperability between the partners' information systems and handles the specifics of the collaboration. The MISE project (Mediation Information System Engineering) provides a model-driven engineering approach for designing a Mediation Information System (MIS) that supports this collaboration. The MIS design involves two steps: generating the collaborative business process from a description of the collaborative situation (abstract level) and transforming this process model into an executable system (concrete level).
    This PhD thesis focuses on the second step and uses knowledge-based technologies to reconcile the business models with the available technical services. After studying the semantic needs and the existing semantic annotation methods at the different abstraction levels, we chose the SAWSDL and WSMO-Lite standards for service annotation and, in the absence of a recognized standard, propose a new semantic annotation mechanism for business processes, called SABPMN. The added semantic information is then used when transforming business processes into executable workflows. This transformation proceeds in three steps: (i) for each business activity in the process, we search for the technical service or services that meet the expressed business need, using service selection and composition mechanisms; (ii) for each service to be invoked, we generate the data transformation required to ensure correct communication with the other components; (iii) once this information has been validated by the user, we generate the files that the collaborative platform needs to execute the process. These results are also part of the FUI ISTA3 project (3rd generation of Interoperability for Aeronautics Sub-contracTors), which aims to improve supply chain interoperability for the aeronautics subcontractors of Aerospace Valley in order to facilitate co-design. The proposed mechanisms have been implemented and are available as an open-source functional prototype.
    TOULOUSE-INP (315552154) / Sudoc, France

    Methods for Matching of Linked Open Social Science Data

    In recent years, the concept of Linked Open Data (LOD) has gained popularity and acceptance across various communities and domains. Science policy bodies and organizations claim that semantic technologies, and data exposed in this manner, can support and enhance research processes and the infrastructures that provide research information and services. In this thesis, we investigate whether these expectations can be met in the social sciences. In particular, we analyse and develop methods for matching social science data published as Linked Data, which we introduce as Linked Open Social Science Data. Based on expert interviews and a prototype application, we investigate the current consumption of LOD in the social sciences and its requirements. Following these insights, we first focus on the complete publication of Linked Open Social Science Data by extending and developing domain-specific ontologies for representing research communities, research data, and thesauri. In the second part, we develop methods for matching Linked Open Social Science Data that address particular patterns and characteristics of the data typically used in social research. The results of this work contribute towards enabling a meaningful application of Linked Data in a scientific domain.