12,683 research outputs found

    XML Matchers: approaches and challenges

    Full text link
    Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was largely investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). However, in the latest years, the widespread adoption of XML in the most disparate application fields pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them on DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact on the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear as unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.Comment: 34 pages, 8 tables, 7 figure

    VeeAlign: Multifaceted Context Representation Using Dual Attention for Ontology Alignment

    Get PDF
    Ontology Alignment is an important research problem applied to various fields such as data integration, data transfer, data preparation, etc. State-of-the-art (SOTA) Ontology Alignment systems typically use naive domain-dependent approaches with handcrafted rules or domain-specific architectures, making them unscalable and inefficient. In this work, we propose VeeAlign, a Deep Learning based model that uses a novel dual-attention mechanism to compute the contextualized representation of a concept which, in turn, is used to discover alignments. By doing this, not only is our approach able to exploit both syntactic and semantic information encoded in ontologies, it is also, by design, flexible and scalable to different domains with minimal effort. We evaluate our model on four different datasets from different domains and languages, and establish its superiority through these results as well as detailed ablation studies. The code and datasets used are available at https://github.com/Remorax/VeeAlign.Comment: Duplicate of arXiv:2010.1172

    Local matching learning of large scale biomedical ontologies

    Get PDF
    Les larges ontologies biomédicales décrivent généralement le même domaine d'intérêt, mais en utilisant des modèles de modélisation et des vocabulaires différents. Aligner ces ontologies qui sont complexes et hétérogènes est une tâche fastidieuse. Les systèmes de matching doivent fournir des résultats de haute qualité en tenant compte de la grande taille de ces ressources. Les systèmes de matching d'ontologies doivent résoudre deux problèmes: (i) intégrer la grande taille d'ontologies, (ii) automatiser le processus d'alignement. Le matching d'ontologies est une tâche difficile en raison de la large taille des ontologies. Les systèmes de matching d'ontologies combinent différents types de matcher pour résoudre ces problèmes. Les principaux problèmes de l'alignement de larges ontologies biomédicales sont: l'hétérogénéité conceptuelle, l'espace de recherche élevé et la qualité réduite des alignements résultants. Les systèmes d'alignement d'ontologies combinent différents matchers afin de réduire l'hétérogénéité. Cette combinaison devrait définir le choix des matchers à combiner et le poids. Différents matchers traitent différents types d'hétérogénéité. Par conséquent, le paramétrage d'un matcher devrait être automatisé par les systèmes d'alignement d'ontologies afin d'obtenir une bonne qualité de correspondance. Nous avons proposé une approche appele "local matching learning" pour faire face à la fois à la grande taille des ontologies et au problème de l'automatisation. Nous divisons un gros problème d'alignement en un ensemble de problèmes d'alignement locaux plus petits. Chaque problème d'alignement local est indépendamment aligné par une approche d'apprentissage automatique. Nous réduisons l'énorme espace de recherche en un ensemble de taches de recherche de corresondances locales plus petites. Nous pouvons aligner efficacement chaque tache de recherche de corresondances locale pour obtenir une meilleure qualité de correspondance. Notre approche de partitionnement se base sur une nouvelle stratégie à découpes multiples générant des partitions non volumineuses et non isolées. Par conséquence, nous pouvons surmonter le problème de l'hétérogénéité conceptuelle. Le nouvel algorithme de partitionnement est basé sur le clustering hiérarchique par agglomération (CHA). Cette approche génère un ensemble de tâches de correspondance locale avec un taux de couverture suffisant avec aucune partition isolée. Chaque tâche d'alignement local est automatiquement alignée en se basant sur les techniques d'apprentissage automatique. Un classificateur local aligne une seule tâche d'alignement local. Les classificateurs locaux sont basés sur des features élémentaires et structurelles. L'attribut class de chaque set de donne d'apprentissage " training set" est automatiquement étiqueté à l'aide d'une base de connaissances externe. Nous avons appliqué une technique de sélection de features pour chaque classificateur local afin de sélectionner les matchers appropriés pour chaque tâche d'alignement local. Cette approche réduit la complexité d'alignement et augmente la précision globale par rapport aux méthodes d'apprentissage traditionnelles. Nous avons prouvé que l'approche de partitionnement est meilleure que les approches actuelles en terme de précision, de taux de couverture et d'absence de partitions isolées. Nous avons évalué l'approche d'apprentissage d'alignement local à l'aide de diverses expériences basées sur des jeux de données d'OAEI 2018. Nous avons déduit qu'il est avantageux de diviser une grande tâche d'alignement d'ontologies en un ensemble de tâches d'alignement locaux. L'espace de recherche est réduit, ce qui réduit le nombre de faux négatifs et de faux positifs. L'application de techniques de sélection de caractéristiques à chaque classificateur local augmente la valeur de rappel pour chaque tâche d'alignement local.Although a considerable body of research work has addressed the problem of ontology matching, few studies have tackled the large ontologies used in the biomedical domain. We introduce a fully automated local matching learning approach that breaks down a large ontology matching task into a set of independent local sub-matching tasks. This approach integrates a novel partitioning algorithm as well as a set of matching learning techniques. The partitioning method is based on hierarchical clustering and does not generate isolated partitions. The matching learning approach employs different techniques: (i) local matching tasks are independently and automatically aligned using their local classifiers, which are based on local training sets built from element level and structure level features, (ii) resampling techniques are used to balance each local training set, and (iii) feature selection techniques are used to automatically select the appropriate tuning parameters for each local matching context. Our local matching learning approach generates a set of combined alignments from each local matching task, and experiments show that a multiple local classifier approach outperforms conventional, state-of-the-art approaches: these use a single classifier for the whole ontology matching task. In addition, focusing on context-aware local training sets based on local feature selection and resampling techniques significantly enhances the obtained results

    Towards ensuring Satisfiability of Merged Ontology

    Get PDF
    AbstractThe last decade has seen researchers developing efficient algorithms for the mapping and merging of ontologies to meet the demands of interoperability between heterogeneous and distributed information systems. But, still state-of-the-art ontology mapping and merging systems is semi-automatic that reduces the burden of manual creation and maintenance of mappings, and need human intervention for their validation. The contribution presented in this paper makes human intervention one step more down by automatically identifying semantic inconsistencies in the early stages of ontology merging. Our methodology detects inconsistencies based on structural mismatches that occur due to conflicts among the set of Generalized Concept Inclusions, and Disjoint Relations due to the differences between disjoint partitions in the local heterogeneous ontologies. We present novel methodologies to detect and repair semantic inconsistencies from the list of initial mappings. This results in global merged ontology free from ‘circulatory error in class/property hierarchy’, „common class/instance between disjoint classes error’, ‘redundancy of subclass/subproperty relations’, ‘redundancy of disjoint relations’ and other types of „semantic inconsistency’ errors. In this way, our methodology saves time and cost of traversing local ontologies for the validation of mappings, improves performance by producing only consistent accurate mappings, and reduces the user dependability for ensuring the satisfiability and consistency of merged ontology. The experiments show that the newer approach with automatic inconsistency detection yields a significantly higher precision

    Semantic Matchmaking as Non-Monotonic Reasoning: A Description Logic Approach

    Full text link
    Matchmaking arises when supply and demand meet in an electronic marketplace, or when agents search for a web service to perform some task, or even when recruiting agencies match curricula and job profiles. In such open environments, the objective of a matchmaking process is to discover best available offers to a given request. We address the problem of matchmaking from a knowledge representation perspective, with a formalization based on Description Logics. We devise Concept Abduction and Concept Contraction as non-monotonic inferences in Description Logics suitable for modeling matchmaking in a logical framework, and prove some related complexity results. We also present reasonable algorithms for semantic matchmaking based on the devised inferences, and prove that they obey to some commonsense properties. Finally, we report on the implementation of the proposed matchmaking framework, which has been used both as a mediator in e-marketplaces and for semantic web services discovery
    • …
    corecore