325 research outputs found

Reinforcement Learning-based Collective Entity Alignment with Adaptive Features

Author: Groth P.
Lin X.
Tang J.
Zeng W.
Zhao X.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/07/2021

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Deep Active Alignment of Knowledge Graph Entities and Schemata

Author: Chen Qijin
Hu Wei
Huang Jiacheng
Ren Weijun
Sun Zequn
Xu Xiaozhou
Publication date: 10/04/2023

Knowledge graphs (KGs) store rich facts about the real world. In this paper, we study KG alignment, which aims to find alignment between not only entities but also relations and classes in different KGs. Alignment at the entity level can cross-fertilize alignment at the schema level. We propose a new KG alignment approach, called DAAKG, based on deep learning and active learning. With deep learning, it learns the embeddings of entities, relations and classes, and jointly aligns them in a semi-supervised manner. With active learning, it estimates how likely an entity, relation or class pair can be inferred, and selects the best batch for human labeling. We design two approximation algorithms for efficient solution to batch selection. Our experiments on benchmark datasets show the superior accuracy and generalization of DAAKG and validate the effectiveness of all its modules. Comment: Accepted in the ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD 2023)

arXiv.org e-Print Archive

Doctor of Philosophy

Author: Nguyen Thanh Hoang
Publication venue: University of Utah
Publication date: 01/05/2013

The explosion of structured Web data (e.g., online databases, Wikipedia infoboxes) creates many opportunities for integrating and querying these data that go far beyond the simple search capabilities provided by search engines. Although much work has been devoted to data integration in the database community, the Web brings new challenges: the Web-scale (e.g., the large and growing volume of data) and the heterogeneity in Web data. Because there is so much data, scalable techniques that require little or no manual intervention and that are robust to noisy data are needed. In this dissertation, we propose a new and effective approach for matching Web-form interfaces and for matching multilingual Wikipedia infoboxes. As a further step toward these problems, we propose a general prudent schema-matching framework that matches a large number of schemas effectively. Our comprehensive experiments for Web-form interfaces and Wikipedia infoboxes show that it can enable on-the-fly, automatic integration of large collections of structured Web data. Another problem we address in this dissertation is schema discovery. While existing integration approaches assume that the relevant data sources and their schemas have been identified in advance, schemas are not always available for structured Web data. Approaches exist that exploit information in Wikipedia to discover the entity types and their associated schemas. However, due to inconsistencies, sparseness, and noise from the community contribution, these approaches are error prone and require substantial human intervention. Given the schema heterogeneity in Wikipedia infoboxes, we developed a new approach that uses the structured information available in infoboxes to cluster similar infoboxes and infer the schemata for entity types. Our approach is unsupervised and resilient to the unpredictable skew in the entity class distribution. Our experiments, using over one hundred thousand infoboxes extracted from Wikipedia, indicate that our approach is effective and produces accurate schemata for Wikipedia entities.

The University of Utah: J. Willard Marriott Digital Library

When in doubt ask the crowd : leveraging collective intelligence for improving event detection and machine learning

Author: Georgescu Mihai
Publication venue: Hannover : Gottfried Wilhelm Leibniz Universität Hannover
Publication date: 01/01/2015

[no abstract]

Institutionelles Repositorium der Leibniz Universität Hannover