Extending RapidMiner with data search and integration capabilities

Author: Bizer Christian
Gentile Anna Lisa
Kirstein Sabrina
Paulheim Heiko
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

MAnnheim DOCument Server

Recommended from our members

SemTab 2019: Resources to Benchmark Tabular Data to Knowledge Graph Matching Systems

Author: E Jiménez-Ruiz
E Kacprzak
G Limaye
J Euzenat
Jiaoyan Chen
Mauricio A. Hernández
V Efthymiou
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Tabular data to Knowledge Graph matching is the process of assigning semantic tags from knowledge graphs (e.g., Wikidata or DBpedia) to the elements of a table. This task is a challenging problem for various reasons, including the lack of metadata (e.g., table and column names) and the noisiness, heterogeneity, incompleteness and ambiguity of the data. The results of this task provide significant insights about potentially highly valuable tabular data, as recent works have shown, enabling a new family of data analytics and data science applications. Despite a significant amount of work on various flavors of this problem, there is a lack of a common framework to conduct a systematic evaluation of state-of-the-art systems. The creation of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab) aims at filling this gap. In this paper, we report on the datasets, infrastructure and lessons learned from the first edition of the SemTab challenge.

City Research Online

Crossref

NORA - Norwegian Open Research Archives

Automatic Construction of Knowledge Graphs from Text and Structured Data: A Preliminary Literature Review

Author: Buitelaar Paul
Masoud Maraim
McCrae John
Pereira Bianca
Publication venue: OASIcs - OpenAccess Series in Informatics. 3rd Conference on Language, Data and Knowledge (LDK 2021)
Publication date: 01/01/2021

Knowledge graphs have been shown to be an important data structure for many applications, including chatbot development, data integration, and semantic search. In the enterprise domain, such graphs need to be constructed based on both structured (e.g. databases) and unstructured (e.g. textual) internal data sources, preferably using automatic approaches due to the costs associated with manual construction of knowledge graphs. However, despite the growing body of research that leverages both structured and textual data sources in the context of automatic knowledge graph construction, the research community has centered on either one type of source or the other. In this paper, we conduct a preliminary literature review to investigate approaches that can be used for the integration of textual and structured data sources in the process of automatic knowledge graph construction. We highlight the solutions currently available for use within enterprises and point out areas that would benefit from further research.

ZENODO

Dagstuhl Research Online Publication Server

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Extracting new knowledge from web tables: Novelty or confidence?

Author: Boncz P.A. (Peter)
Kruit B.B. (Benno)
Urbani J. (Jacopo)
Publication date: 09/02/2018

To extend the coverage of Knowledge Bases (KBs), it is useful to integrate factual information from public tabular data. Ideally, the extracted information should not only be correct, but also novel. So far, the evaluation of state-of-the-art techniques for this task has focused primarily on the correctness of the extractions, but the novelty is less well analysed. To fill this gap, we replicated the evaluation of two state-of-the-art techniques and analysed the amount of novel extractions using two new metrics. We observe that current techniques are biased towards confidence, but this comes at the expense of novelty. We sketch a possible solution for this problem as part of our ongoing research.

CWI's Institutional Repository

TAPON: a two-phase machine learning approach for semantic labelling

Author: Ayala Hernández Daniel
Hernández Salmerón Inmaculada Concepción
Ruiz Cortés David (Coordinator)
Toro Bonilla Miguel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

Through semantic labelling we enrich structured information from sources such as HTML pages, tables, or JSON files, with labels to integrate it into a local ontology. This process involves measuring some features of the information and then finding the classes that best describe it. The problem with current techniques is that they do not model relationships between classes. Their features fall short when some classes have very similar structures or textual formats. In order to deal with this problem, we have devised TAPON: a new semantic labelling technique that computes novel features that take into account the relationships. TAPON computes these features by means of a two-phase approach. In the first phase, we compute simple features and obtain a preliminary set of labels (hints). In the second phase, we inject our novel features and obtain a refined set of labels. Our experimental results show that our technique, thanks to our rich feature catalogue and novel modelling, achieves higher accuracy than other state-of-the-art techniques. Ministerio de Economía y Competitividad TIN2016-75394-

idUS. Depósito de Investigación Universidad de Sevilla

TAKCO: A platform for extracting novel facts from tables

Author: Boncz P.A. (Peter)
Kruit B.B. (Benno)
Urbani J. (Jacopo)
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/04/2021

Web tables contain a large amount of useful knowledge. Takco is a new large-scale platform designed for extracting facts from tables that can be added to Knowledge Graphs (KGs) like Wikidata. Focusing on achieving high precision, current techniques are biased towards extracting redundant facts, i.e., facts already in the KG. Takco aims to find more novel facts, still at high precision. Our demonstration has two goals. The first one is to illustrate the main features of Takco's novel interpretation algorithm. The second goal is to show, using our platform, to what extent other state-of-the-art systems are biased towards the extraction of redundant facts, thus raising awareness of this important problem.

VU Research Portal

CWI's Institutional Repository

Leveraging 2-hop Distant Supervision from Table Entity Pairs for Relation Extraction

Author: Deng Xiang
Sun Huan
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

Distant supervision (DS) has been widely used to automatically construct (noisy) labeled data for relation extraction (RE). Given two entities, distant supervision exploits sentences that directly mention them for predicting their semantic relation. We refer to this strategy as 1-hop DS, which unfortunately may not work well for long-tail entities with few supporting sentences. In this paper, we introduce a new strategy named 2-hop DS to enhance distantly supervised RE, based on the observation that there exist a large number of relational tables on the Web which contain entity pairs that share common relations. We refer to such entity pairs as anchors for each other, and collect all sentences that mention the anchor entity pairs of a given target entity pair to help relation prediction. We develop a new neural RE method, REDS2, in the multi-instance learning paradigm, which adopts a hierarchical model structure to fuse information from 1-hop DS and 2-hop DS. Extensive experimental results on a benchmark dataset show that REDS2 can consistently outperform various baselines across different settings by a substantial margin.

arXiv.org e-Print Archive

Crossref