Refining Automatically Extracted Knowledge Bases Using Crowdsourcing

Authors: Chunhua Li, Jian Wu, Pengpeng Zhao, Victor S. Sheng, Xuefeng Xian, Zhiming Cui
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2017

Machine-constructed knowledge bases often contain noisy and inaccurate facts. There is significant work on developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how to use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce the concept of semantic constraints, which can be used to detect potential errors and perform inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts for crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.

Crossref

Directory of Open Access Journals

Toward Real Event Detection

Authors: Färber Michael, Rettinger Achim
Publication venue: RWTH Aachen
Publication date: 01/01/2015

News agencies and other news providers or consumers are confronted with the task of extracting events from news articles. This is done either (i) to monitor, and hence be informed about, events of specific kinds over time, and/or (ii) to react to events immediately. In the past, several promising approaches to extracting events from text have been proposed. Besides purely statistical approaches, there are methods that represent events in a semantically structured form, such as graphs containing actions (predicates), participants (entities), etc. However, it turns out to be very difficult to automatically determine whether an event is real or not. In this paper, we give an overview of approaches that have proposed solutions to this research problem. We show that there is no gold standard dataset in which real events are annotated in text documents in a fine-grained, semantically enriched way. We present a methodology for creating such a dataset with the help of crowdsourcing and present preliminary results.

KITopen

Advanced Semantics for Commonsense Knowledge Extraction

Authors: Nguyen Tuan-Phong, Razniewski Simon, Weikum Gerhard
Publication date: 01/01/2021

Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have prioritized either precision or recall, but hardly reconcile these complementary goals. This paper presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with advanced expressiveness and both better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important for expressing the temporal and spatial validity of assertions and further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent. Comment: Web interface available at https://ascent.mpi-inf.mpg.d

arXiv.org e-Print Archive

MPG.PuRe

A Semantic Framework for the Analysis of Privacy Policies

Author: Reidenberg Joel R.
Publication venue: FLASH: The Fordham Law Archive of Scholarship and History
Publication date: 01/01/2018

bepress Legal Repository

Fordham University School of Law

Commonsense Properties from Query Logs and Question Answering Forums

Authors: Pal K., Pan J., Razniewski S., Romero J., Sakhadeo A., Weikum G.
Publication date: 01/01/2019

Commonsense knowledge about object properties, human behavior and general concepts is crucial for robust AI applications. However, automatic acquisition of this knowledge is challenging because of sparseness and bias in online sources. This paper presents Quasimodo, a methodology and tool suite for distilling commonsense properties from non-standard web sources. We devise novel ways of tapping into search-engine query logs and QA forums, and of combining the resulting candidate assertions with statistical cues from encyclopedias, books and image tags in a corroboration step. Unlike prior work on commonsense knowledge bases, Quasimodo focuses on salient properties that are typically associated with certain objects or concepts. Extensive evaluations, including extrinsic use-case studies, show that Quasimodo provides better coverage than state-of-the-art baselines with comparable quality.

MPG.PuRe

Advanced Semantics for Commonsense Knowledge Extraction

Authors: Nguyen T., Razniewski S., Weikum G.
Publication date: 01/01/2020

MPG.PuRe

Large-Scale Multilingual Knowledge Extraction, Publishing and Quality Assessment: The case of DBpedia

Author: Kontokostas Dimitrios
Publication date: 04/09/2018

Qucosa - Publikationsserver der Universität Leipzig