9 research outputs found

Populating knowledge bases with temporal information

Author: Erdal Kuzey
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2016

Recent progress in information extraction has enabled the automatic construction of large knowledge bases. Knowledge bases contain millions of entities (e.g., persons, organizations, and events), their semantic classes, and facts about them. Knowledge bases have become a great asset for semantic search, entity linking, deep analytics, and question answering. However, a common limitation of current knowledge bases is their poor coverage of temporal knowledge. First, knowledge bases have so far focused on popular events and ignored long-tail events such as political scandals, local festivals, or protests. Second, they do not cover the textual phrases denoting events and temporal facts at all. The goal of this dissertation, thus, is to automatically populate knowledge bases with this kind of temporal knowledge. The dissertation makes the following contributions to address the aforementioned limitations. The first contribution is a method for extracting events from news articles. The method reconciles the extracted events into canonicalized representations and organizes them into fine-grained semantic classes. The second contribution is a method for mining the textual phrases denoting the events and facts. The method infers the temporal scopes of these phrases and maps them to a knowledge base. Our experimental evaluations demonstrate that our methods yield high-quality output compared to state-of-the-art approaches, and can indeed populate knowledge bases with temporal knowledge.
Progress in information extraction now makes it possible to construct knowledge bases automatically. Such knowledge bases contain entities such as persons, organizations, or events, along with information about them and their semantic classes. Automatically generated knowledge bases form an essential foundation for semantic search, entity linking, text analysis, and natural-language question answering systems.
A weakness of current knowledge bases, however, is their insufficient coverage of temporal information. Knowledge bases focus primarily on popular events and ignore lesser-known events such as political scandals, local events, or demonstrations. In addition, textual phrases denoting events and temporal facts are not captured. The goal of this work is to develop methods that automatically integrate temporal knowledge into knowledge bases. To this end, the dissertation makes the following contributions: 1. The development of a method for extracting events from news articles, representing them in a canonical form, and assigning them to fine-grained semantic classes. 2. The development of a method for mining textual phrases that denote events and facts in knowledge bases, together with a method for deriving their temporal course and duration. Our experiments show that the methods we developed yield output of markedly higher quality than previous approaches and can indeed extend knowledge bases with temporal knowledge.

Universaar

MPG.PuRe

Extraction of Temporal Facts and Events from Wikipedia

Author: Erdal Kuzey
Gerhard Weikum
Publication date: 01/01/2012

Recently, large-scale knowledge bases have been constructed by automatically extracting relational facts from text. Unfortunately, most current knowledge bases focus on static facts and ignore the temporal dimension. However, the vast majority of facts evolve over time or are valid only during a particular time period. Thus, time is a significant dimension that should be included in knowledge bases. In this paper, we introduce a complete information extraction framework that harvests temporal facts and events from the semi-structured data and free text of Wikipedia articles to create a temporal ontology. First, we extend a temporal data representation model by making it aware of events. Second, we develop an information extraction method that harvests temporal facts and events from Wikipedia infoboxes, categories, lists, and article titles in order to build a temporal knowledge base. Third, we show how the system can use its extracted knowledge to further grow the knowledge base. We demonstrate the effectiveness of our proposed methods through several experiments. We extracted more than one million temporal facts, with precision over 90% for extraction from semi-structured data and almost 70% for extraction from text.

CiteSeerX

Crossref

MPG.PuRe

EVIN: building a knowledge base of events

Author: Erdal Kuzey
Gerhard Weikum
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

We present EVIN: a system that extracts named events from news articles, reconciles them into canonicalized events, and organizes them into semantic classes to populate a knowledge base. EVIN exploits different kinds of similarity measures among news, referring to textual contents, entity occurrences, and temporal ordering. These similarities are captured in a multi-view attributed graph. To distill canonicalized events, EVIN coarsens the graph by iterative merging based on a judiciously designed loss function. To infer semantic classes of events, EVIN uses statistical language models. EVIN provides a GUI that allows users to query the constructed knowledge base of events, and to explore it in a visual manner.

CiteSeerX

CISPA – Helmholtz-Zentrum für Informationssicherheit

MPG.PuRe

Inside YAGO2s: A Transparent Information Extraction Architecture

Author: Erdal Kuzey
Fabian M. Suchanek
Joanna Biega
Publication date: 01/01/2013

YAGO [9, 6] is one of the largest public ontologies constructed by information extraction. In a recent refactoring called YAGO2s, the system has been given a modular and completely transparent architecture. In this demo, users can see how more than 30 individual modules of YAGO work in parallel to extract facts, to check facts for their correctness, to deduce facts, and to merge facts from different sources. A GUI allows users to play with different input files, to trace the provenance of individual facts to their sources, to change deduction rules, and to run individual extractors. Users can see step by step how the extractors work together to combine the individual facts into the coherent whole of the YAGO ontology.

CiteSeerX

MPG.PuRe

As Time Goes By: Comprehensive Tagging of Textual Phrases with Temporal Scopes

Author: Erdal Kuzey
Vinay Setty
Jannik Strötgen
Gerhard Weikum
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

CISPA – Helmholtz-Zentrum für Informationssicherheit

MPG.PuRe

Temponym Tagging: Temporal Scopes for Textual Phrases

Author: Erdal Kuzey
Vinay Setty
Jannik Strötgen
Gerhard Weikum
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

CISPA – Helmholtz-Zentrum für Informationssicherheit

MPG.PuRe

YAGO2s: Modular High-Quality Information Extraction with an Application to Flight Planning

Author: Edwin Lewis-Kelham
Erdal Kuzey
Fabian M. Suchanek
Johannes Hoffart
Publication date: 01/01/2013

Abstract: In this paper, we present YAGO2s, the new edition of the YAGO ontology [SKW07, HSBW12]. The software architecture has been refactored from scratch, yielding a design that modularizes both code and data. This modularization enables us to add new data sources more easily, while still maintaining the high accuracy and coherence of the ontology. Thus, we believe that YAGO2s occupies a sweet spot between a centralized design and a completely distributed design. In this demo, we present an application of this design to the task of planning a flight. Our proposed system finds flights from all airports close to the departure city to all airports close to the destination city.

1 Knowledge Base Construction

In recent years, many projects have successfully created large-scale knowledge bases (KBs) in an automated fashion. The KBs contain millions of entities (such as rivers, universities, people, and movies), and millions of facts about them (such as who acted in which movie, which river is located in which country, etc.). There are several strategies to build such KBs. One strategy is to accumulate and reconcile as much data as possible

CiteSeerX

MPG.PuRe

YAGO: A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames

Author: Joanna Biega
Johannes Hoffart
Erdal Kuzey
Thomas Rebele
Fabian M. Suchanek
Gerhard Weikum
Publication date: 01/01/2016

CISPA – Helmholtz-Zentrum für Informationssicherheit

HAL Descartes

MPG.PuRe