
18 research outputs found

TEQUILA: Temporal Question Answering over Knowledge Bases

Author: Abujabal A.
Jia Z.
Saha Roy R.
Strötgen J.
Weikum G.
Publication date: 01/01/2019

Question answering over knowledge bases (KB-QA) poses challenges in handling complex questions that need to be decomposed into sub-questions. An important case, addressed here, is that of temporal questions, where cues for temporal relations need to be discovered and handled. We present TEQUILA, an enabler method for temporal QA that can run on top of any KB-QA engine. TEQUILA has four stages. It detects if a question has temporal intent. It decomposes and rewrites the question into non-temporal sub-questions and temporal constraints. Answers to sub-questions are then retrieved from the underlying KB-QA engine. Finally, TEQUILA uses constraint reasoning on temporal intervals to compute final answers to the full question. Comparisons against state-of-the-art baselines show the viability of our method.
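The final stage described in this abstract, constraint reasoning on temporal intervals, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Interval` class, the relation names `BEFORE`, `AFTER` and `OVERLAP`, the function names, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed time interval in years (hypothetical representation)."""
    start: int
    end: int


def satisfies(candidate: Interval, relation: str, reference: Interval) -> bool:
    """Check one temporal constraint between a candidate and a reference interval."""
    if relation == "BEFORE":
        return candidate.end < reference.start
    if relation == "AFTER":
        return candidate.start > reference.end
    if relation == "OVERLAP":
        return candidate.start <= reference.end and reference.start <= candidate.end
    raise ValueError(f"unknown relation: {relation}")


def filter_answers(candidates: dict[str, Interval], relation: str,
                   reference: Interval) -> list[str]:
    """Keep the sub-question answers whose interval satisfies the constraint."""
    return [name for name, iv in candidates.items()
            if satisfies(iv, relation, reference)]


# Toy data, invented for illustration only: answers to a non-temporal
# sub-question, each annotated with a validity interval.
candidates = {
    "answer_a": Interval(1995, 1999),
    "answer_b": Interval(2001, 2004),
    "answer_c": Interval(2003, 2008),
}
# Interval obtained from the temporal part of the question (also invented).
reference = Interval(2000, 2003)

print(filter_answers(candidates, "BEFORE", reference))
print(filter_answers(candidates, "OVERLAP", reference))
```

In this sketch the underlying KB-QA engine is assumed to have already returned the candidates with their intervals; only the filtering step is shown.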

MPG.PuRe

GEI-Digital geht in die zweite Runde: fast 2.000 historische Geschichtsschulbücher online [GEI-Digital enters its second round: almost 2,000 historical history textbooks online]

Author: Chen E.
Klaes S.
Pramme U.
Schubert S.
Strötgen R.
Publication date: 01/01/2012

MPG.PuRe

Harmonisation of variables names prior to conducting statistical analyses with multiple datasets: an automated approach

Author: CG Victora
Countdown Coverage Writing Group on behalf of the Countdown to 2015 Core Group
N Ravishankar
R Strötgen
Xavier Bosch-Capblanch
ZA Bhutta
Publication venue: BioMed Central
Publication date: 01/01/2011

ABSTRACT: BACKGROUND: Data requirements by governments, donors and the international community to measure health and development achievements have increased in the last decade. Datasets produced in surveys conducted in several countries and years are often combined to analyse time trends and geographical patterns of demographic and health-related indicators. However, since not all datasets have the same structure, variable definitions and codes, they have to be harmonised prior to statistical analysis. Manually searching, renaming and recoding variables is extremely tedious and error-prone, especially when the number of datasets and variables is large. This article presents an automated approach to harmonise variable names across several datasets, which optimises the search of variables, minimises manual inputs and reduces the risk of error. RESULTS: Three consecutive algorithms are applied iteratively to search for each variable of interest for the analyses in all datasets. The first search (A) captures particular cases that could not be solved in an automated way in the search iterations; the second search (B) is run if search A produced no hits and identifies variables the labels of which contain certain key terms defined by the user. If this search produces no hits, a third one (C) is run to retrieve variables which have been identified in other surveys, as an illustration. For each variable of interest, the outputs of these engines can be (O1) a single best matching variable is found, (O2) more than one matching variable is found or (O3) no matching variables are found. Output O2 is solved by user judgement. Examples using four variables are presented showing that the searches have a 100% sensitivity and specificity after a second iteration. CONCLUSION: Efficient and tested automated algorithms should be used to support the harmonisation process needed to analyse multiple datasets. This is especially relevant when the numbers of datasets or variables to be included are large.
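The cascade of searches A, B and C and the three outcomes O1 to O3 described in this abstract might be sketched as follows. This is a rough illustration under stated assumptions, not the published algorithm: the function and parameter names are hypothetical, search B is assumed to require that all key terms appear in a label, and the variable names and labels are invented.

```python
def search_variable(labels, special_cases=(), key_terms=(), known_names=()):
    """Look for one variable of interest in a single dataset.

    labels        -- dict mapping variable name -> descriptive label
    special_cases -- names resolved by hand (search A)
    key_terms     -- user-defined terms that must appear in a label (search B)
    known_names   -- names already identified in other surveys (search C)
    Returns (outcome, hits) with outcome in {"O1", "O2", "O3"}.
    """
    # Search A: manually specified special cases present in this dataset.
    hits = [name for name in special_cases if name in labels]

    # Search B: only if A found nothing; match key terms against labels.
    if not hits and key_terms:
        terms = [t.lower() for t in key_terms]
        hits = [name for name, label in labels.items()
                if all(t in label.lower() for t in terms)]

    # Search C: only if B found nothing; reuse names seen in other surveys.
    if not hits:
        hits = [name for name in known_names if name in labels]

    if len(hits) == 1:
        return "O1", hits   # single best match
    if len(hits) > 1:
        return "O2", hits   # ambiguous, left to user judgement
    return "O3", []         # nothing found


# Invented dataset description for illustration.
labels = {
    "q01": "Age of respondent in years",
    "q02": "Place of residence",
    "h02": "Usual residence (urban/rural)",
}
print(search_variable(labels, key_terms=["age"]))
print(search_variable(labels, key_terms=["residence"]))
```

An O2 result would be passed to the user for a decision, and that decision could then be recorded as a special case for search A in the next iteration.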

Crossref

Springer - Publisher Connector

edoc

Directory of Open Access Journals

PubMed Central

Diachronic Variation of Temporal Expressions in Scientific Writing Through the Lens of Relative Entropy

Author: D Atkinson
D Biber
D Biber
I Dagan
J Gleick
J Pustejovsky
J Strötgen
JC Meister
JF Allen
N Kanhabua
P Mazur
R Campos
S Degaetano-Ortlieb
S Kullback
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

The abundance of temporal information in documents has led to an increased interest in processing such information in the NLP community by considering temporal expressions. Besides domain-adaptation, acquiring knowledge on variation of temporal expressions according to time is relevant for improvement in automatic processing. So far, frequency-based accounts dominate in the investigation of specific temporal expressions. We present an approach to investigate diachronic changes of temporal expressions based on relative entropy – with the advantage of using conditioned probabilities rather than mere frequency. While we focus on scientific writing, our approach is generalizable to other domains and interesting not only in the field of NLP, but also in humanities. This work is partially funded by Deutsche Forschungsgemeinschaft (DFG) under grant SFB 1102: Information Density and Linguistic Encoding (www.sfb1102.uni-saarland.de).
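Relative entropy (Kullback-Leibler divergence) between two time periods can be sketched as below. This is a simplified illustration, not the paper's model: it compares plain unigram distributions with additive smoothing, whereas the abstract mentions conditioned probabilities. The function name, the smoothing constant `alpha`, and the toy token lists are assumptions.

```python
import math
from collections import Counter


def relative_entropy(tokens_a, tokens_b, alpha=1.0):
    """D(A || B) in bits over the shared vocabulary of two token lists.

    Additive smoothing (alpha) keeps every probability non-zero, so the
    logarithm is always defined. The smoothing choice is an assumption.
    """
    vocab = set(tokens_a) | set(tokens_b)
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    total_a = len(tokens_a) + alpha * len(vocab)
    total_b = len(tokens_b) + alpha * len(vocab)

    divergence = 0.0
    for word in vocab:
        p = (counts_a[word] + alpha) / total_a
        q = (counts_b[word] + alpha) / total_b
        divergence += p * math.log2(p / q)
    return divergence


# Invented token samples standing in for temporal expressions
# extracted from an earlier and a later period.
early = ["yesterday", "last year", "last year", "in 1750"]
late = ["in 1990", "in 1990", "last year", "recently"]

print(relative_entropy(early, late))
print(relative_entropy(early, early))
```

Unlike a raw frequency comparison, the per-word terms of the sum show which expressions contribute most to the difference between the two periods.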

Crossref

Universaar

MPG.PuRe

Scientific publications of the Saarland University

All Dates Lead to Rome: Extracting and Explaining Temporal References in Street Names

Author: Andrade R.
Strötgen J.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2017

MPG.PuRe

Putting Dates on the Map: Harvesting and Analyzing Street Names with Date Mentions and their Explanations

Author: Andrade R.
Gupta D.
Strötgen J.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2018

MPG.PuRe

Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project

Author: Chang J.
Remus R.
Roberts M. E.
Strötgen R.
Templeton C.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/10/2015

We investigate new ways of applying LDA topic models: rather than optimizing a single model for a specific use case, we train multiple models based on different parameters and vocabularies which are combined on-the-fly to comply with varying information retrieval tasks. We also show a semi-automatic method which helps users to identify relevant topics across multiple models. Our methods are demonstrated and evaluated on a real-world use case: a large-scale corpus-based digital humanities project called Welt der Kinder (“Children and their World”). We illustrate our approach in that context and show that it can be generalized to other scenarios. We evaluate this work using empirical methods from information retrieval, but also show visualizations and use cases as actually applied in the project.

TUbiblio

Crossref

TempQuestions: A Benchmark for Temporal Question Answering

Author: Abujabal A.
Jia Z.
Saha Roy R.
Strötgen J.
Weikum G.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2018

MPG.PuRe

TEQUILA: Temporal Question Answering over Knowledge Bases

Author: Abujabal A.
Jia Z.
Saha Roy R.
Strötgen J.
Weikum G.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2018

Crossref

MPG.PuRe

Metric Spaces for Temporal Information Retrieval (in: Advances in Information Retrieval)

Author: G. Salton
J. Strötgen
M. Verhagen
N. Kanhabua
R. Sibson
R.N. Shepard
R.T. Snodgrass
T. Sakai
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Abstract. Documents and queries are rich in temporal features, both at the meta-level and at the content-level. We exploit this information to define temporal scope similarities between documents and queries in metric spaces. Our experiments show that the proposed metrics can be very effective for modeling the relevance for different search tasks, and provide insights into an inherent asymmetry in temporal query semantics. Moreover, we propose a simple ranking model that combines the temporal scope similarity with traditional keyword similarities. We experimentally show that it is not worse than traditional keyword-based rankings for non-temporal queries, and that it improves the overall effectiveness for time-based queries.
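One way to picture a ranking model that combines a temporal scope similarity with a keyword similarity is the sketch below. It does not reproduce the metrics proposed in the paper: the interval overlap measure (a Jaccard ratio over inclusive year ranges), the linear interpolation, the `weight` parameter, and all names are assumptions chosen for illustration.

```python
def interval_overlap_similarity(doc_scope, query_scope):
    """Jaccard-style overlap of two inclusive (start_year, end_year) ranges.

    An assumed stand-in for a temporal scope similarity; returns a value
    in [0, 1], with 0 for disjoint ranges and 1 for identical ranges.
    """
    doc_start, doc_end = doc_scope
    query_start, query_end = query_scope
    intersection = min(doc_end, query_end) - max(doc_start, query_start) + 1
    if intersection <= 0:
        return 0.0
    union = max(doc_end, query_end) - min(doc_start, query_start) + 1
    return intersection / union


def combined_score(keyword_similarity, temporal_similarity, weight=0.5):
    """Linear interpolation of keyword and temporal similarity.

    weight=0 falls back to a purely keyword-based ranking, which is one
    simple way to leave non-temporal queries unaffected.
    """
    return (1.0 - weight) * keyword_similarity + weight * temporal_similarity


# Invented example: a document covering 2000-2009, a query about 2005-2014.
temporal = interval_overlap_similarity((2000, 2009), (2005, 2014))
print(temporal)
print(combined_score(0.8, temporal, weight=0.25))
```

Note that this overlap measure is symmetric, whereas the abstract points to an inherent asymmetry in temporal query semantics, so a faithful model would need a different similarity function.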

CiteSeerX

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna