Search CORE

6 research outputs found

The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge

Author: Auer Sören
Barone Dante A.C.
Bartz Cassiano
Cortes Eduardo G.
Jaradeh Mohamad Yaser
Karras Oliver
Koubarakis Manolis
Mouromtsev Dmitry
Pliukhin Dmitrii
Radyush Daniil
Shilin Ivan
Stocker Markus
Tsalapati Eleni
Publication venue: London : Nature Publishing Group
Publication date: 01/01/2023

Knowledge graphs have gained increasing popularity in science and technology over the last decade. However, knowledge graphs are currently semantic structures of relatively simple to moderate complexity, consisting mainly of collections of factual statements. Question answering (QA) benchmarks and systems have so far been geared mainly towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2465 questions that can also be answered with the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.
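
The template-based generation step mentioned in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the SciQA authors' generator: the template wording, the `ex:` predicates, and the names `QuestionTemplate` and `generate` are all hypothetical, and the real ORKG vocabulary is not reproduced here.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class QuestionTemplate:
    """A natural-language pattern paired with a SPARQL pattern (both hypothetical)."""
    question: str
    sparql: str


# Hypothetical template; the ex: predicates are placeholders, not ORKG's vocabulary.
TEMPLATE = QuestionTemplate(
    question="Which papers evaluate {metric} on the {dataset} dataset?",
    sparql=(
        "SELECT ?paper WHERE {{ "
        "?paper ex:evaluatesOn ?d . ?d rdfs:label \"{dataset}\" . "
        "?paper ex:reportsMetric ?m . ?m rdfs:label \"{metric}\" . }}"
    ),
)


def generate(template, slots):
    """Fill every combination of slot values into the question and its query."""
    names = sorted(slots)
    pairs = []
    for values in product(*(slots[name] for name in names)):
        binding = dict(zip(names, values))
        pairs.append((template.question.format(**binding),
                      template.sparql.format(**binding)))
    return pairs


# Toy slot values, chosen only for illustration.
pairs = generate(TEMPLATE, {"metric": ["F1", "accuracy"], "dataset": ["SQuAD"]})
for question, query in pairs:
    print(question)
    print(query)
```

Each template yields one question and query pair per combination of slot values, which is how a handful of templates can expand into a large set of benchmark questions.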

Repositorium für Naturwissenschaften und Technik

The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge

Author: Auer Sören
Barone Dante A. C.
Bartz Cassiano
Cortes Eduardo G.
Jaradeh Mohamad Yaser
Karras Oliver
Koubarakis Manolis
Mouromtsev Dmitry
Pliukhin Dmitrii
Radyush Daniil
Shilin Ivan
Stocker Markus
Tsalapati Eleni
Publication venue: [London] : Macmillan Publishers Limited, part of Springer Nature
Publication date: 01/01/2023

Institutionelles Repositorium der Leibniz Universität Hannover

A dialogue based mobile virtual assistant for tourists: The SpaceBook Project

Author: Alce
Anna Dickinson
Bartie
Benford
Brewster
Bridwell
Cahill
Cai
Carroll
Carswell
Chincholle
Chung
Davies
Duckham
Dünser
Egenhofer
Espinoza
Fantino
Gartner
Gittings
Golledge
Gu
Heuten
Janarthanam
Kelley
Lemon
Li
Liarokapis
Long
Loomis
Mattos
May
McTear
Meek
Meng Yu
Mikhailian
Miller
Montello
Mountain
Narzt
Oliver Lemon
Phil Bartie
Raper
Richter
Robin L. Hill
Saksamudre
Sorrows
Srini Janarthanam
Strassman
Tiphaine Dalmas
Vlachos
William Mackaness
Woodsend
Xingkun Liu
Young
Publication venue: Elsevier BV
Publication date: 01/01/2018

Ubiquitous mobile computing offers innovative approaches to delivering information that can facilitate free roaming of the city, informing and guiding the tourist as the city unfolds before them. However, making frequent visual reference to a mobile device can be distracting: the user has to interact via a small screen, which disrupts the explorative experience. This research reports on an EU-funded project, SpaceBook, that explored the utility of a hands-free, eyes-free virtual tour guide that could answer questions through a spoken dialogue user interface and notify the user of interesting features in view while guiding the tourist to various destinations. Visibility modelling was carried out in real time, based on a LiDAR-sourced digital surface model fused with a variety of map and crowd-sourced datasets (e.g. Ordnance Survey, OpenStreetMap, Flickr, Foursquare), to establish the most interesting landmarks visible from the user's location at any given moment. A number of variations of the SpaceBook system were trialled in Edinburgh (Scotland). The research highlighted the pleasure derived from this novel form of interaction and revealed the complexity of prioritising route guidance instructions alongside the identification, description and embellishment of landmark information: there is a delicate balance between the level of information ‘pushed’ to the user and the user's requests for further information. Among the challenges were issues regarding the fidelity of the spatial data and positioning information required for pedestrian-based systems, since pedestrians have much greater freedom of movement than vehicles.
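
To make the idea of visibility modelling over a digital surface model concrete, here is a minimal line-of-sight sketch. It is not the SpaceBook implementation: the grid, the `is_visible` function, the 1.7 m eye height, and the landmark names are all assumptions chosen for illustration, and sampling one cell per step is a deliberate simplification.

```python
def is_visible(dsm, observer, target, eye_height=1.7):
    """Return True if no surface cell blocks the sight line from observer to target.

    dsm is a 2D list of surface heights in metres; observer and target are
    (row, col) cells. One sample per grid step is a simplification.
    """
    (r0, c0), (r1, c1) = observer, target
    z0 = dsm[r0][c0] + eye_height   # observer's eye level
    z1 = dsm[r1][c1]                # top of the surface at the target cell
    steps = max(abs(r1 - r0), abs(c1 - c0))
    for i in range(1, steps):
        t = i / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        sight_z = z0 + t * (z1 - z0)  # height of the sight line above this cell
        if dsm[r][c] > sight_z:
            return False
    return True


# Toy one-row surface model: a 10 m structure stands between the observer
# (column 0) and a 5 m structure (column 4).
dsm = [[0.0, 0.0, 10.0, 0.0, 5.0]]
landmarks = {"tower": (0, 2), "monument": (0, 4)}  # hypothetical landmarks
visible = [name for name, cell in landmarks.items()
           if is_visible(dsm, (0, 0), cell)]
print(visible)
```

In this toy scene only the nearer structure is reported, because it rises above the sight line to the one behind it. A real system would also have to cope with positioning error, which the abstract names as a challenge.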

Heriot Watt Pure

Crossref

Stirling Online Research Repository (RIOXX)

Edinburgh Research Explorer

Stirling Online Research Repository

Thinking outside the graph: scholarly knowledge graph construction leveraging natural language processing

Author: Jaradeh Mohamad Yaser
Publication venue: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
Publication date: 23/11/2022

Despite improved digital access to scholarly knowledge in recent decades, scholarly communication remains exclusively document-based. The document-oriented workflows in science publication have reached the limits of adequacy, as highlighted by recent discussions on the increasing proliferation of scientific literature, the deficiency of peer review, and the reproducibility crisis. In this form, scientific knowledge remains locked in representations that are inadequate for machine processing. As long as scholarly communication remains in this form, we cannot take advantage of the advancements taking place in machine learning and natural language processing techniques. Such techniques would facilitate the transformation from purely text-based representations into (semi-)structured semantic descriptions that are interlinked in a collection of big federated graphs. We are in dire need of a new age of semantically enabled infrastructure adept at storing, manipulating, and querying scholarly knowledge. Equally important is a suite of machine assistance tools designed to populate, curate, and explore the resulting scholarly knowledge graph. In this thesis, we address the issue of constructing a scholarly knowledge graph using natural language processing techniques. First, we tackle the issue of developing a scholarly knowledge graph for structured scholarly communication that can be populated and constructed automatically. We co-design and co-implement the Open Research Knowledge Graph (ORKG), an infrastructure capable of modeling, storing, and automatically curating scholarly communications. Then, we propose a method to automatically extract information into knowledge graphs. With Plumber, we create a framework to dynamically compose open information extraction pipelines based on the input text. Such pipelines are composed from community-created information extraction components in an effort to consolidate individual research contributions under one umbrella. We further present MORTY as a more targeted approach that leverages automatic text summarization to create, from a scholarly article's text, structured summaries containing all required information. In contrast to the pipeline approach, MORTY only extracts the information it is instructed to, making it a more valuable tool for various curation and contribution use cases. Moreover, we study the problem of knowledge graph completion. exBERT is able to perform knowledge graph completion tasks such as relation and entity prediction on scholarly knowledge graphs by means of textual triple classification. Lastly, we use the structured descriptions collected from manual and automated sources alike with a question answering approach that builds on the machine-actionable descriptions in the ORKG. We propose JarvisQA, a question answering interface operating on tabular views of scholarly knowledge graphs, i.e., ORKG comparisons. JarvisQA is able to answer a variety of natural language questions and retrieve complex answers on pre-selected sub-graphs. These contributions are key in the broader agenda of studying the feasibility of natural language processing methods on scholarly knowledge graphs, and they lay the foundation for determining which methods can be used in which cases. Our work identifies the challenges and issues in automatically constructing scholarly knowledge graphs and opens up future research directions.

Institutionelles Repositorium der Leibniz Universität Hannover

Cross-lingual question answering

Author: Sacaleanu Bogdan Eugen
Publication venue: Fakultät 4 - Philosophische Fakultät II. Fachrichtung 4.7 - Allgemeine Linguistik
Publication date: 01/01/2012

Question Answering has become an intensively researched area in the last decade, being seen as the next step beyond Information Retrieval in the attempt to provide more concise and better access to large volumes of available information. Question Answering builds on Information Retrieval technology for a first pass over possibly relevant data and uses further natural language processing techniques to search for candidate answers and to look for clues that accept or invalidate the candidates as right answers to the question. Though most of the research has been carried out in monolingual settings, where the question and the answer-bearing documents share the same natural language, current approaches concentrate on cross-language scenarios, where the question and the documents are in different languages. Three methods of crossing the language barrier are known in this context and shared with Information Retrieval research: translating the question, translating the documents, or aligning both the question and the documents to a common inter-lingual representation. We present a cross-lingual English-to-German Question Answering system, for both factoid and definition questions, using a German monolingual system and translating the questions from English to German. Two different translation techniques are evaluated: (1) direct translation of the English input question into German and (2) transfer-based translation, which uses an intermediate representation that captures the “meaning” of the original question and is translated into the target language. For both translation techniques, two types of translation tools are used: bilingual dictionaries and machine translation. The intermediate representation captures the semantic meaning of the question in terms of Question Type (QType), Expected Answer Type (EAType) and Focus, information that steers the workflow of the question answering process. The German monolingual Question Answering system can answer both factoid and definition questions and is based on several premises: (1) facts and definitions are usually expressed locally at the level of a sentence and its surroundings; (2) proximity of concepts within a sentence can be related to their semantic dependency; (3) for factoid questions, redundancy of candidate answers is a good indicator of their suitability; (4) definitions of concepts are expressed using fixed linguistic structures such as appositions, modifiers, and abbreviation extensions. Extensive evaluations of the monolingual system have shown that the above-mentioned hypotheses hold true in most cases when dealing with a fairly large collection of documents, like the one used in the CLEF evaluation forum.
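
The abstract above names an intermediate representation (QType, EAType, Focus) and a redundancy premise for factoid questions. The sketch below shows one way these two ideas could fit together. It is not the thesis's system: `QuestionAnalysis`, `rank_candidates`, the type labels, and the candidate data are hypothetical and purely illustrative.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionAnalysis:
    """Hypothetical container for the three fields named in the abstract."""
    qtype: str    # question type, e.g. "FACTOID" or "DEFINITION"
    eatype: str   # expected answer type, e.g. "PERSON"
    focus: str    # the concept the question is about


def rank_candidates(analysis, candidates):
    """Keep candidates of the expected answer type and rank them by redundancy.

    candidates is a list of (text, type) pairs, one per occurrence found in
    the retrieved sentences; answers that recur more often rank higher.
    """
    matching = [text.strip().lower()
                for text, kind in candidates
                if kind == analysis.eatype]
    return Counter(matching).most_common()


# Toy data, not taken from any evaluation.
analysis = QuestionAnalysis(qtype="FACTOID", eatype="PERSON", focus="telephone")
candidates = [
    ("Alexander Graham Bell", "PERSON"),
    ("Boston", "LOCATION"),
    ("alexander graham bell", "PERSON"),
    ("Elisha Gray", "PERSON"),
]
ranked = rank_candidates(analysis, candidates)
print(ranked)
```

Counting normalized surface strings is the simplest possible reading of "redundancy". The abstract's own caveat applies: the signal is only meaningful over a fairly large document collection.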