319 research outputs found
GECKA3D: A 3D Game Engine for Commonsense Knowledge Acquisition
Commonsense knowledge representation and reasoning is key for tasks such as
artificial intelligence and natural language understanding. Since commonsense
consists of information that humans take for granted, gathering it is an
extremely difficult task. In this paper, we introduce a novel 3D game engine
for commonsense knowledge acquisition (GECKA3D) which aims to collect
commonsense from game designers through the development of serious games.
GECKA3D integrates the potential of serious games and games with a purpose.
This provides a platform for the acquisition of re-usable and multi-purpose
knowledge, and also enables the development of games that can provide
entertainment value and teach players something meaningful about the actual
world they live in.
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
We present a new kind of question answering dataset, OpenBookQA, modeled
after open book exams for assessing human understanding of a subject. The open
book that comes with our questions is a set of 1329 elementary level science
facts. Roughly 6000 questions probe an understanding of these facts and their
application to novel situations. This requires combining an open book fact
(e.g., metals conduct electricity) with broad common knowledge (e.g., a suit of
armor is made of metal) obtained from other sources. While existing QA datasets
over documents or knowledge bases, being generally self-contained, focus on
linguistic understanding, OpenBookQA probes a deeper understanding of both the
topic---in the context of common knowledge---and the language it is expressed
in. Human performance on OpenBookQA is close to 92%, but many state-of-the-art
pre-trained QA methods perform surprisingly poorly, worse than several simple
neural baselines we develop. Our oracle experiments designed to circumvent the
knowledge retrieval bottleneck demonstrate the value of both the open book and
additional facts. We leave it as a challenge to solve the retrieval problem in
this multi-hop setting and to close the large gap to human performance.
Comment: Published as conference long paper at EMNLP 201
A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics, and Benchmark Datasets
Machine Reading Comprehension (MRC) is a challenging NLP research field with
wide real world applications. The great progress of this field in recent years
is mainly due to the emergence of large-scale datasets and deep learning. At
present, many MRC models have already surpassed human performance on
many datasets, despite the obvious gap between existing MRC models and
genuine human-level reading comprehension. This shows the need to improve
existing datasets, evaluation metrics, and models to move MRC models toward
'real' understanding. To address the lack of a comprehensive survey of existing
MRC tasks, evaluation metrics, and datasets, herein (1) we analyzed 57 MRC
tasks and datasets and proposed a more precise classification method for MRC
tasks with 4 different attributes; (2) we summarized 9 evaluation metrics of MRC
tasks and (3) 7 attributes and 10 characteristics of MRC datasets; and (4) we
discussed some open issues in MRC research and highlighted some future research
directions. In addition, to help the community, we have collected, organized,
and published our data on a companion website (https://mrc-datasets.github.io/)
where MRC researchers can directly access each MRC dataset, papers, and baseline
projects, and browse the leaderboard.
Comment: 59 page
Text-image synergy for multimodal retrieval and annotation
Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities, may be trivial for humans but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal content, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning.
• Image Retrieval. Whether images are found by text-based search queries depends heavily on whether the text near the image matches the text of the query. Images without textual context, or even with thematically fitting context but without direct matches between the available keywords and the query, often cannot be found. As a remedy, we propose combining three kinds of information: visual information (in the form of automatically generated image descriptions), textual information (keywords from previous search queries), and commonsense knowledge.
• Image Tag Refinement. Object detection by computer vision frequently produces false detections and incoherent results. Correct identification of image content is, however, an important prerequisite for finding images through textual queries. To reduce the error-proneness of object detection, we propose incorporating commonsense knowledge. Additional image annotations that common sense shows to be thematically fitting can prevent many erroneous and incoherent detections.
• Image-Text Alignment. On web pages with text and image content (such as news sites, blog posts, and social media articles), images are usually placed at semantically meaningful positions in the text flow. We use this observation to propose a framework in which relevant images are selected and associated with the matching passages of a text.
• Image Captioning. Images that are part of multimodal content and serve to improve the readability of a text typically have captions that fit the context of the surrounding text. Caption generation usually analyzes the images alone. We propose to also take this context into account and introduce context-aware caption generation.
Our promising results and observations open up interesting scopes for future research involving text-image data understanding.
Reasoning-Driven Question-Answering For Natural Language Understanding
Natural language understanding (NLU) of text is a fundamental challenge in AI, and it has received significant attention throughout the history of NLP research. This primary goal has been studied under different tasks, such as Question Answering (QA) and Textual Entailment (TE). In this thesis, we investigate the NLU problem through the QA task and focus on the aspects that make it a challenge for the current state-of-the-art technology. This thesis is organized into three main parts:
In the first part, we explore multiple formalisms to improve existing machine comprehension systems. We propose a formulation for abductive reasoning in natural language and show its effectiveness, especially in domains with limited training data. Additionally, to help reasoning systems cope with irrelevant or redundant information, we create a supervised approach to learn and detect the essential terms in questions.
In the second part, we propose two new challenge datasets. In particular, we create two datasets of natural language questions where (i) the first one requires reasoning over multiple sentences; (ii) the second one requires temporal common sense reasoning. We hope that the two proposed datasets will motivate the field to address more complex problems.
In the final part, we present the first formal framework for multi-step reasoning algorithms in the presence of a few important properties of language use, such as incompleteness and ambiguity. We apply this framework to prove fundamental limitations for reasoning algorithms. These theoretical results provide extra intuition into the existing empirical evidence in the field.
Online Handbook of Argumentation for AI: Volume 1
This volume contains revised versions of the papers selected for the first
volume of the Online Handbook of Argumentation for AI (OHAAI). Previously,
formal theories of argument and argument interaction have been proposed and
studied, and this has led to the more recent study of computational models of
argument. Argumentation, as a field within artificial intelligence (AI), is
highly relevant for researchers interested in symbolic representations of
knowledge and defeasible reasoning. The purpose of this handbook is to provide
an open access and curated anthology for the argumentation research community.
OHAAI is designed to serve as a research hub to keep track of the latest and
upcoming PhD-driven research on the theory and application of argumentation in
all areas related to AI.
Comment: editor: Federico Castagna and Francesca Mosca and Jack Mumford and
Stefan Sarkadi and Andreas Xydi