3 research outputs found

    Machine question answering with attention-based convolutional neural networks

    The task of answering textual questions with deep learning techniques is currently an interesting challenge. Although promising results have been achieved in previous work, these approaches leave much room for further consideration and improvement. This thesis addresses the question of how a system can be realized that is able to capture and process textual content and to draw the right conclusions for answering multiple-choice questions using modern methods. To this end, current techniques such as convolutional neural networks and attention mechanisms are used and tested on the benchmark datasets MovieQA, WikiQA, and InsuranceQA, three corpora with question-answer entries from the domains of movies, Wikipedia, and insurance, respectively, each with a slightly different task. The implementation uses the TensorFlow framework; textual content is represented with pre-trained word vectors from the GloVe tool. In addition to improving the system, this work also aims to analyze and evaluate its learning behavior. This is done with the aid of so-called adversarial examples: by modifying textual context information, it is checked whether the neural network concentrates on the correct content when answering a question, and at which degree of manipulation the neural network can no longer perform successfully. At the same time, the limitations of such text comprehension systems are shown; they are often able to compare text sequences, but do not develop a deeper understanding of the meaning and content of the inputs. The text comprehension system created in this work achieves a new state of the art for MovieQA with an accuracy of 82.73% correctly answered questions.
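The abstract above describes scoring candidate answers with a convolutional encoder plus an attention mechanism over GloVe embeddings. The following is a minimal NumPy sketch of that general idea, not the thesis's actual architecture: random vectors stand in for pre-trained GloVe embeddings, the filter count and window width are illustrative, and the answer encoding is pooled with question-aware attention weights before a cosine score ranks the candidates.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB = 50      # embedding size (pre-trained GloVe vectors come in 50/100/200/300 dims)
FILTERS = 64  # number of convolutional filters (illustrative choice)
WIDTH = 3     # convolution window over adjacent tokens

def embed(tokens):
    """Stand-in for a GloVe lookup: random vectors, one per token."""
    return rng.standard_normal((len(tokens), EMB))

W = rng.standard_normal((WIDTH * EMB, FILTERS)) * 0.1  # shared conv filters

def conv_encode(x):
    """1D convolution over token windows followed by tanh."""
    padded = np.vstack([np.zeros((WIDTH - 1, EMB)), x])
    windows = np.stack([padded[i:i + WIDTH].ravel() for i in range(len(x))])
    return np.tanh(windows @ W)            # (tokens, FILTERS)

def attentive_pool(features, context):
    """Softmax-weight each token's features by similarity to a context vector."""
    scores = features @ context
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return features.T @ weights            # (FILTERS,)

def score(question, answer):
    q_feat = conv_encode(embed(question))
    a_feat = conv_encode(embed(answer))
    q_vec = q_feat.mean(axis=0)            # simple mean pooling for the question
    a_vec = attentive_pool(a_feat, q_vec)  # question-aware pooling for the answer
    return q_vec @ a_vec / (np.linalg.norm(q_vec) * np.linalg.norm(a_vec))

# Multiple-choice setup: the candidate with the highest cosine score is chosen.
question = ["what", "is", "the", "capital", "of", "france"]
candidates = [["paris", "is", "the", "capital"], ["a", "type", "of", "cheese"]]
scores = [score(question, c) for c in candidates]
best = int(np.argmax(scores))
```

With real GloVe vectors and trained filters, the cosine score would favor the semantically matching candidate; with the random stand-ins here, only the mechanics of the pipeline are demonstrated.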

    English machine reading comprehension: new approaches to answering multiple-choice questions

    Reading comprehension is often tested by measuring a person or system’s ability to answer questions about a given text. Machine reading comprehension datasets have proliferated in recent years, particularly for the English language. The aim of this thesis is to investigate and improve data-driven approaches to automatic reading comprehension. Firstly, I provide a full classification of question and answer types for the reading comprehension task. I also present a systematic overview of English reading comprehension datasets (over 50 datasets). I observe that the majority of questions were created using crowdsourcing and that the most popular data source is Wikipedia. There is also a lack of why, when, and where questions. Additionally, I address the question “What makes a dataset difficult?” and highlight the difference between datasets created for people and datasets created for machine reading comprehension. Secondly, focusing on multiple-choice question answering, I propose a computationally light method for answer selection based on string similarities and logistic regression. At the time (December 2017), the proposed approach showed the best performance on two datasets (MovieQA and MCQA: IJCNLP 2017 Shared Task 5 Multi-choice Question Answering in Examinations), outperforming some CNN-based methods. Thirdly, I investigate methods for Boolean Reading Comprehension tasks, including the use of Knowledge Graph (KG) information for answering questions. I provide an error analysis of a transformer model’s performance on the BoolQ dataset. This reveals several important issues, such as unstable model behaviour and problems with the dataset itself. Experiments with incorporating knowledge graph information into a baseline transformer model do not show a clear improvement, due to a combination of the model’s limited ability to capture new information, inaccuracies in the knowledge graph, and imprecision in entity linking.
Finally, I develop a Boolean Reading Comprehension dataset based on spontaneous, user-generated questions and reviews, which is extremely close to a real-life question-answering scenario. I provide a classification of question difficulty and establish a transformer-based baseline for the newly proposed dataset.
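The second abstract describes a computationally light answer-selection method combining string similarities with logistic regression. The sketch below illustrates that general recipe under stated assumptions: the feature set (a `difflib` sequence ratio plus token overlap) and the training loop are illustrative stand-ins, not the thesis's actual features or implementation.

```python
import numpy as np
from difflib import SequenceMatcher

def features(question, answer):
    """Hypothetical string-similarity features; the thesis's exact set may differ."""
    ratio = SequenceMatcher(None, question, answer).ratio()
    q_tokens, a_tokens = set(question.split()), set(answer.split())
    overlap = len(q_tokens & a_tokens) / max(len(a_tokens), 1)
    return np.array([1.0, ratio, overlap])   # bias term + two similarity scores

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(pairs, labels, lr=0.5, epochs=200):
    """Plain batch gradient descent on the logistic loss."""
    X = np.stack([features(q, a) for q, a in pairs])
    y = np.asarray(labels, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

# Toy training data: one correct and one incorrect question-answer pair.
pairs = [
    ("who wrote hamlet", "hamlet was written by shakespeare"),
    ("who wrote hamlet", "the eiffel tower is in paris"),
]
labels = [1, 0]
w = train(pairs, labels)
probs = [sigmoid(features(q, a) @ w) for q, a in pairs]
```

At prediction time, the candidate answer with the highest probability would be selected; the appeal of such a method is that it needs no GPU and very little training data compared with CNN-based rankers.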