3 research outputs found

    Slot Filling

    Slot filling (SF) is the task of automatically extracting facts about particular entities from unstructured text and populating a knowledge base (KB) with these facts. These structured KBs enable applications such as structured web queries and question answering. SF is typically framed as a query-oriented setting of the related task of relation extraction. Throughout this thesis, we reflect on how SF is a task with many distinct problems. We demonstrate that recall is a major limiting factor on SF system performance. We contribute an analysis of typical SF recall loss, and find that a substantial amount of loss occurs early in the SF pipeline. We confirm that accurate NER and coreference resolution are required for high-recall SF. We measure upper bounds using a naïve graph-based semi-supervised bootstrapping technique, and find that only 39% of results are reachable using a typical feature space. We expect that this graph-based technique will be directly useful for extraction, and this leads us to frame SF as a label propagation task. We focus on a detailed graph representation of the task which reflects the behaviour and assumptions we want to model based on our analysis, including modifying the label propagation process to model multiple types of label interaction. Analysing the graph, we find that a large number of errors occur in very close proximity to training data, and identify that this is a major concern for propagation. While some conflicts are caused by a lack of sufficient disambiguating context (we explore adding additional contextual features to address this), many are caused by subtle annotation problems. We find that the lack of a standard for how explicit expressions of relations must be in text makes consistent annotation difficult: using a strict definition of explicitness results in 20% of correct annotations being removed from a standard dataset. We contribute several annotation-driven analyses of this problem, exploring the definition of slots and the effect of the missing definition of explicitness: annotation schemas do not specify how explicit expressions of relations need to be, leaving large scope for disagreement between annotators. Additionally, applications may require relatively strict or relaxed evidence for extractions, but this is not considered in annotation tasks. We demonstrate that annotators frequently disagree on instances, depending on differences in annotators' world knowledge and their thresholds for making probabilistic inferences. SF is fundamental to enabling many knowledge-based applications, and this work motivates modelling and evaluating SF to better target these tasks.
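
The abstract frames SF as label propagation over a graph of candidate fillers. As a rough, purely illustrative sketch of that idea (the node construction, the feature-similarity weights, and the clamping scheme here are assumptions, not the thesis's exact formulation):

```python
import numpy as np

def propagate_labels(W, Y_seed, n_iter=50, alpha=0.9):
    """Iterative label propagation on a weighted graph.

    W      -- (n, n) symmetric edge weights (feature similarity between nodes)
    Y_seed -- (n, k) one-hot labels for seed nodes, all zeros for unlabeled ones
    alpha  -- how strongly neighbours influence a node vs. its own seed label
    """
    # Row-normalise the weights so each node averages over its neighbours.
    deg = W.sum(axis=1, keepdims=True)
    P = W / np.maximum(deg, 1e-12)

    Y = Y_seed.copy()
    for _ in range(n_iter):
        # Blend propagated neighbour labels with the clamped seed labels.
        Y = alpha * (P @ Y) + (1 - alpha) * Y_seed
    return Y.argmax(axis=1)

# Toy usage: 4 candidate fillers, 2 slot labels, one seed node per label.
W = np.array([[0.0, 1.0, 0.2, 0.0],
              [1.0, 0.0, 0.1, 0.0],
              [0.2, 0.1, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
Y_seed = np.array([[1.0, 0.0],   # node 0 is a seed for label 0
                   [0.0, 0.0],
                   [0.0, 0.0],
                   [0.0, 1.0]])  # node 3 is a seed for label 1
print(propagate_labels(W, Y_seed))  # e.g. [0 0 1 1]
```

Seed labels are re-clamped every iteration, so annotated nodes keep pulling their neighbourhood toward their label while unlabeled nodes take on the consensus of their neighbours.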

    Deep learning methods for knowledge base population

    Knowledge bases store structured information about entities or concepts of the world and can be used in various applications, such as information retrieval or question answering. A major drawback of existing knowledge bases is their incompleteness. In this thesis, we explore deep learning methods for automatically populating them from text, addressing the following tasks: slot filling, uncertainty detection and type-aware relation extraction. Slot filling aims at extracting information about entities from a large text corpus. The Text Analysis Conference provides new evaluation data for this task each year in the context of an international shared task. We develop a modular system to address this challenge; it was one of the top-ranked systems in the 2015 shared task evaluation. For its slot filler classification module, we propose contextCNN, a convolutional neural network based on context splitting. It improves the performance of the slot filling system by 5.0% micro F1 and 2.9% macro F1. To train our binary and multiclass classification models, we create a dataset using distant supervision and reduce the number of noisy labels with a self-training strategy. For model optimization and evaluation, we automatically extract a labeled benchmark for slot filler classification from the manual shared task assessments from 2012 to 2014. We show that results on this benchmark correlate with slot filling pipeline results, with a Pearson correlation coefficient of 0.89 on 2013 data and 0.82 on 2014 data. The combination of patterns, support vector machines and contextCNN achieves the best results on the benchmark, with a micro F1 of 51% and a macro F1 of 53% on the test set. Finally, we analyze the results of the slot filling pipeline and the impact of its components. For knowledge base population, it is essential to assess the factuality of the statements extracted from text: from the sentence "Obama was rumored to be born in Kenya", a system should not conclude that Kenya is Obama's place of birth. Therefore, we address uncertainty detection in the second part of this thesis. We investigate attention-based models and make a first attempt to systematize the attention design space. Moreover, we propose novel attention variants: external attention, which incorporates an external knowledge source; k-max average attention, which considers only the vectors with the k largest attention weights; and sequence-preserving attention, which retains order information. Our convolutional neural network with external k-max average attention sets a new state of the art on a Wikipedia benchmark dataset with an F1 score of 68%. To the best of our knowledge, we are the first to integrate an uncertainty detection component into a slot filling pipeline; it improves precision by 1.4% and micro F1 by 0.4%. In the last part of the thesis, we investigate type-aware relation extraction with neural networks. We compare different models for joint entity and relation classification: pipeline models, jointly trained models and globally normalized models based on structured prediction. First, we show that using entity class prediction scores instead of binary decisions helps relation classification. Second, joint training clearly outperforms pipeline models on a large-scale distantly supervised dataset with fine-grained entity classes, improving the area under the precision-recall curve from 0.53 to 0.66. Third, we propose a model with a structured prediction output layer that globally normalizes the score of a triple consisting of the classes of two entities and the relation between them; it improves relation extraction results by 4.4% F1 on a manually labeled benchmark dataset. Our analysis shows that the model learns correct correlations between entity and relation classes. Finally, we are the first to use neural networks for joint entity and relation classification in a slot filling pipeline: the jointly trained model achieves the best micro F1 (22%), while the neural structured prediction model performs best in terms of macro F1 (25%).
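
To make the k-max average attention variant concrete, here is a minimal numpy sketch under stated assumptions: attention scores come from a simple dot product with a query vector, are softmax-normalised, and only the k positions with the largest weights contribute to the pooled representation. The scoring function and weighting details are illustrative, not the thesis's exact definition.

```python
import numpy as np

def k_max_average_attention(H, query, k=3):
    """H: (seq_len, dim) hidden vectors; query: (dim,) attention query vector."""
    scores = H @ query                       # unnormalised attention scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax attention weights

    top_k = np.argsort(weights)[-k:]         # positions of the k largest weights
    # Pool only the k selected vectors, weighted by their (renormalised) attention.
    selected = H[top_k] * weights[top_k, None]
    return selected.sum(axis=0) / weights[top_k].sum()

# Toy usage: a sequence of 6 hidden states of dimension 4.
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))
query = rng.normal(size=4)
print(k_max_average_attention(H, query, k=2))  # a single pooled 4-d vector
```

Restricting the average to the top-k positions keeps the pooling focused on a few cue words (e.g. hedges such as "rumored") rather than diluting them over the whole sequence.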


    Recherche d'information et fouille de textes

    Introduction. Understanding a text is a goal that Artificial Intelligence (AI) has pursued since its beginnings, and the first work offering answers appeared in the 1970s. The topic has remained current ever since, although the goals and methods it covers have evolved considerably. It is therefore necessary to look more closely at what lies behind the general label of "text comprehension". The earliest work, carried out from the mid-1970s to the mid-1980s [Charniak 1972; Dyer 1983; Schank et al. 1977], studied texts relating short stories, and understanding meant bringing out the ins and outs of the story (the topics addressed, the events described, the causal relations linking them) as well as the role of each character, their motivations and their intentions. Comprehension was seen as an inference process aiming to make explicit everything left implicit in a text, by recovering it from the semantic and pragmatic knowledge available to the machine. This presupposed that such knowledge had been modelled beforehand. This connects with work on the various knowledge representation formalisms in AI, describing on the one hand the meanings associated with the words of the language (semantic networks vs. logic, and notably conceptual graphs [Sowa 1984]) and on the other hand pragmatic knowledge [Schank 1982]. All of this work showed its limits as soon as such knowledge had to be modelled manually for every domain or learned automatically, so the problem of automatic open-domain comprehension remained unsolved. Since the problem as posed is intractable given the current state of knowledge, an alternative approach is to redefine it and decompose it into subtasks that are potentially easier to solve. Text comprehension can thus be redefined according to different views of the text, each answering specific needs. Just as a reader does not read a text in the same way depending on whether they want to assess its relevance to a topic of interest (a document retrieval task), classify documents, learn about the events reported, or look for a specific piece of information, so automatic processes will be varied and will focus on different aspects of the text according to the target task. Depending on the type of knowledge sought in a document, the reader extracts from the text only the information of interest, relying on the cues and knowledge that allow them to carry out their reading task, and hence their comprehension, without having to assimilate everything. One can then speak of comprehension at variable depths, which gives access to different levels of meaning. This approach is well illustrated by work on information extraction, evaluated in the MUC conferences [Grishman and Sundheim 1996], which ran from the late 1980s until 1998. Information extraction then consisted of modelling an information need as a template, described by a set of typed attributes, and trying to fill those attributes from the information contained in the texts. This is notably how research on "named entities" (i.e., the identification of names of persons, organisations, places, dates, etc.) and on the relations between these entities developed. It is also in this perspective that document-level approaches developed, whether for information retrieval or for determining document structure.
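
As a purely hypothetical illustration of that template-filling view of information extraction (the scenario, slot names and types below are invented for illustration, not taken from an actual MUC specification):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SuccessionTemplate:
    """A hypothetical MUC-style template: an information need as typed slots."""
    organization: Optional[str] = None   # ORGANIZATION entity
    person_in: Optional[str] = None      # PERSON taking up the post
    person_out: Optional[str] = None     # PERSON leaving the post
    post: Optional[str] = None           # job title
    date: Optional[str] = None           # DATE of the event
    evidence: List[str] = field(default_factory=list)  # supporting sentences

# An extraction system would fill the slots from text such as:
# "On Monday, Acme Corp. named Jane Doe chief executive, replacing John Smith."
t = SuccessionTemplate(organization="Acme Corp.", person_in="Jane Doe",
                       person_out="John Smith", post="chief executive",
                       date="Monday")
print(t)
```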