11 research outputs found

    Fine-Grained Entity Typing in Hyperbolic Space

    Full text link
    How can we represent hierarchical information present in large type inventories for entity typing? We study the ability of hyperbolic embeddings to capture hierarchical relations between mentions in context and their target types in a shared vector space. We evaluate on two datasets and investigate two different techniques for creating a large hierarchical entity type inventory: from an expert-generated ontology and by automatically mining type co-occurrences. We find that the hyperbolic model yields improvements over its Euclidean counterpart in some, but not all cases. Our analysis suggests that the adequacy of this geometry depends on the granularity of the type inventory and the way hierarchical relations are inferred.Comment: 12 pages, 4 figures, final version, accepted at the 4th Workshop on Representation Learning for NLP (RepL4NLP), held in conjunction with ACL 201

    Zero-Shot Learning with Common Sense Knowledge Graphs

    Full text link
    Zero-shot learning relies on semantic class representations such as hand-engineered attributes or learned embeddings to predict classes without any labeled examples. We propose to learn class representations from common sense knowledge graphs. Common sense knowledge graphs are an untapped source of explicit high-level knowledge that requires little human effort to apply to a range of tasks. To capture the knowledge in the graph, we introduce ZSL-KG, a general-purpose framework with a novel transformer graph convolutional network (TrGCN) for generating class representations. Our proposed TrGCN architecture computes non-linear combinations of the node neighbourhood and shows improvements on zero-shot learning tasks in language and vision. Our results show ZSL-KG outperforms the best performing graph-based zero-shot learning framework by an average of 2.1 accuracy points with improvements as high as 3.4 accuracy points. Our ablation study on ZSL-KG with alternate graph neural networks shows that our TrGCN adds up to 1.2 accuracy points improvement on these tasks

    Enhancing traditional Chinese medicine diagnostics: Integrating ontological knowledge for multi-label symptom entity classification

    Get PDF
    In traditional Chinese medicine (TCM), artificial intelligence (AI)-assisted syndrome differentiation and disease diagnoses primarily confront the challenges of accurate symptom identification and classification. This study introduces a multi-label entity extraction model grounded in TCM symptom ontology, specifically designed to address the limitations of existing entity recognition models characterized by limited label spaces and an insufficient integration of domain knowledge. This model synergizes a knowledge graph with the TCM symptom ontology framework to facilitate a standardized symptom classification system and enrich it with domain-specific knowledge. It innovatively merges the conventional bidirectional encoder representations from transformers (BERT) + bidirectional long short-term memory (Bi-LSTM) + conditional random fields (CRF) entity recognition methodology with a multi-label classification strategy, thereby adeptly navigating the intricate label interdependencies in the textual data. Introducing a multi-associative feature fusion module is a significant advancement, thereby enabling the extraction of pivotal entity features while discerning the interrelations among diverse categorical labels. The experimental outcomes affirm the model's superior performance in multi-label symptom extraction and substantially elevates the efficiency and accuracy. This advancement robustly underpins research in TCM syndrome differentiation and disease diagnoses

    Knowledge extraction from fictional texts

    Get PDF
    Knowledge extraction from text is a key task in natural language processing, which involves many sub-tasks, such as taxonomy induction, named entity recognition and typing, relation extraction, knowledge canonicalization and so on. By constructing structured knowledge from natural language text, knowledge extraction becomes a key asset for search engines, question answering and other downstream applications. However, current knowledge extraction methods mostly focus on prominent real-world entities with Wikipedia and mainstream news articles as sources. The constructed knowledge bases, therefore, lack information about long-tail domains, with fiction and fantasy as archetypes. Fiction and fantasy are core parts of our human culture, spanning from literature to movies, TV series, comics and video games. With thousands of fictional universes which have been created, knowledge from fictional domains are subject of search-engine queries - by fans as well as cultural analysts. Unlike the real-world domain, knowledge extraction on such specific domains like fiction and fantasy has to tackle several key challenges: - Training data: Sources for fictional domains mostly come from books and fan-built content, which is sparse and noisy, and contains difficult structures of texts, such as dialogues and quotes. Training data for key tasks such as taxonomy induction, named entity typing or relation extraction are also not available. - Domain characteristics and diversity: Fictional universes can be highly sophisticated, containing entities, social structures and sometimes languages that are completely different from the real world. State-of-the-art methods for knowledge extraction make assumptions on entity-class, subclass and entity-entity relations that are often invalid for fictional domains. With different genres of fictional domains, another requirement is to transfer models across domains. - Long fictional texts: While state-of-the-art models have limitations on the input sequence length, it is essential to develop methods that are able to deal with very long texts (e.g. entire books), to capture multiple contexts and leverage widely spread cues. This dissertation addresses the above challenges, by developing new methodologies that advance the state of the art on knowledge extraction in fictional domains. - The first contribution is a method, called TiFi, for constructing type systems (taxonomy induction) for fictional domains. By tapping noisy fan-built content from online communities such as Wikia, TiFi induces taxonomies through three main steps: category cleaning, edge cleaning and top-level construction. Exploiting a variety of features from the original input, TiFi is able to construct taxonomies for a diverse range of fictional domains with high precision. - The second contribution is a comprehensive approach, called ENTYFI, for named entity recognition and typing in long fictional texts. Built on 205 automatically induced high-quality type systems for popular fictional domains, ENTYFI exploits the overlap and reuse of these fictional domains on unseen texts. By combining different typing modules with a consolidation stage, ENTYFI is able to do fine-grained entity typing in long fictional texts with high precision and recall. - The third contribution is an end-to-end system, called KnowFi, for extracting relations between entities in very long texts such as entire books. KnowFi leverages background knowledge from 142 popular fictional domains to identify interesting relations and to collect distant training samples. KnowFi devises a similarity-based ranking technique to reduce false positives in training samples and to select potential text passages that contain seed pairs of entities. By training a hierarchical neural network for all relations, KnowFi is able to infer relations between entity pairs across long fictional texts, and achieves gains over the best prior methods for relation extraction.Wissensextraktion ist ein Schlüsselaufgabe bei der Verarbeitung natürlicher Sprache, und umfasst viele Unteraufgaben, wie Taxonomiekonstruktion, Entitätserkennung und Typisierung, Relationsextraktion, Wissenskanonikalisierung, etc. Durch den Aufbau von strukturiertem Wissen (z.B. Wissensdatenbanken) aus Texten wird die Wissensextraktion zu einem Schlüsselfaktor für Suchmaschinen, Question Answering und andere Anwendungen. Aktuelle Methoden zur Wissensextraktion konzentrieren sich jedoch hauptsächlich auf den Bereich der realen Welt, wobei Wikipedia und Mainstream- Nachrichtenartikel die Hauptquellen sind. Fiktion und Fantasy sind Kernbestandteile unserer menschlichen Kultur, die sich von Literatur bis zu Filmen, Fernsehserien, Comics und Videospielen erstreckt. Für Tausende von fiktiven Universen wird Wissen aus Suchmaschinen abgefragt – von Fans ebenso wie von Kulturwissenschaftler. Im Gegensatz zur realen Welt muss die Wissensextraktion in solchen spezifischen Domänen wie Belletristik und Fantasy mehrere zentrale Herausforderungen bewältigen: • Trainingsdaten. Quellen für fiktive Domänen stammen hauptsächlich aus Büchern und von Fans erstellten Inhalten, die spärlich und fehlerbehaftet sind und schwierige Textstrukturen wie Dialoge und Zitate enthalten. Trainingsdaten für Schlüsselaufgaben wie Taxonomie-Induktion, Named Entity Typing oder Relation Extraction sind ebenfalls nicht verfügbar. • Domain-Eigenschaften und Diversität. Fiktive Universen können sehr anspruchsvoll sein und Entitäten, soziale Strukturen und manchmal auch Sprachen enthalten, die sich von der realen Welt völlig unterscheiden. Moderne Methoden zur Wissensextraktion machen Annahmen über Entity-Class-, Entity-Subclass- und Entity- Entity-Relationen, die für fiktive Domänen oft ungültig sind. Bei verschiedenen Genres fiktiver Domänen müssen Modelle auch über fiktive Domänen hinweg transferierbar sein. • Lange fiktive Texte. Während moderne Modelle Einschränkungen hinsichtlich der Länge der Eingabesequenz haben, ist es wichtig, Methoden zu entwickeln, die in der Lage sind, mit sehr langen Texten (z.B. ganzen Büchern) umzugehen, und mehrere Kontexte und verteilte Hinweise zu erfassen. Diese Dissertation befasst sich mit den oben genannten Herausforderungen, und entwickelt Methoden, die den Stand der Kunst zur Wissensextraktion in fiktionalen Domänen voranbringen. • Der erste Beitrag ist eine Methode, genannt TiFi, zur Konstruktion von Typsystemen (Taxonomie induktion) für fiktive Domänen. Aus von Fans erstellten Inhalten in Online-Communities wie Wikia induziert TiFi Taxonomien in drei wesentlichen Schritten: Kategoriereinigung, Kantenreinigung und Top-Level- Konstruktion. TiFi nutzt eine Vielzahl von Informationen aus den ursprünglichen Quellen und ist in der Lage, Taxonomien für eine Vielzahl von fiktiven Domänen mit hoher Präzision zu erstellen. • Der zweite Beitrag ist ein umfassender Ansatz, genannt ENTYFI, zur Erkennung von Entitäten, und deren Typen, in langen fiktiven Texten. Aufbauend auf 205 automatisch induzierten hochwertigen Typsystemen für populäre fiktive Domänen nutzt ENTYFI die Überlappung und Wiederverwendung dieser fiktiven Domänen zur Bearbeitung neuer Texte. Durch die Zusammenstellung verschiedener Typisierungsmodule mit einer Konsolidierungsphase ist ENTYFI in der Lage, in langen fiktionalen Texten eine feinkörnige Entitätstypisierung mit hoher Präzision und Abdeckung durchzuführen. • Der dritte Beitrag ist ein End-to-End-System, genannt KnowFi, um Relationen zwischen Entitäten aus sehr langen Texten wie ganzen Büchern zu extrahieren. KnowFi nutzt Hintergrundwissen aus 142 beliebten fiktiven Domänen, um interessante Beziehungen zu identifizieren und Trainingsdaten zu sammeln. KnowFi umfasst eine ähnlichkeitsbasierte Ranking-Technik, um falsch positive Einträge in Trainingsdaten zu reduzieren und potenzielle Textpassagen auszuwählen, die Paare von Kandidats-Entitäten enthalten. Durch das Trainieren eines hierarchischen neuronalen Netzwerkes für alle Relationen ist KnowFi in der Lage, Relationen zwischen Entitätspaaren aus langen fiktiven Texten abzuleiten, und übertrifft die besten früheren Methoden zur Relationsextraktion

    Learning Neural Graph Representations in Non-Euclidean Geometries

    Get PDF
    The success of Deep Learning methods is heavily dependent on the choice of the data representation. For that reason, much of the actual effort goes into Representation Learning, which seeks to design preprocessing pipelines and data transformations that can support effective learning algorithms. The aim of Representation Learning is to facilitate the task of extracting useful information for classifiers and other predictor models. In this regard, graphs arise as a convenient data structure that serves as an intermediary representation in a wide range of problems. The predominant approach to work with graphs has been to embed them in an Euclidean space, due to the power and simplicity of this geometry. Nevertheless, data in many domains exhibit non-Euclidean features, making embeddings into Riemannian manifolds with a richer structure necessary. The choice of a metric space where to embed the data imposes a geometric inductive bias, with a direct impact on the performance of the models. This thesis is about learning neural graph representations in non-Euclidean geometries and showcasing their applicability in different downstream tasks. We introduce a toolkit formed by different graph metrics with the goal of characterizing the topology of the data. In that way, we can choose a suitable target embedding space aligned to the shape of the dataset. By virtue of the geometric inductive bias provided by the structure of the non-Euclidean manifolds, neural models can achieve higher performances with a reduced parameter footprint. As a first step, we study graphs with hierarchical structures. We develop different techniques to derive hierarchical graphs from large label inventories. Noticing the capacity of hyperbolic spaces to represent tree-like arrangements, we incorporate this information into an NLP model through hyperbolic graph embeddings and showcase the higher performance that they enable. Second, we tackle the question of how to learn hierarchical representations suited for different downstream tasks. We introduce a model that jointly learns task-specific graph embeddings from a label inventory and performs classification in hyperbolic space. The model achieves state-of-the-art results on very fine-grained labels, with a remarkable reduction of the parameter size. Next, we move to matrix manifolds to work on graphs with diverse structures and properties. We propose a general framework to implement the mathematical tools required to learn graph embeddings on symmetric spaces. These spaces are of particular interest given that they have a compound geometry that simultaneously contains Euclidean as well as hyperbolic subspaces, allowing them to automatically adapt to dissimilar features in the graph. We demonstrate a concrete implementation of the framework on Siegel spaces, showcasing their versatility on different tasks. Finally, we focus on multi-relational graphs. We devise the means to translate Euclidean and hyperbolic multi-relational graph embedding models into the space of symmetric positive definite (SPD) matrices. To do so we develop gyrocalculus in this geometry and integrate it with the aforementioned framework
    corecore