39 research outputs found

    Error propagation

    Information extraction of +/-effect events to support opinion inference

    Recently, work in NLP was initiated on a type of opinion inference that arises when opinions are expressed toward events which have positive or negative effects on entities, called +/-effect events. The ultimate goal is to develop a fully automatic system capable of recognizing inferred attitudes. To achieve its results, the inference system requires all instances of +/-effect events. Therefore, this dissertation focuses on +/-effect events to support opinion inference. To extract +/-effect events, we first need the list of +/-effect events. Due to significant sense ambiguity, our goal is to develop a sense-level rather than word-level lexicon. To handle sense-level information, WordNet is adopted. We adopt a graph-based method which is seeded by entries culled from FrameNet and then expanded by exploiting semantic relations in WordNet. We show that WordNet relations are useful for the polarity propagation in the graph model. In addition, to maximize the effectiveness of different types of information, we combine a graph-based method using WordNet relations and a standard classifier using gloss information. Further, we provide evidence that the model is an effective way to guide manual annotation to find +/-effect senses that are not in the seed set. To exploit the sense-level lexicons, we have to carry out word sense disambiguation. We present a knowledge-based +/-effect coarse-grained word sense disambiguation method based on selectional preferences via topic models. For more information, we first group senses, and then utilize topic models to model selectional preferences. Our experiments show that selectional preferences are helpful in our work. To support opinion inferences, we need to identify not only +/-effect events but also their affected entities automatically. Thus, we address both +/-effect event detection and affected entity identification. 
Since +/-effect events and their affected entities are closely related, we present a joint model, rather than a pipeline system, that extracts +/-effect events and their affected entities simultaneously. We demonstrate that this joint model is a promising approach to extracting both together.
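
    As a rough illustration of the graph-based polarity propagation described in the abstract above, the following sketch spreads +/-effect scores from a few seed senses to their neighbours by repeated averaging. It is a minimal sketch under stated assumptions, not the dissertation's model: the function name `propagate_polarity`, the sense identifiers, and the toy edges are all hypothetical, and plain neighbour averaging stands in for whatever propagation scheme the author actually uses.

```python
from collections import defaultdict


def propagate_polarity(edges, seeds, iterations=10):
    """Spread +/-effect scores from seed senses to their graph neighbours.

    edges: iterable of (sense_a, sense_b) pairs, treated as undirected links
           (a stand-in for semantic relations between senses).
    seeds: dict mapping a sense id to +1.0 (+effect) or -1.0 (-effect).
    Returns a dict of scores in [-1, 1]; seed scores stay clamped.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Every node starts at 0.0 (unknown) unless it is a seed.
    scores = {node: 0.0 for node in neighbours}
    scores.update(seeds)

    for _ in range(iterations):
        updated = {}
        for node in scores:
            linked = neighbours.get(node, set())
            if node in seeds:
                updated[node] = seeds[node]  # seeds never change
            elif linked:
                updated[node] = sum(scores[n] for n in linked) / len(linked)
            else:
                updated[node] = 0.0
        scores = updated
    return scores


# Hypothetical sense ids and relations, invented purely for illustration.
demo_edges = [
    ("help#1", "aid#2"),
    ("aid#2", "support#1"),
    ("harm#1", "injure#1"),
]
demo_seeds = {"help#1": 1.0, "harm#1": -1.0}
demo_scores = propagate_polarity(demo_edges, demo_seeds)

for sense, score in sorted(demo_scores.items()):
    label = "+effect" if score > 0 else "-effect" if score < 0 else "unknown"
    print(f"{sense}: {score:+.2f} ({label})")
```

    In this toy graph, senses linked to the +effect seed end up with positive scores and the sense linked to the -effect seed ends up negative. A real lexicon would need a decision threshold and a way to handle senses that carry no effect at all.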

    Commonsense knowledge acquisition and applications

    Computers are increasingly expected to make smart decisions based on what humans consider commonsense. This would require computers to understand their environment, including properties of objects in the environment (e.g., a wheel is round), relations between objects (e.g., two wheels are part of a bike, or a bike is slower than a car) and interactions of objects (e.g., a driver drives a car on the road). The goal of this dissertation is to investigate automated methods for acquisition of large-scale, semantically organized commonsense knowledge. This is difficult because commonsense knowledge is (i) implicit and sparse, since people do not explicitly state the obvious, (ii) multimodal, since it is spread across textual and visual content, (iii) affected by reporting bias, since unusual facts are reported disproportionately often, and (iv) context-dependent, and therefore of limited statistical confidence. Prior state-of-the-art methods to acquire commonsense are either not automated or based on shallow representations. Thus, they cannot produce large-scale, semantically organized commonsense knowledge. To achieve the goal, we divide the problem space into three research directions, constituting our core contributions: 1. Properties of objects: acquisition of properties like hasSize, hasShape, etc. We develop WebChild, a semi-supervised method to compile semantically organized properties. 2. Relationships between objects: acquisition of relations like largerThan, partOf, memberOf, etc. We develop CMPKB, a linear-programming based method to compile comparative relations, and PWKB, a method based on statistical and logical inference to compile part-whole relations. 3. Interactions between objects: acquisition of activities like drive a car, park a car, etc., with attributes such as temporal or spatial attributes. We develop Knowlywood, a method based on semantic parsing and probabilistic graphical models to compile activity knowledge. Together, these methods result in the construction of a large, clean and semantically organized Commonsense Knowledge Base that we call WebChild KB.

    Predicate Matrix: an interoperable lexical knowledge base for predicates

    183 p. The Predicate Matrix is a new lexical-semantic resource resulting from the integration of multiple knowledge sources, among them FrameNet, VerbNet, PropBank and WordNet. The Predicate Matrix provides an extensive and robust lexicon that improves interoperability among these semantic resources. Its creation is based on the integration of SemLink and new mappings obtained with automatic methods that link semantic knowledge at the lexical and role levels. We have also extended the Predicate Matrix to cover nominal predicates (English, Spanish) and predicates in other languages (Spanish, Catalan and Basque). As a result, the Predicate Matrix provides a multilingual lexicon that enables interoperable semantic analysis across multiple languages.

    A Survey on Semantic Processing Techniques

    Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions. Comment: Published in Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

    Harnessing sense-level information for semantically augmented knowledge extraction

    Nowadays, building accurate computational models for the semantics of language lies at the very core of Natural Language Processing and Artificial Intelligence. A first and foremost step in this respect consists in moving from word-based to sense-based approaches, in which operating explicitly at the level of word senses enables a model to produce more accurate and unambiguous results. At the same time, word senses create a bridge towards structured lexico-semantic resources, where the vast amount of available machine-readable information can help overcome the shortage of annotated data in many languages and domains of knowledge. This latter phenomenon, known as the knowledge acquisition bottleneck, is a crucial problem that hampers the development of large-scale, data-driven approaches for many Natural Language Processing tasks, especially when lexical semantics is directly involved. One of these tasks is Information Extraction, where an effective model has to cope with data sparsity, as well as with lexical ambiguity that can arise at the level of both arguments and relational phrases. Even in more recent Information Extraction approaches where semantics is implicitly modeled, these issues have not yet been addressed in their entirety. On the other hand, however, having access to explicit sense-level information is a very demanding task on its own, which can rarely be performed with high accuracy on a large scale. With this in mind, in this thesis we will tackle a two-fold objective: our first focus will be on studying fully automatic approaches to obtain high-quality sense-level information from textual corpora; then, we will investigate in depth where and how such sense-level information has the potential to enhance the extraction of knowledge from open text. 
In the first part of this work, we will explore three different disambiguation scenarios (semi-structured text, parallel text, and definitional text) and devise automatic disambiguation strategies that are not only capable of scaling to different corpus sizes and different languages, but that actually take advantage of a multilingual and/or heterogeneous setting to improve and refine their performance. As a result, we will obtain three sense-annotated resources that, when tested experimentally with a baseline system in a series of downstream semantic tasks (i.e. Word Sense Disambiguation, Entity Linking, Semantic Similarity), show very competitive performances on standard benchmarks against both manual and semi-automatic competitors. In the second part we will instead focus on Information Extraction, with an emphasis on Open Information Extraction (OIE), where issues like sparsity and lexical ambiguity are especially critical, and study how best to exploit sense-level information within the extraction process. We will start by showing that enforcing a deeper semantic analysis in a definitional setting enables a full-fledged extraction pipeline to compete with state-of-the-art approaches based on much larger (but noisier) data. We will then demonstrate how working at the sense level at the end of an extraction pipeline is also beneficial: indeed, by leveraging sense-based techniques, very heterogeneous OIE-derived data can be aligned semantically, and unified with respect to a common sense inventory. Finally, we will briefly shift the focus to the more constrained setting of hypernym discovery, and study a sense-aware supervised framework for the task that is robust and effective, even when trained on heterogeneous OIE-derived hypernymic knowledge.

    Aspects of Coherence for Entity Analysis

    Natural language understanding is an important topic in natural language processing. Given a text, a computer program should, at the very least, be able to understand what the text is about, and ideally also situate it in its extra-textual context and understand what purpose it serves. What exactly it means to understand what a text is about is an open question, but it is generally accepted that, at a minimum, understanding involves being able to answer questions like “Who did what to whom? Where? When? How? And Why?”. Entity analysis, the computational analysis of entities mentioned in a text, aims to support answering the questions “Who?” and “Whom?” by identifying entities mentioned in a text. If the answers to “Where?” and “When?” are specific, named locations and events, entity analysis can also provide these answers. Entity analysis aims to answer these questions by performing entity linking, that is, linking mentions of entities to their corresponding entry in a knowledge base, coreference resolution, that is, identifying all mentions in a text that refer to the same entity, and entity typing, that is, assigning a label such as Person to mentions of entities. In this thesis, we study how different aspects of coherence can be exploited to improve entity analysis. Our main contribution is a method that allows exploiting knowledge-rich, specific aspects of coherence, namely geographic, temporal, and entity type coherence. Geographic coherence expresses the intuition that entities mentioned in a text tend to be geographically close. Similarly, temporal coherence captures the intuition that entities mentioned in a text tend to be close in the temporal dimension. Entity type coherence is based on the observation that in a text about a certain topic, such as sports, the entities mentioned in it tend to have the same or related entity types, such as sports team or athlete. 
We show how to integrate features modeling these aspects of coherence into entity linking systems and establish their utility in extensive experiments covering different datasets and systems. Since entity linking often requires computationally expensive joint, global optimization, we propose a simple but effective rule-based approach that enjoys some of the benefits of joint, global approaches, while avoiding some of their drawbacks. To enable convenient error analysis for system developers, we introduce a tool for visual analysis of entity linking system output. Investigating another aspect of coherence, namely the coherence between a predicate and its arguments, we devise a distributed model of selectional preferences and assess its impact on a neural coreference resolution system. Our final contribution examines how multilingual entity typing can be improved by incorporating subword information. We train and make publicly available subword embeddings in 275 languages and show their utility in a multilingual entity typing task.
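
    The geographic coherence intuition in the abstract above (entities mentioned in one text tend to be geographically close) can be made concrete with a small example. The sketch below picks, among candidate entries for an ambiguous mention, the one with the smallest mean great-circle distance to entities already linked in the same text. This is only an illustrative heuristic, not the thesis's feature design: the function names, the candidate labels, and the approximate coordinates are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def pick_geographically_coherent(candidates, context_coords):
    """Return the candidate name closest, on average, to the context entities.

    candidates: dict mapping a candidate label to its (lat, lon).
    context_coords: non-empty list of (lat, lon) for entities already linked.
    """
    def mean_distance(coord):
        return sum(haversine_km(coord, c) for c in context_coords) / len(context_coords)

    return min(candidates, key=lambda name: mean_distance(candidates[name]))


# Hypothetical example: the mention "Paris" in a text that also mentions
# Lyon and Marseille. Coordinates are approximate and for illustration only.
paris_candidates = {
    "Paris_France": (48.86, 2.35),
    "Paris_Texas": (33.66, -95.56),
}
context = [(45.76, 4.84), (43.30, 5.37)]  # Lyon, Marseille
choice = pick_geographically_coherent(paris_candidates, context)
print(choice)
```

    In a full entity linking system such a distance would more plausibly be one feature among many rather than a hard decision rule, since texts can legitimately mention distant places together.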

    A computational approach to Latin verbs: new resources and methods

    This thesis presents the application of computational methods to the study of Latin verbs. In particular, we show the creation of a subcategorization lexicon automatically extracted from annotated corpora; we also present a probabilistic model for the acquisition of selectional preferences from annotated corpora and an ontology (Latin WordNet). Finally, we describe the results of a diachronic and quantitative study of Latin spatial preverbs.