Information extraction of +/-effect events to support opinion inference
Recently, work in NLP was initiated on a type of opinion inference that arises when opinions are expressed toward events that have positive or negative effects on entities, called +/-effect events. The ultimate goal is to develop a fully automatic system capable of recognizing inferred attitudes. To achieve this, the inference system requires all instances of +/-effect events, so this dissertation focuses on +/-effect events to support opinion inference. To extract +/-effect events, we first need a list of them. Because of significant sense ambiguity, our goal is to develop a sense-level rather than word-level lexicon; to handle sense-level information, we adopt WordNet. We use a graph-based method that is seeded by entries culled from FrameNet and then expanded by exploiting semantic relations in WordNet, and we show that WordNet relations are useful for polarity propagation in the graph model. In addition, to maximize the effectiveness of different types of information, we combine the graph-based method using WordNet relations with a standard classifier using gloss information. Further, we provide evidence that the model is an effective way to guide manual annotation toward +/-effect senses that are not in the seed set. To exploit the sense-level lexicons, we have to carry out word sense disambiguation. We present a knowledge-based, coarse-grained word sense disambiguation method for +/-effect senses based on selectional preferences modeled via topic models: we first group senses, and then utilize topic models to model selectional preferences. Our experiments show that selectional preferences are helpful in our work. To support opinion inference, we need to identify not only +/-effect events but also their affected entities automatically. Thus, we address both +/-effect event detection and affected entity identification.
Since +/-effect events and their affected entities are closely related, instead of a pipeline system we present a joint model that extracts +/-effect events and their affected entities simultaneously, and we demonstrate that this joint model is a promising approach to the task.
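The graph-based polarity propagation described above can be illustrated with a minimal sketch. This is not the dissertation's actual model: the sense names, graph, and clamped-seed averaging rule below are illustrative assumptions, showing only the general idea of spreading +/-effect labels from seed senses over WordNet-style relations.

```python
# Hypothetical sketch: propagate +/-effect polarity over a sense graph.
# Seed senses carry known labels; unlabeled senses repeatedly take the
# average polarity of their neighbors until the scores stabilize.

def propagate_polarity(edges, seeds, iterations=50):
    """edges: list of (sense_a, sense_b) relation links (undirected).
    seeds: dict sense -> +1.0 (+effect) or -1.0 (-effect).
    Returns a dict sense -> score in [-1, 1]."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    scores = {n: seeds.get(n, 0.0) for n in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:                    # seeds stay clamped
                updated[node] = seeds[node]
            else:                                # others average their neighbors
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores

# Toy example with WordNet-style sense names (illustrative only).
edges = [("help.v.01", "assist.v.01"), ("harm.v.01", "injure.v.01")]
scores = propagate_polarity(edges, {"help.v.01": 1.0, "harm.v.01": -1.0})
# Senses related to a +effect seed drift positive; those near a -effect seed drift negative.
```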
Commonsense knowledge acquisition and applications
Computers are increasingly expected to make smart decisions based on what humans consider commonsense. This would require computers to understand their environment, including properties of objects in the environment (e.g., a wheel is round), relations between objects (e.g., two wheels are part of a bike, or a bike is slower than a car) and interactions of objects (e.g., a driver drives a car on the road).
The goal of this dissertation is to investigate automated methods for acquisition of large-scale, semantically organized commonsense knowledge. Prior state-of-the-art methods to acquire commonsense are either not automated or based on shallow representations. Thus, they cannot produce large-scale, semantically organized commonsense knowledge.
To achieve the goal, we divide the problem space into three research directions, constituting our core contributions:
1. Properties of objects: acquisition of properties like hasSize, hasShape, etc. We develop WebChild, a semi-supervised method to compile semantically organized properties.
2. Relationships between objects: acquisition of relations like largerThan, partOf, memberOf, etc. We develop CMPKB, a linear-programming based method to compile comparative relations, and, we develop PWKB, a method based on statistical and logical inference to compile part-whole relations.
3. Interactions between objects: acquisition of activities like drive a car, park a car, etc., with attributes such as temporal or spatial attributes. We develop Knowlywood, a method based on semantic parsing and probabilistic graphical models to compile activity knowledge.
Together, these methods result in the construction of a large, clean and semantically organized Commonsense Knowledge Base that we call WebChild KB. Acquiring such knowledge is difficult because commonsense knowledge is: (i) implicit and sparse, since people do not explicitly state the obvious; (ii) multimodal, since it is spread over textual and visual content; (iii) affected by reporter bias, since unusual facts are reported disproportionately often; and (iv) context-dependent, and therefore of limited statistical confidence.
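The logical-inference component mentioned for part-whole knowledge (PWKB) can be sketched with one of its simplest rules: partOf is transitive, so known pairs can be composed into new ones. The pairs and helper below are illustrative assumptions, not the actual PWKB pipeline, which also combines statistical evidence.

```python
# Hypothetical sketch of one logical-inference step for part-whole knowledge:
# close a set of partOf pairs under transitivity
# (wheel partOf bike, bike partOf fleet  =>  wheel partOf fleet).

def transitive_closure(part_of):
    """part_of: set of (part, whole) pairs. Returns the transitive closure."""
    closure = set(part_of)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # compose (a partOf b) with (b partOf d)
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

pairs = {("wheel", "bike"), ("bike", "fleet")}
closed = transitive_closure(pairs)
# ("wheel", "fleet") is now inferred alongside the two input pairs.
```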
Predicate Matrix: an interoperable lexical knowledge base for predicates
The Predicate Matrix is a new lexical-semantic resource resulting from the integration of multiple knowledge sources, among them FrameNet, VerbNet, PropBank and WordNet. The Predicate Matrix provides an extensive and robust lexicon that improves interoperability among the aforementioned semantic resources. Its construction is based on the integration of Semlink and on new mappings obtained with automatic methods that link semantic knowledge at the lexical and role levels. We have also extended the Predicate Matrix to cover nominal predicates (English, Spanish) and predicates in other languages (Spanish, Catalan and Basque). As a result, the Predicate Matrix provides a multilingual lexicon that enables interoperable semantic analysis in multiple languages.
Finding Meaning in Context Using Graph Algorithms in Mono- and Cross-lingual Settings
Making computers automatically find the appropriate meaning of words in context is an interesting problem that has proven to be one of the most challenging tasks in natural language processing (NLP). Widespread potential applications of a possible solution to the problem could be envisaged in several NLP tasks such as text simplification, language learning, machine translation, query expansion, information retrieval and text summarization. Ambiguity of words has always been a challenge in these applications, and the traditional endeavor to solve the problem of this ambiguity, namely doing word sense disambiguation using resources like WordNet, has been fraught with debate about the feasibility of the granularity that exists in WordNet senses. The recent trend has therefore been to move away from enforcing any given lexical resource upon automated systems from which to pick potential candidate senses, and to instead encourage them to pick and choose their own resources. Given a sentence with a target ambiguous word, an alternative solution consists of picking potential candidate substitutes for the target, filtering the list of the candidates to a much shorter list using various heuristics, and trying to match these system predictions against a human-generated gold standard, with a view to ensuring that the meaning of the sentence does not change after the substitutions. This solution has manifested itself in the SemEval 2007 task of lexical substitution and the more recent SemEval 2010 task of cross-lingual lexical substitution (which I helped organize), where given an English context and a target word within that context, the systems are required to provide between one and ten appropriate substitutes (in English) or translations (in Spanish) for the target word. In this dissertation, I present a comprehensive overview of state-of-the-art research and describe new experiments to tackle the tasks of lexical substitution and cross-lingual lexical substitution.
In particular I attempt to answer some research questions pertinent to the tasks, mostly focusing on completely unsupervised approaches. I present a new framework for unsupervised lexical substitution using graphs and centrality algorithms. An additional novelty in this approach is the use of directional similarity rather than the traditional, symmetric word similarity. Additionally, the thesis also explores the extension of the monolingual framework into a cross-lingual one, and examines how well this cross-lingual framework can work for the monolingual lexical substitution and cross-lingual lexical substitution tasks. A comprehensive set of comparative investigations is presented amongst supervised and unsupervised methods, several graph-based methods, and the use of monolingual and multilingual information.
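A graph-and-centrality approach of the kind described above can be sketched as follows. The graph, candidate words, and PageRank-style scoring are illustrative assumptions, not the dissertation's actual system; the directed edges stand in for the asymmetric ("directional") similarity it mentions.

```python
# Minimal sketch: rank candidate substitutes for a target word by running a
# PageRank-style centrality over a directed similarity graph. Edge direction
# encodes asymmetric similarity between candidates (hypothetical data).

def pagerank(graph, damping=0.85, iterations=50):
    """graph: dict node -> list of out-neighbors. Returns node -> score."""
    nodes = set(graph) | {n for out in graph.values() for n in out}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for node, out in graph.items():
            if not out:
                continue
            share = damping * rank[node] / len(out)   # split mass over out-edges
            for n in out:
                new[n] += share
        # mass from dangling nodes (no out-edges) is spread uniformly
        dangling = damping * sum(rank[n] for n in nodes if not graph.get(n))
        for n in nodes:
            new[n] += dangling / len(nodes)
        rank = new
    return rank

# Candidates for a target like "bright" (illustrative edges only): nodes that
# receive more similarity mass from the others rank higher as substitutes.
graph = {"bright": ["smart", "clever"], "smart": ["clever"], "clever": ["smart"]}
scores = pagerank(graph)
ranking = sorted(scores, key=scores.get, reverse=True)
```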
A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics, and the research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions.
Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal-contribution mark is missing in the published version due to the publication policies. Please contact Prof. Erik Cambria for details.
Harnessing sense-level information for semantically augmented knowledge extraction
Nowadays, building accurate computational models for the semantics of language lies at the very core of Natural Language Processing and Artificial Intelligence. A first and foremost step in this respect consists in moving from word-based to sense-based approaches, in which operating explicitly at the level of word senses enables a model to produce more accurate and unambiguous results. At the same time, word senses create a bridge towards structured lexico-semantic resources, where the vast amount of available machine-readable information can help overcome the shortage of annotated data in many languages and domains of knowledge.
This latter phenomenon, known as the knowledge acquisition bottleneck, is a crucial problem that hampers the development of large-scale, data-driven approaches for many Natural Language Processing tasks, especially when lexical semantics is directly involved. One of these tasks is Information Extraction, where an effective model has to cope with data sparsity, as well as with lexical ambiguity that can arise at the level of both arguments and relational phrases. Even in more recent Information Extraction approaches where semantics is implicitly modeled, these issues have not yet been addressed in their entirety. On the other hand, however, having access to explicit sense-level information is a very demanding task on its own, which can rarely be performed with high accuracy on a large scale. With this in mind, in this thesis we will tackle a two-fold objective: our first focus will be on studying fully automatic approaches to obtain high-quality sense-level information from textual corpora; then, we will investigate in depth where and how such sense-level information has the potential to enhance the extraction of knowledge from open text.
In the first part of this work, we will explore three different disambiguation scenarios (semi-structured text, parallel text, and definitional text) and devise automatic disambiguation strategies that are not only capable of scaling to different corpus sizes and different languages, but that actually take advantage of a multilingual and/or heterogeneous setting to improve and refine their performance. As a result, we will obtain three sense-annotated resources that, when tested experimentally with a baseline system in a series of downstream semantic tasks (i.e. Word Sense Disambiguation, Entity Linking, Semantic Similarity), show very competitive performances on standard benchmarks against both manual and semi-automatic competitors.
In the second part we will instead focus on Information Extraction, with an emphasis on Open Information Extraction (OIE), where issues like sparsity and lexical ambiguity are especially critical, and study how best to exploit sense-level information within the extraction process. We will start by showing that enforcing a deeper semantic analysis in a definitional setting enables a full-fledged extraction pipeline to compete with state-of-the-art approaches based on much larger (but noisier) data. We will then demonstrate how working at the sense level at the end of an extraction pipeline is also beneficial: indeed, by leveraging sense-based techniques, very heterogeneous OIE-derived data can be aligned semantically, and unified with respect to a common sense inventory. Finally, we will briefly shift the focus to the more constrained setting of hypernym discovery, and study a sense-aware supervised framework for the task that is robust and effective, even when trained on heterogeneous OIE-derived hypernymic knowledge.
Aspects of Coherence for Entity Analysis
Natural language understanding is an important topic in natural language processing. Given a text, a computer program should, at the very least, be able to understand what the text is about, and ideally also situate it in its extra-textual context and understand what purpose it serves. What exactly it means to understand what a text is about is an open question, but it is generally accepted that, at a minimum, understanding involves being able to answer questions like "Who did what to whom? Where? When? How? And why?". Entity analysis, the computational analysis of entities mentioned in a text, aims to support answering the questions "Who?" and "Whom?" by identifying entities mentioned in a text. If the answers to "Where?" and "When?" are specific, named locations and events, entity analysis can also provide these answers. Entity analysis aims to answer these questions by performing entity linking, that is, linking mentions of entities to their corresponding entry in a knowledge base; coreference resolution, that is, identifying all mentions in a text that refer to the same entity; and entity typing, that is, assigning a label such as Person to mentions of entities.
In this thesis, we study how different aspects of coherence can be exploited to improve entity analysis. Our main contribution is a method that allows exploiting knowledge-rich, specific aspects of coherence, namely geographic, temporal, and entity type coherence. Geographic coherence expresses the intuition that entities mentioned in a text tend to be geographically close. Similarly, temporal coherence captures the intuition that entities mentioned in a text tend to be close in the temporal dimension. Entity type coherence is based on the observation that in a text about a certain topic, such as sports, the entities mentioned in it tend to have the same or related entity types, such as sports team or athlete. We show how to integrate features modeling these aspects of coherence into entity linking systems and establish their utility in extensive experiments covering different datasets and systems. Since entity linking often requires computationally expensive joint, global optimization, we propose a simple but effective rule-based approach that enjoys some of the benefits of joint, global approaches, while avoiding some of their drawbacks. To enable convenient error analysis for system developers, we introduce a tool for visual analysis of entity linking system output. Investigating another aspect of coherence, namely the coherence between a predicate and its arguments, we devise a distributed model of selectional preferences and assess its impact on a neural coreference resolution system. Our final contribution examines how multilingual entity typing can be improved by incorporating subword information. We train and make publicly available subword embeddings in 275 languages and show their utility in a multilingual entity typing task.
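A geographic-coherence feature of the kind described above can be sketched as follows. The scoring function and the coordinates are illustrative assumptions, not the thesis's actual feature set: the idea is only that a candidate entity whose location is close to the locations of the other entities in the document should receive a higher (less negative) score.

```python
import math

# Hypothetical sketch of a geographic-coherence feature for entity linking.

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(a))

def geo_coherence(candidate, context_coords):
    """Negative mean distance from a candidate entity's location to the
    locations of the other entities mentioned in the document."""
    if not context_coords:
        return 0.0
    return -sum(haversine_km(candidate, c) for c in context_coords) / len(context_coords)

# Disambiguating "Paris" in a document that also mentions Lyon and Marseille:
# the French capital is far closer to the context than Paris, Texas.
context = [(45.76, 4.84), (43.30, 5.37)]            # Lyon, Marseille
paris_fr, paris_tx = (48.86, 2.35), (33.66, -95.56)
```

In a full system this score would be one feature among others (temporal and entity-type coherence, mention-entity compatibility) combined by the linking model.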
A computational approach to Latin verbs: new resources and methods
This thesis presents the application of computational methods to the study of Latin verbs. In particular, we describe the creation of a subcategorization lexicon automatically extracted from annotated corpora; we also present a probabilistic model for the acquisition of selectional preferences from annotated corpora and an ontology (Latin WordNet). Finally, we describe the results of a diachronic, quantitative study of Latin spatial preverbs.
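The corpus-based acquisition of selectional preferences can be illustrated with a deliberately simple count-based estimate. This is not the thesis's probabilistic model (which also draws on the Latin WordNet ontology); the verb slot tuples below are hypothetical, using the Latin verb "edo" ("to eat") as an example.

```python
from collections import Counter

# Hypothetical sketch: estimate a selectional preference P(arg_class | verb, slot)
# as a relative frequency over (verb, slot, arg_class) tuples from an
# annotated corpus.

def preference_model(observations):
    """observations: iterable of (verb, slot, arg_class) tuples.
    Returns a function p(verb, slot, arg_class) -> relative frequency."""
    observations = list(observations)
    joint = Counter(observations)
    marginal = Counter((v, s) for v, s, _ in observations)
    def p(verb, slot, arg_class):
        total = marginal[(verb, slot)]
        return joint[(verb, slot, arg_class)] / total if total else 0.0
    return p

# Toy corpus: objects of "edo" are mostly of class "food".
obs = [("edo", "obj", "food"), ("edo", "obj", "food"), ("edo", "obj", "animal")]
p = preference_model(obs)
# p("edo", "obj", "food") -> 2/3; p("edo", "obj", "animal") -> 1/3
```

A real model would smooth these counts and generalize argument heads to ontology classes rather than reading classes off the corpus directly.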