Representing lexical ambiguity in prototype models of lexical semantics
We show, contrary to some recent claims in the literature, that prototype distributional semantic models (DSMs) are capable of representing multiple senses of ambiguous words, including infrequent meanings. We propose that word2vec contains a natural, model-internal way of operationalizing the disambiguation process by leveraging the two sets of representations word2vec learns, instead of just one as most work on this model does. We evaluate our approach on artificial language simulations where other prototype DSMs have been shown to fail. We furthermore assess whether these results scale to the disambiguation of naturalistic corpus examples. We do so by replacing all instances of sampled pairs of words in a corpus with pseudo-homonym tokens, and testing whether models, after being trained on one half of the corpus, were able to disambiguate pseudo-homonyms on the basis of their linguistic contexts in the second half of the corpus. We observe that word2vec well surpasses the baseline of always guessing the most frequent meaning to be the right one. Moreover, it degrades gracefully: as words are more unbalanced, the baseline is higher and harder to surpass; nonetheless, word2vec succeeds at surpassing the baseline, even for pseudo-homonyms whose most frequent meaning is much more frequent than the other.
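The abstract's key idea, using both of word2vec's learned embedding matrices (input/target and output/context vectors) rather than only the input vectors, can be illustrated with a minimal sketch. This is not the authors' implementation; the toy vocabulary, the pseudo-homonym labels `bank_1`/`bank_2`, and the random vectors are all illustrative assumptions. The scoring mirrors word2vec's training objective, which dots a context's input vectors against a candidate target's output vector:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50
# Toy vocabulary: two pseudo-homonym senses plus some context words.
vocab = ["bank_1", "bank_2", "river", "water", "money", "loan"]
# word2vec learns TWO matrices: input ("target") and output ("context")
# embeddings. Here both are random stand-ins for trained vectors.
W_in = {w: rng.normal(size=d) for w in vocab}
W_out = {w: rng.normal(size=d) for w in vocab}

def sense_scores(context, candidate_senses):
    """Score each candidate sense by the dot product between the summed
    input vectors of the context words and the sense's output vector,
    mirroring the inner product used in word2vec's training objective."""
    ctx = np.sum([W_in[w] for w in context], axis=0)
    return {s: float(ctx @ W_out[s]) for s in candidate_senses}

# Disambiguate: which sense best predicts this linguistic context?
scores = sense_scores(["river", "water"], ["bank_1", "bank_2"])
best = max(scores, key=scores.get)
```

With trained (rather than random) vectors, the sense whose output embedding best predicts the observed context words would win, which is the model-internal disambiguation mechanism the abstract describes.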
Having Your Cake and Eating It Too: Autonomy and Interaction in a Model of Sentence Processing
Is the human language understander a collection of modular processes
operating with relative autonomy, or is it a single integrated process? This
ongoing debate has polarized the language processing community, with two
fundamentally different types of model posited, and with each camp concluding
that the other is wrong. One camp puts forth a model with separate processors
and distinct knowledge sources to explain one body of data, and the other
proposes a model with a single processor and a homogeneous, monolithic
knowledge source to explain the other body of data. In this paper we argue that
a hybrid approach which combines a unified processor with separate knowledge
sources provides an explanation of both bodies of data, and we demonstrate the
feasibility of this approach with the computational model called COMPERE. We
believe that this approach brings the language processing community
significantly closer to offering human-like language processing systems.
Comment: 7 pages, uses aaai.sty macros
From Word to Sense Embeddings: A Survey on Vector Representations of Meaning
Over the past years, distributed semantic representations have proved to be
effective and flexible keepers of prior knowledge to be integrated into
downstream applications. This survey focuses on the representation of meaning.
We start from the theoretical background behind word vector space models and
highlight one of their major limitations: the meaning conflation deficiency,
which arises from representing a word with all its possible meanings as a
single vector. Then, we explain how this deficiency can be addressed through a
transition from the word level to the more fine-grained level of word senses
(in its broader acceptation) as a method for modelling unambiguous lexical
meaning. We present a comprehensive overview of the wide range of techniques in
the two main branches of sense representation, i.e., unsupervised and
knowledge-based. Finally, this survey covers the main evaluation procedures and
applications for this type of representation, and provides an analysis of four
of its important aspects: interpretability, sense granularity, adaptability to
different domains and compositionality.
Comment: 46 pages, 8 figures. Published in Journal of Artificial Intelligence Research
Robust Grammatical Analysis for Spoken Dialogue Systems
We argue that grammatical analysis is a viable alternative to concept
spotting for processing spoken input in a practical spoken dialogue system. We
discuss the structure of the grammar, and a model for robust parsing which
combines linguistic and statistical sources of information. We present test
results suggesting that grammatical processing allows fast and accurate
processing of spoken input.
Comment: Accepted for JNL
Syntax-Aware Multi-Sense Word Embeddings for Deep Compositional Models of Meaning
Deep compositional models of meaning, which act on distributional
representations of words to produce vectors for larger text constituents, are
evolving into a popular area of NLP research. We detail a compositional distributional
framework based on a rich form of word embeddings that aims at facilitating the
interactions between words in the context of a sentence. Embeddings and
composition layers are jointly learned against a generic objective that
enhances the vectors with syntactic information from the surrounding context.
Furthermore, each word is associated with a number of senses, the most
plausible of which is selected dynamically during the composition process. We
evaluate the produced vectors qualitatively and quantitatively with positive
results. At the sentence level, the effectiveness of the framework is
demonstrated on the MSRPar task, for which we report results within the
state-of-the-art range.
Comment: Accepted for presentation at EMNLP 201
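The dynamic sense selection this abstract describes can be sketched in a few lines. This is a simplified stand-in, not the paper's jointly trained model: the multi-sense lexicon, the random vectors, and the greedy first-sense context approximation are all assumptions, and composition here is plain averaging rather than a learned layer.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 50
# Hypothetical multi-sense lexicon: each word maps to one or more sense vectors.
senses = {
    "bank":  [rng.normal(size=d) for _ in range(2)],  # ambiguous word
    "river": [rng.normal(size=d)],
    "flow":  [rng.normal(size=d)],
}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def compose(sentence):
    """For each word, select the sense vector most similar to the sum of
    the other words' (first-sense) vectors, then compose the sentence by
    averaging the selected sense vectors."""
    chosen = []
    for i, w in enumerate(sentence):
        ctx = np.sum([senses[u][0] for j, u in enumerate(sentence) if j != i],
                     axis=0)
        chosen.append(max(senses[w], key=lambda v: cos(v, ctx)))
    return np.mean(chosen, axis=0)

vec = compose(["river", "bank", "flow"])
```

In the paper's setting the sense vectors and the composition layer are trained jointly against a syntax-aware objective; this sketch only shows the select-then-compose control flow.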