
    A Game-Theoretic Approach to Word Sense Disambiguation

    This article presents a new model for word sense disambiguation formulated in terms of evolutionary game theory, where each word to be disambiguated is represented as a node on a graph whose edges represent word relations, and senses are represented as classes. The words simultaneously update their class-membership preferences according to the senses that neighboring words are likely to choose. We use distributional information to weigh the influence that each word has on the decisions of the others, and semantic similarity information to measure the strength of compatibility among the choices. With this information we can formulate word sense disambiguation as a constraint satisfaction problem and solve it using tools derived from game theory, while maintaining textual coherence. The model rests on two ideas: similar words should be assigned to similar classes, and the meaning of a word depends not on all the words in a text but only on some of them. The article provides an in-depth motivation for modeling the word sense disambiguation problem in game-theoretic terms, illustrated by an example. The conclusion presents an extensive analysis of the combinations of similarity measures to use in the framework and a comparison with state-of-the-art systems. The results show that our model outperforms state-of-the-art algorithms and can be applied to different tasks and in different scenarios.
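    To make the dynamics concrete, the following is a minimal sketch (not the authors' implementation) of a replicator-dynamics update of the kind the abstract describes: each word keeps a probability distribution over its candidate senses, a word-similarity matrix weighs neighbor influence, and pairwise sense-compatibility matrices supply the payoffs. The names `W`, `sense_sim`, and `replicator_wsd` are illustrative assumptions.

```python
import numpy as np

def replicator_wsd(W, sense_sim, n_senses, iters=50):
    """Sketch of replicator dynamics for graph-based WSD.

    W          : (n_words, n_words) word-similarity weights (zero diagonal).
    sense_sim  : dict mapping (i, j) -> payoff matrix of shape
                 (n_senses[i], n_senses[j]), e.g. semantic similarity
                 between the candidate senses of words i and j.
    n_senses   : list with the number of candidate senses per word.
    Returns one probability vector over senses per word.
    """
    n = len(n_senses)
    x = [np.full(k, 1.0 / k) for k in n_senses]  # uniform priors
    for _ in range(iters):
        new_x = []
        for i in range(n):
            # expected payoff of each sense of word i against its neighbours
            payoff = np.zeros(n_senses[i])
            for j in range(n):
                if i == j or W[i, j] == 0:
                    continue
                payoff += W[i, j] * sense_sim[(i, j)].dot(x[j])
            # replicator update: senses with above-average payoff gain mass
            num = x[i] * payoff
            new_x.append(num / num.sum() if num.sum() > 0 else x[i])
        x = new_x
    return x
```

    After convergence, each word is assigned the sense with the highest probability, which is one simple way to read off a disambiguation that is consistent with its neighbors.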

    Word Sense Disambiguation with LSTM: Do We Really Need 100 Billion Words?

    Recently, Yuan et al. (2016) have shown the effectiveness of Long Short-Term Memory (LSTM) networks for Word Sense Disambiguation (WSD). Their proposed technique outperformed the previous state of the art on several benchmarks, but neither the training data nor the source code was released. This paper presents a reproduction study of this technique using only openly available datasets (GigaWord, SemCor, OMSTI) and software (TensorFlow). The reproduction shows that state-of-the-art results can be obtained with much less data than suggested by Yuan et al. All code and trained models are made freely available.
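    The general recipe behind this line of work is to use a trained LSTM language model to embed the context of a target word and then label it by nearest-neighbor comparison against sense centroids computed from sense-annotated contexts (e.g. SemCor). The sketch below shows only that nearest-neighbor step; `encode_context` is a hypothetical stand-in for the trained LSTM encoder, not the released code.

```python
import numpy as np

def build_sense_centroids(labelled_contexts, encode_context):
    """labelled_contexts: iterable of (tokens, target_index, sense_id).
    Averages the LSTM context vectors of all training occurrences of each sense."""
    sums, counts = {}, {}
    for tokens, idx, sense in labelled_contexts:
        v = encode_context(tokens, idx)
        sums[sense] = sums.get(sense, 0.0) + v
        counts[sense] = counts.get(sense, 0) + 1
    return {s: sums[s] / counts[s] for s in sums}

def predict_sense(tokens, target_index, centroids, encode_context):
    """Pick the sense whose centroid is closest (cosine) to the context vector."""
    v = encode_context(tokens, target_index)
    v = v / (np.linalg.norm(v) + 1e-12)
    best, best_score = None, -np.inf
    for sense, c in centroids.items():
        score = float(np.dot(v, c / (np.linalg.norm(c) + 1e-12)))
        if score > best_score:
            best, best_score = sense, score
    return best
```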

    CLiC-it 2017: A Retrospective

    The Fourth Italian Conference on Computational Linguistics (CLiC-it 2017) took place in Rome in December 2017. As in previous editions, it served as the prime forum in Italy for discussing research in computational linguistics and Natural Language Processing. As General Chairs, we offer a retrospective on the conference, highlighting its international flavour and its attention to students and young researchers, with a particular focus on the innovations introduced in the 2017 edition.

    From Word to Sense Embeddings: A Survey on Vector Representations of Meaning

    Over the past years, distributed semantic representations have proved to be effective and flexible keepers of prior knowledge to be integrated into downstream applications. This survey focuses on the representation of meaning. We start from the theoretical background behind word vector space models and highlight one of their major limitations: the meaning conflation deficiency, which arises from representing a word with all its possible meanings as a single vector. We then explain how this deficiency can be addressed through a transition from the word level to the more fine-grained level of word senses (in its broader acceptation) as a method for modelling unambiguous lexical meaning. We present a comprehensive overview of the wide range of techniques in the two main branches of sense representation, i.e., unsupervised and knowledge-based. Finally, this survey covers the main evaluation procedures and applications for this type of representation, and provides an analysis of four of its important aspects: interpretability, sense granularity, adaptability to different domains, and compositionality. (46 pages, 8 figures. Published in the Journal of Artificial Intelligence Research.)
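    A toy illustration of the meaning conflation deficiency the survey discusses: a single vector for "bank" must average its financial and river senses, whereas sense-level vectors can be matched against a context separately. The embeddings below are hand-made, purely illustrative values, not output of any model described in the survey.

```python
import numpy as np

def normalize(v):
    return v / (np.linalg.norm(v) + 1e-12)

def pick_sense(context_vec, sense_vecs):
    """Return the sense whose vector is most similar (cosine) to the context."""
    scored = {s: float(np.dot(normalize(context_vec), normalize(v)))
              for s, v in sense_vecs.items()}
    return max(scored, key=scored.get)

# hypothetical 3-d sense embeddings for the ambiguous word "bank"
sense_vecs = {"bank_finance": np.array([0.9, 0.1, 0.0]),
              "bank_river":   np.array([0.0, 0.2, 0.9])}
context = np.array([0.1, 0.1, 0.8])        # e.g. "...fishing on the bank..."
print(pick_sense(context, sense_vecs))     # -> bank_river
```

    A single "bank" vector would sit somewhere between the two sense vectors and could not separate these contexts, which is precisely the deficiency that sense representations aim to remove.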

    Non-invasive study of natural dyes on historical textiles from the collection of Michelangelo Guggenheim

    A selection of historical textile fragments from the collection of the Venetian art dealer Moisè Michelangelo Guggenheim, dating from the XV to the XVIII century, was investigated with non-invasive techniques in order to identify the coloring materials. Imaging was first used to examine the structure and conservation condition of the selected fragments; Fiber Optics Reflectance Spectroscopy (FORS) then allowed the main natural dyestuffs, such as indigotin and anthraquinone-based compounds, to be recognized, with the exception of the yellow dyes, which are difficult to identify with this non-invasive technique. The collected spectra were also processed with a clustering algorithm, which grouped them by similar spectral properties and highlighted the inflection point wavelength as the most influential feature.
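    A rough sketch of the kind of analysis described above (an assumption about the workflow, not the authors' code): each FORS reflectance curve is reduced to its inflection point wavelength, taken here as the maximum of the first derivative, and the spectra are then grouped with a standard clustering algorithm on that single feature.

```python
import numpy as np
from sklearn.cluster import KMeans

def inflection_wavelength(wavelengths, reflectance):
    """Wavelength at which the reflectance rises most steeply (inflection point)."""
    derivative = np.gradient(reflectance, wavelengths)
    return wavelengths[np.argmax(derivative)]

def cluster_spectra(wavelengths, spectra, n_clusters=3):
    """spectra: (n_samples, n_wavelengths) array of reflectance curves.
    Clusters the spectra on their inflection point wavelengths."""
    features = np.array([[inflection_wavelength(wavelengths, s)] for s in spectra])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    return features.ravel(), labels
```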