    Resolving Regular Polysemy in Named Entities

    Word sense disambiguation primarily addresses the lexical ambiguity of common words based on a predefined sense inventory. Conversely, proper names are usually considered to denote an ad-hoc real-world referent: once the reference is decided, the ambiguity is purportedly resolved. However, proper names also exhibit ambiguities through appellativization, i.e., they act like common words and may denote different aspects of their referents. We propose to address the ambiguities of proper names through the lens of regular polysemy, which we formalize as dot objects. This paper introduces a combined word sense disambiguation (WSD) model for disambiguating common words against Chinese Wordnet (CWN) and proper names as dot objects. The model leverages the flexibility of a gloss-based architecture, which takes advantage of the glosses and example sentences of CWN. We show that the model achieves competitive results on both common and proper nouns, even on a relatively sparse sense dataset. Aside from being a performant WSD tool, the model further facilitates the future development of the lexical resource.

    Resolving XML Semantic Ambiguity

    XML semantic-aware processing has become a motivating and important challenge in Web data management, data processing, and information retrieval. Although XML data is semi-structured, it remains prone to lexical ambiguity, and thus requires dedicated semantic analysis and sense disambiguation processes to assign well-defined meaning to XML elements and attributes. This becomes crucial in an array of applications, including semantic-aware query rewriting, semantic document clustering and classification, schema matching, and blog analysis and event detection in social networks and tweets. Most existing approaches in this context: i) ignore the problem of identifying ambiguous XML nodes, ii) only partially consider their structural relations/context, iii) use syntactic information in processing XML data regardless of the semantics involved, and iv) are static in adopting fixed disambiguation constraints, thus limiting user involvement. In this paper, we provide a new XML Semantic Disambiguation Framework, titled XSDF, designed to address each of the above limitations. It takes as input an XML document and a general-purpose semantic network, and produces as output a semantically augmented XML tree made of unambiguous semantic concepts. Experiments demonstrate the effectiveness of our approach in comparison with alternative methods. General Terms: Algorithms, Measurement, Performance, Design, Experimentation. Keywords: XML semantic-aware processing, ambiguity degree, sphere neighborhood, XML context vector, semantic network, semantic disambiguation.

    One emoji, many meanings: A corpus for the prediction and disambiguation of emoji sense

    In this work, we uncover a hidden linguistic property of emoji, namely that they are polysemous and can be used to form a semantic network of emoji meanings. Our key contributions to this direction of study are as follows: (1) We have developed a new corpus to help in the task of emoji sense prediction. This corpus contains tweets with single emojis, where each emoji has been labelled with an appropriate sense identifier from WordNet. (2) We present experiments demonstrating that it is possible to predict the sense of an emoji using our corpus to a reasonable level of accuracy. We report an average path-similarity score of 0.4146 for our best emoji sense prediction algorithm. (3) We further show that emoji sense is a useful feature in the emoji prediction task, where we report an accuracy of 58.8816 and a macro-F1 score of 46.6640, beating reasonable baselines in this task. Our work demonstrates the importance of considering the meaning behind emoji, rather than ignoring them or simply treating them as extra wordforms.