4 research outputs found

    Experiments on the difference between semantic similarity and relatedness

    Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 81-88. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

    Mannen är faderns mormor: Svenskt associationslexikon reinkarnerat (The man is the father's maternal grandmother: Svenskt associationslexikon reincarnated)

    Svenskt associationslexikon (SAL; Lönngren 1992) is a relatively new and relatively little-known Swedish thesaurus, compiled on the basis of corpora and some existing Swedish monolingual lexical resources. SAL is organized as a strict lexical-semantic hierarchy originating in an artificial top lexeme, where (primary) vertical ’parent’–’child’ relations point to less central, but semantically closely related lexemes, and (secondary) horizontal ’sibling’ relations form thesaurus-like word families. This paper describes my work, in collaboration with the author of SAL, Lennart Lönngren, on making an electronic version of the full SAL (comprising 71752 entries) publicly available through Språkbanken at Göteborg University in a modern, standardized format, which has been thoroughly checked for formal errors, enhanced with additional information about lemmas and inflectional paradigms of entries, and made browsable with a web-based graphical interface capable of displaying and navigating the network structure of SAL.

    SUBJECTIVITY WORD SENSE DISAMBIGUATION: A METHOD FOR SENSE-AWARE SUBJECTIVITY ANALYSIS

    Subjectivity lexicons have been invaluable resources in subjectivity analysis, and their creation has been an important topic. Many systems rely on these lexicons. For any subjectivity analysis system that relies on a subjectivity lexicon, subjectivity sense ambiguity is a serious problem. Such systems will be misled by subjectivity clues used with objective senses, called false hits. We believe that any type of subjectivity analysis system relying on lexicons will benefit from a sense-aware approach. We think sense-aware subjectivity analysis has been neglected mostly because of concerns related to word sense disambiguation (WSD), the problem of automatically determining which sense of a word is activated by its use in a particular context, according to a sense inventory. Although WSD is the perfect tool for sense-aware classification, trust in traditional fine-grained WSD as an enabling technology is not high because previous results have been mostly unsuccessful. In this thesis, we investigate feasible and practical methods to avoid these false hits via sense-aware analysis. We define a new coarse-grained WSD task capturing the right semantic granularity specific to subjectivity analysis.

    Computational models for semantic textual similarity

    164 p. The overarching goal of this thesis is to advance computational models of meaning and their evaluation. To achieve this goal, we define two tasks and develop state-of-the-art systems that tackle both: Semantic Textual Similarity (STS) and Typed Similarity. STS aims to measure the degree of semantic equivalence between two sentences by assigning graded similarity values that capture the intermediate shades of similarity. We have collected pairs of sentences to construct datasets for STS, a total of 15,436 pairs, by far the largest collection of data for STS. We have designed, constructed and evaluated a new approach that combines knowledge-based and corpus-based methods using a cube. This new system for STS is on par with state-of-the-art approaches that rely on Machine Learning (ML) without using any ML itself, and applying ML on top of the system improves the results. Typed Similarity tries to identify the type of relation that holds between a pair of similar items in a digital library. Providing a reason why items are similar has applications in recommendation, personalization, and search. A range of types of similarity in this collection were identified, and a set of 1,500 pairs of items from the collection was annotated using crowdsourcing. Finally, we present systems capable of resolving the Typed Similarity task. The best system resulted in a real-world application that recommends similar items to users in an online digital library.