Search CORE

824 research outputs found

A Unified multilingual semantic representation of concepts

Author: CAMACHO COLLADOS Jose'
Navigli Roberto
Pilehvar MOHAMMED TAHER
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2015
Field of study

Semantic representation lies at the core of several applications in Natural Language Processing. However, most existing semantic representation techniques cannot be used effectively for the representation of individual word senses. We put forward a novel multilingual concept representation, called MUFFIN , which not only enables accurate representation of word senses in different languages, but also provides multiple advantages over existing approaches. MUFFIN represents a given concept in a unified semantic space irrespective of the language of interest, enabling cross-lingual comparison of different concepts. We evaluate our approach in two different evaluation benchmarks, semantic similarity and Word Sense Disambiguation, reporting state-of-the-art performance on several standard datasets

CiteSeerX

Crossref

Online Research @ Cardiff

Archivio della ricerca- Università di Roma La Sapienza

A Simple and Effective Method of Cross-Lingual Plagiarism Detection

Author: Avetisyan Arutyun
Avetisyan Karen
Ghukasyan Tsolak
Malajyan Arthur
Publication venue
Publication date: 05/04/2023
Field of study

We present a simple cross-lingual plagiarism detection method applicable to a large number of languages. The presented approach leverages open multilingual thesauri for candidate retrieval task and pre-trained multilingual BERT-based language models for detailed analysis. The method does not rely on machine translation and word sense disambiguation when in use, and therefore is suitable for a large number of languages, including under-resourced languages. The effectiveness of the proposed approach is demonstrated for several existing and new benchmarks, achieving state-of-the-art results for French, Russian, and Armenian languages

arXiv.org e-Print Archive

Recent Trends in Word Sense Disambiguation : A Survey

Author: Bevilacqua Michele
Navigli Roberto
Pasini Tommaso
Raganato Alessandro
Publication venue: International Joint Conference on Artificial Intelligence, Inc
Publication date: 01/01/2021
Field of study

Survey TrackPeer reviewe

Helsingin yliopiston digitaalinen arkisto

Archivio della ricerca- Università di Roma La Sapienza

Learning Word Sense Representations from Neural Language Models

Author: Daniel Alexandre Bouçanova Loureiro
Publication venue
Publication date: 19/01/2023
Field of study

Repositório Aberto da Universidade do Porto

Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations

Author: Conia Simone
Navigli Roberto
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020
Field of study

To date, the most successful word, word sense, and concept modelling techniques have used large corpora and knowledge resources to produce dense vector representations that capture semantic similarities in a relatively low-dimensional space. Most current approaches, however, suffer from a monolingual bias, with their strength depending on the amount of data available across languages. In this paper we address this issue and propose Conception, a novel technique for building language-independent vector representations of concepts which places multilinguality at its core while retaining explicit relationships between concepts. Our approach results in high-coverage representations that outperform the state of the art in multilingual and cross-lingual Semantic Word Similarity and Word Sense Disambiguation, proving particularly robust on low-resource languages. Conception – its software and the complete set of representations – is available at https://github.com/SapienzaNLP/conception

Crossref

Archivio della ricerca- Università di Roma La Sapienza

NASARI: a novel approach to a Semantically-Aware Representation of items

Author: CAMACHO COLLADOS Jose'
Navigli Roberto
Pilehvar MOHAMMED TAHER
Publication venue
Publication date: 01/01/2015
Field of study

The semantic representation of individual word senses and concepts is of fundamental importance to several applications in Natural Language Processing. To date, concept modeling techniques have in the main based their representation either on lexicographic resources, such as WordNet, or on encyclopedic resources, such as Wikipedia. We propose a vector representation technique that combines the complementary knowledge of both these types of resource. Thanks to its use of explicit semantics combined with a novel cluster-based dimensionality reduction and an effective weighting scheme, our representation attains state-of-the-art performance on multiple datasets in two standard benchmarks: word similarity and sense clustering. We are releasing our vector representations at http://lcl.uniroma1.it/nasari/

CiteSeerX

Crossref

Online Research @ Cardiff

Archivio della ricerca- Università di Roma La Sapienza

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

Author: CAMACHO COLLADOS Jose'
Iacobacci IGNACIO JAVIER
Mancini Massimiliano
Navigli Roberto
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017
Field of study

Word embeddings are widely used in Nat-ural Language Processing, mainly due totheir success in capturing semantic infor-mation from massive corpora. However,their creation process does not allow thedifferent meanings of a word to be auto-matically separated, as it conflates theminto a single vector. We address this issueby proposing a new model which learnsword and sense embeddings jointly. Ourmodel exploits large corpora and knowl-edge from semantic networks in order toproduce a unified vector space of wordand sense embeddings. We evaluate themain features of our approach both qual-itatively and quantitatively in a variety oftasks, highlighting the advantages of theproposed method in comparison to state-of-the-art word- and sense-based models

arXiv.org e-Print Archive

Crossref

Online Research @ Cardiff

Archivio della ricerca- Università di Roma La Sapienza