6 research outputs found

    Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution

    Lexical substitution, i.e. generation of plausible words that can replace a particular target word in a given context, is an extremely powerful technology that can be used as a backbone of various NLP applications, including word sense induction and disambiguation, lexical relation extraction, data augmentation, etc. In this paper, we present a large-scale comparative study of lexical substitution methods employing both older and the most recent language and masked language models (LMs and MLMs), such as context2vec, ELMo, BERT, RoBERTa, and XLNet. We show that the already competitive results achieved by SOTA LMs/MLMs can be further substantially improved if information about the target word is injected properly. Several existing and new target word injection methods are compared for each LM/MLM using both intrinsic evaluation on lexical substitution datasets and extrinsic evaluation on word sense induction (WSI) datasets. On two WSI datasets we obtain new SOTA results. In addition, we analyze the types of semantic relations between target words and their substitutes generated by different models or given by annotators.
    Comment: arXiv admin note: text overlap with arXiv:2006.0003
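
    The idea of injecting target word information can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `rerank_with_target`, the `temperature` parameter, the toy probabilities, and the two-dimensional embeddings are all assumptions made up for illustration. The sketch multiplies a context-only substitute distribution (as a masked LM might produce when the target is hidden) by a softmax over cosine similarity to the target word, then renormalizes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rerank_with_target(context_probs, target_vec, embeddings, temperature=0.1):
    """Combine context-based probabilities with similarity to the target word.

    Hypothetical helper: each candidate's context probability is multiplied by
    a softmax-normalized similarity to the target, and the result is renormalized.
    """
    sims = {
        w: math.exp(cosine(target_vec, embeddings[w]) / temperature)
        for w in context_probs
    }
    z = sum(sims.values())
    scores = {w: context_probs[w] * sims[w] / z for w in context_probs}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}


# Toy data (invented): substitutes for "bright" in "She is a bright student."
# With the target masked, the context alone prefers generic adjectives.
context_probs = {"good": 0.40, "new": 0.35, "smart": 0.25}
embeddings = {
    "good": [1.0, 0.2],
    "new": [-0.5, 0.1],
    "smart": [0.9, 1.0],
}
target_vec = [1.0, 1.0]  # assumed embedding of the target word "bright"

reranked = rerank_with_target(context_probs, target_vec, embeddings)
best_context_only = max(context_probs, key=context_probs.get)
best_with_target = max(reranked, key=reranked.get)
print(best_context_only, best_with_target)
```

    In this toy setup the context-only ranking favors "good", while the combined score favors "smart", the candidate closest to the target. A lower `temperature` gives the target similarity more weight relative to the context probabilities.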

    SUBSTITUTION ON THE FAMOUS INDONESIAN NOVEL “AYAT AYAT CINTA” WRITTEN BY HABIBURRAHMAN SAEROZI

    After one of my lecturers, Professor Sumarlam, explained substitution to me, I became interested in investigating substitution in a famous novel. I chose “Ayat Ayat Cinta”, a novel written by Habiburrahman Saerozi. It is very famous in Indonesia; it is not only interesting to read but also educational, showing how to love someone from a Moslem perspective. It conveys many moral values, telling a story of true love, honesty, pride, and struggle. The data I analyzed came only from the section titled “Gadis Mesir Itu Bernama Maria”. The aim of this study is to describe the substitution used by the great writer. I am confident that it will contribute to the understanding of substitution. The underlying theory is from Sudaryanto (1993:15), who stated that there are four kinds of substitution: equal quality substitution, definite readdressing substitution, nominal predicative substitution, and pronominal substitution. I analyzed the data by describing the substitution used according to this theory. The investigation produced four findings: 1) there are no nominal predicative substitutions, while pronominal substitution is the most frequently used; 2) one word can substitute for two words; 3) the writer uses different forms of address for friends and for teachers to show respect; 4) the subject aku (I) is sometimes not written out in full in Indonesian.

    Language Transfer Learning for Supervised Lexical Substitution

    We propose a framework for lexical substitution that is able to perform transfer learning across languages. Datasets for this task are available in at least three languages (English, Italian, and German). Previous work has addressed each of these tasks in isolation. In contrast, we regard the union of three shared tasks as a combined multilingual dataset. We show that a supervised system can be trained effectively, even if training and evaluation data are from different languages. Successful transfer learning between languages suggests that the learned model is in fact independent of the underlying language. We combine state-of-the-art unsupervised features obtained from syntactic word embeddings and distributional thesauri in a supervised delexicalized ranking system. Our system improves over the state of the art in the full lexical substitution task in all three languages.