
4 research outputs found

Clustering of Russian Adjective-Noun Constructions using Word Embeddings

Author: Kutuzov Andrey
Kuzmenko Elizaveta
Pivovarova Lidia
Publication venue: The Association for Computational Linguistics
Publication date: 01/01/2017

Peer reviewed

Crossref

Helsingin yliopiston digitaalinen arkisto

NORA - Norwegian Open Research Archives

Russian word sense induction by clustering averaged word embeddings

Author: Kutuzov Andrey
Publication date: 01/01/2018

The paper reports our participation in the shared task on word sense induction and disambiguation for the Russian language (RUSSE-2018). Our team was ranked 2nd for the wiki-wiki dataset (containing mostly homonyms) and 5th for the bts-rnc and active-dict datasets (containing mostly polysemous words) among all 19 participants. The method we employed was extremely naive. It involved representing contexts of ambiguous words as averaged word embedding vectors, using off-the-shelf pre-trained distributional models. These vector representations were then clustered with mainstream clustering techniques, producing groups that correspond to the senses of the ambiguous word. As a side result, we show that word embedding models trained on small but balanced corpora can be superior to those trained on large but noisy data, not only in intrinsic evaluation but also in downstream tasks like word sense induction. Comment: Proceedings of the 24th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue-2018)
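
The pipeline described in this abstract (average the word vectors of each context, then cluster the averages) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional `TOY_EMBEDDINGS`, the English example words, the helper names, and the choice of a plain k-means loop with a fixed number of clusters. The abstract only says "mainstream clustering techniques" and pre-trained models, so this is not the authors' implementation.

```python
from math import dist

# Hypothetical toy 2-d embeddings (assumption). A real system would load
# an off-the-shelf pre-trained distributional model instead.
TOY_EMBEDDINGS = {
    "river": (0.9, 0.1), "water": (0.8, 0.2), "shore": (1.0, 0.0),
    "money": (0.1, 0.9), "loan": (0.2, 0.8), "account": (0.0, 1.0),
}


def average_embedding(tokens, embeddings):
    """Return the mean vector of the tokens found in `embeddings`, or None."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as the initial centroids."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return labels


# Contexts of a hypothetical ambiguous word (e.g. "bank"), target word removed.
contexts = [
    ["river", "shore"],
    ["money", "loan"],
    ["water", "river"],
    ["account", "loan"],
]
vectors = [average_embedding(c, TOY_EMBEDDINGS) for c in contexts]
labels = kmeans(vectors, k=2)
print(labels)
```

Each resulting label stands for one induced sense: contexts that share a label are treated as uses of the same sense. The number of clusters is fixed by hand here for simplicity; how the number of senses was chosen in the paper is not stated in the abstract.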

arXiv.org e-Print Archive

NORA - Norwegian Open Research Archives

Clustering Ideological Terms in Historical Newspaper Data with Diachronic Word Embeddings

Author: Kurunmäki Jussi
Marjanen Jani
Pivovarova Lidia
Zosa Elaine
Publication venue: Rheinisch-Westfaelische Technische Hochschule Aachen
Publication date: 01/01/2019

Peer reviewed

TamPub Julkaisuarkisto - TamPub Institutional Repository

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Helsingin yliopiston digitaalinen arkisto

Trepo - Institutional Repository of Tampere University

Clustering ideological terms in historical newspaper data with diachronic word embeddings

Author: Kurunmäki Jussi
Marjanen Jani
Pivovarova Lidia
Zosa Elaine
Publication venue: CEUR-WS
Publication date: 13/11/2019

Trepo - Institutional Repository of Tampere University