Search CORE

41 research outputs found

Evaluating multi-sense embeddings for semantic resolution monolingually and in word translation

Authors: Borbély Gábor, Kornai András, Makrai Márton, Nemeskey Dávid Márk
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

SZTAKI Publication Repository

Repository of the Academy's Library

Exploiting Cross-Lingual Representations For Natural Language Processing

Author: Upadhyay Shyam
Publication venue: ScholarlyCommons
Publication date: 01/01/2019

Traditional approaches to supervised learning require a generous amount of labeled data for good generalization. While such annotation-heavy approaches have proven useful for some Natural Language Processing (NLP) tasks in high-resource languages (like English), they are unlikely to scale to languages where collecting labeled data is difficult and time-consuming. Translating supervision available in English is also not a viable solution, because developing a good machine translation system requires expensive-to-annotate resources which are not available for most languages. In this thesis, I argue that cross-lingual representations are an effective means of extending NLP tools to languages beyond English without resorting to generous amounts of annotated data or expensive machine translation. These representations can be learned in an inexpensive manner, often from signals completely unrelated to the task of interest. I begin with a review of different ways of inducing such representations using a variety of cross-lingual signals and study algorithmic approaches of using them in a diverse set of downstream tasks. Examples of such tasks covered in this thesis include learning representations to transfer a trained model across languages for document classification, assist in monolingual lexical semantics like word sense induction, identify asymmetric lexical relationships like hypernymy between words in different languages, or combining supervision across languages through a shared feature space for cross-lingual entity linking. In all these applications, the representations make information expressed in other languages available in English, while requiring minimal additional supervision in the language of interest.

ScholarlyCommons@Penn

Bilingual learning of multi-sense embeddings with discrete autoencoders

Authors: Titov I., van Noord G., Šuster S.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

International Migration, Integration and Social Cohesion online publications

Az EFNILEX és egy fiatal kutató. Hat év magyar szóbeágyazásokkal [EFNILEX and a young researcher: six years with Hungarian word embeddings]

Author: Makrai Márton
Publication venue: Nyelvtudományi Kutatóközpont
Publication date: 01/01/2021

Repository of the Academy's Library

SimLex-999: Evaluating Semantic Models with (Genuine) Similarity Estimation

Authors: Hill Felix, Korhonen Anna, Reichart Roi
Publication date: 14/08/2014

We present SimLex-999, a gold standard resource for evaluating distributional semantic models that improves on existing resources in several important ways. First, in contrast to gold standards such as WordSim-353 and MEN, it explicitly quantifies similarity rather than association or relatedness, so that pairs of entities that are associated but not actually similar [Freud, psychology] have a low rating. We show that, via this focus on similarity, SimLex-999 incentivizes the development of models with a different, and arguably wider range of applications than those which reflect conceptual association. Second, SimLex-999 contains a range of concrete and abstract adjective, noun and verb pairs, together with an independent rating of concreteness and (free) association strength for each pair. This diversity enables fine-grained analyses of the performance of models on concepts of different types, and consequently greater insight into how architectures can be improved. Further, unlike existing gold standard evaluations, for which automatic approaches have reached or surpassed the inter-annotator agreement ceiling, state-of-the-art models perform well below this ceiling on SimLex-999. There is therefore plenty of scope for SimLex-999 to quantify future improvements to distributional semantic models, guiding the development of the next generation of representation-learning architectures.

arXiv.org e-Print Archive

CiteSeerX

Do multi-sense word embeddings learn more senses?

Author: Makrai Márton
Publication venue: Research Institute for Linguistics, Hungarian Academy of Sciences (RIL HAS)
Publication date: 01/01/2017

Repository of the Academy's Library

Empirical studies on word representations

Author: Suster Simon
Publication venue: Rijksuniversiteit Groningen
Publication date: 01/01/2016

Dissertations of the University of Groningen

ARTS repository - University of Groningen

Bilingual Learning of Multi-sense Embeddings with Discrete Autoencoders

Authors: Knight K., Nenkova A., Rambow O., Titov I., van Noord G., Šuster S.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

We present an approach to learning multi-sense word embeddings relying both on monolingual and bilingual information. Our model consists of an encoder, which uses monolingual and bilingual context (i.e. a parallel sentence) to choose a sense for a given word, and a decoder which predicts context words based on the chosen sense. The two components are estimated jointly. We observe that the word representations induced from bilingual data outperform the monolingual counterparts across a range of evaluation tasks, even though crosslingual information is not available at test time.

University of Groningen

Edinburgh Research Explorer

UvA-DARE

International Migration, Integration and Social Cohesion online publications

arXiv.org e-Print Archive

Crossref

Proceedings - University of Groningen

ARTS repository - University of Groningen

Institutional Repository Universiteit Antwerpen

Dissertations of the University of Groningen