Combined optimization of feature selection and algorithm parameters in machine learning of language
Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing (i) to investigate which machine learning algorithms have the 'right bias' to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that with the methodology currently used in comparative machine learning experiments, the results may often not be reliable because of the role of, and interaction between, feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach and more reliable comparisons.
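The joint optimization described above can be sketched with a toy genetic algorithm. Everything below is hypothetical: the chromosome packs one bit per candidate feature plus an index into a small parameter grid, and the fitness function is a synthetic stand-in for the cross-validated WSD accuracy a real experiment would use:

```python
import random

random.seed(0)  # reproducibility of this sketch

FEATURES = 8                  # number of candidate features (hypothetical)
PARAM_CHOICES = [1, 3, 5, 7]  # e.g. a k-NN parameter grid (hypothetical)

def random_individual():
    # chromosome: one bit per feature + an index into PARAM_CHOICES
    return [random.randint(0, 1) for _ in range(FEATURES)] \
        + [random.randrange(len(PARAM_CHOICES))]

def fitness(ind):
    # synthetic stand-in for cross-validated accuracy: rewards one
    # particular feature subset and parameter value so the GA converges
    target = [1, 0, 1, 1, 0, 0, 1, 0]
    score = sum(a == b for a, b in zip(ind[:FEATURES], target))
    return score + (2 if PARAM_CHOICES[ind[-1]] == 5 else 0)

def evolve(pop_size=30, generations=40):
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]      # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(a))
            child = a[:cut] + b[cut:]         # one-point crossover
            if random.random() < 0.1:         # bit-flip mutation on a feature
                child[random.randrange(FEATURES)] ^= 1
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

In a real experiment, fitness(ind) would train the learner on the selected feature subset with the chosen parameter value and return held-out accuracy, so feature selection and parameter choice are searched jointly rather than in separate sweeps.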
Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation
The success of deep learning methods hinges on the availability of large
training datasets annotated for the task of interest. In contrast to human
intelligence, these methods lack versatility and struggle to learn and adapt
quickly to new tasks, where labeled data is scarce. Meta-learning aims to solve
this problem by training a model on a large number of few-shot tasks, with an
objective to learn new tasks quickly from a small number of examples. In this
paper, we propose a meta-learning framework for few-shot word sense
disambiguation (WSD), where the goal is to learn to disambiguate unseen words
from only a few labeled instances. Meta-learning approaches have so far
typically been tested in an N-way, K-shot classification setting, where each
task has N classes with K examples per class. Owing to its nature, WSD deviates
from this controlled setup and requires the models to handle a large number of
highly unbalanced classes. We extend several popular meta-learning approaches
to this scenario, and analyze their strengths and weaknesses in this new
challenging setting.
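The deviation from the controlled N-way, K-shot setup can be made concrete with a small episode-sampling sketch. The sense-annotated triples and the make_episode helper below are hypothetical; the point is that senses of a word are unbalanced, so query sets are uneven across classes:

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical toy sense-annotated data: (target_word, sense, context) triples.
DATA = [
    ("bank", "bank%finance", "deposit money at the bank"),
    ("bank", "bank%finance", "the bank raised interest rates"),
    ("bank", "bank%river",   "fishing on the river bank"),
    ("bass", "bass%fish",    "caught a large bass"),
    ("bass", "bass%music",   "the bass line was loud"),
    ("bass", "bass%music",   "he plays bass guitar"),
]

def make_episode(word, k_shot=1):
    """Build one few-shot WSD task for `word`: a support set with up to
    k_shot examples per sense and a query set with the remainder. Unlike
    the balanced N-way, K-shot setup, senses here may have very different
    frequencies, so some senses contribute fewer query examples."""
    by_sense = defaultdict(list)
    for w, sense, ctx in DATA:
        if w == word:
            by_sense[sense].append((ctx, sense))
    support, query = [], []
    for sense, examples in by_sense.items():
        random.shuffle(examples)
        support.extend(examples[:k_shot])
        query.extend(examples[k_shot:])
    return support, query

support, query = make_episode("bass", k_shot=1)
```

A meta-learner would be trained over many such word-level episodes and then evaluated on episodes built for words unseen during meta-training.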
GAMBL, genetic algorithm optimization of memory-based WSD
GAMBL is a word expert approach to WSD in which each word expert is trained using memory-based learning. Joint feature selection and algorithm parameter optimization are achieved with a genetic algorithm (GA). We use a cascaded classifier approach in which the GA optimizes local context features and the output of a separate keyword classifier (rather than also optimizing the keyword features together with the local context features). A further innovation over earlier versions of memory-based WSD is the use of grammatical relation and chunk features. This paper briefly presents the architecture of the system and discusses its performance on the English lexical sample and all-words tasks in SENSEVAL-3.
Word sense disambiguation criteria: a systematic study
This article describes the results of a systematic in-depth study of the
criteria used for word sense disambiguation. Our study is based on 60 target
words: 20 nouns, 20 adjectives and 20 verbs. Our results are not always in line
with some practices in the field. For example, we show that omitting
non-content words decreases performance and that bigrams yield better results
than unigrams.
Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation
Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD), despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to learn theories from first-order logic representations, which allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.
ShotgunWSD: An unsupervised algorithm for global word sense disambiguation inspired by DNA sequencing
In this paper, we present a novel unsupervised algorithm for word sense
disambiguation (WSD) at the document level. Our algorithm is inspired by a
widely-used approach in the field of genetics for whole genome sequencing,
known as the Shotgun sequencing technique. The proposed WSD algorithm is based
on three main steps. First, a brute-force WSD algorithm is applied to short
context windows (up to 10 words) selected from the document in order to
generate a short list of likely sense configurations for each window. In the
second step, these local sense configurations are assembled into longer
composite configurations based on suffix and prefix matching. The resulting
configurations are ranked by their length, and the sense of each word is chosen
based on a voting scheme that considers only the top k configurations in which
the word appears. We compare our algorithm with other state-of-the-art
unsupervised WSD algorithms and demonstrate better performance, sometimes by a
very large margin. We also show that our algorithm can yield better performance
than the Most Common Sense (MCS) baseline on one data set. Moreover, our
algorithm has a very small number of parameters, is robust to parameter tuning,
and, unlike other bio-inspired methods, it gives a deterministic solution (it
does not involve random choices).
Comment: In Proceedings of EACL 201
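The assembly and voting steps of the algorithm can be sketched as follows. The merge/assemble/vote helpers and the toy sense configurations are hypothetical, and the brute-force window step that would produce the local configurations is omitted; each configuration is a tuple of (position, sense) assignments:

```python
from collections import Counter, defaultdict

def merge(a, b, min_overlap=2):
    """Join two configurations when a suffix of `a` equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a[-n:] == b[:n]:
            return a + b[n:]
    return None

def assemble(local_configs, min_overlap=2):
    """Keep the local configurations and add every pairwise suffix/prefix merge."""
    merged = list(local_configs)
    for a in local_configs:
        for b in local_configs:
            if a is not b:
                m = merge(a, b, min_overlap)
                if m:
                    merged.append(m)
    return merged

def vote(configs, top_k=3):
    """Pick each position's sense by majority vote over the top_k longest
    composite configurations that cover it (positions covered only by
    lower-ranked configurations get no vote in this simplified sketch)."""
    votes = defaultdict(Counter)
    for cfg in sorted(configs, key=len, reverse=True)[:top_k]:
        for pos, sense in cfg:
            votes[pos][sense] += 1
    return {pos: c.most_common(1)[0][0] for pos, c in votes.items()}

# hypothetical local configurations from three overlapping context windows
w1 = ((0, "s0a"), (1, "s1a"), (2, "s2a"))
w2 = ((1, "s1a"), (2, "s2a"), (3, "s3a"))
w3 = ((2, "s2b"), (3, "s3a"), (4, "s4a"))

composites = assemble([w1, w2, w3])
senses = vote(composites)
```

Here w1 and w2 agree on their two-position overlap and merge into a length-4 composite, while w3 disagrees on position 2 and stays unmerged; the longer composite then dominates the vote, mirroring how Shotgun sequencing assembles overlapping reads into longer contigs.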