A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation
The effectiveness of Neural Machine Translation (NMT) models largely depends on the vocabulary used at training: small vocabularies can lead to out-of-vocabulary problems, while large ones can lead to memory issues. Subword (SW) tokenization has been successfully employed to mitigate these issues. The choice of vocabulary and SW tokenization has a significant impact on both training and fine-tuning an NMT model. Fine-tuning is a common practice for adapting an MT model to new data. However, new data potentially introduces new words (or tokens), which, if not taken into consideration, may lead to suboptimal performance. In addition, the distribution of tokens in the new data can differ from that of the original data, so the original SW tokenization model may be less suitable for the new data. In this work, through a systematic empirical evaluation, we compare different strategies for SW tokenization and vocabulary generation with the ultimate goal of uncovering an optimal setting for fine-tuning a domain-specific model. Furthermore, we developed several (in-domain) models, the best of which achieves a 6 BLEU point improvement over the baseline.
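To make the kind of setting being compared concrete, below is a minimal sketch of training and applying a BPE subword model on in-domain data with the SentencePiece library. The toolkit, file names, and vocabulary size are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: train a BPE subword model on an in-domain corpus
# and tokenize new text with it. Corpus path and vocab size are
# hypothetical; the paper compares such settings empirically.
import sentencepiece as spm

# Train a BPE model on the in-domain corpus (one sentence per line).
spm.SentencePieceTrainer.train(
    input="in_domain.txt",       # hypothetical corpus file
    model_prefix="bpe_indomain",
    vocab_size=16000,            # illustrative size, not the paper's choice
    model_type="bpe",
)

# Tokenize a new sentence with the trained subword model.
sp = spm.SentencePieceProcessor(model_file="bpe_indomain.model")
print(sp.encode("The locomotive requires scheduled maintenance.", out_type=str))
```

Retraining the SW model on the new data, as sketched here, is one end of the trade-off the abstract describes; the alternative is to reuse the original tokenization model, which may segment in-domain terms poorly.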
Inference and learning in probabilistic logic programs using weighted Boolean formulas
Probabilistic logic programs are logic programs in which some of the facts are annotated with probabilities. This paper investigates how classical inference and learning tasks known from the graphical model community can be tackled for probabilistic logic programs. Several such tasks, such as computing marginals given evidence and learning from (partial) interpretations, had not previously been addressed for probabilistic logic programs. The first contribution of this paper is a suite of efficient algorithms for various inference tasks, based on converting the program, the queries, and the evidence to a weighted Boolean formula. This allows us to reduce inference tasks to well-studied problems such as weighted model counting, which can be solved using state-of-the-art methods from the graphical model and knowledge compilation literature. The second contribution is an algorithm for parameter estimation in the learning-from-interpretations setting. The algorithm employs expectation-maximization and is built on top of the developed inference algorithms. The proposed approach is experimentally evaluated. The results show that the inference algorithms improve upon the state of the art in probabilistic logic programming, and that it is indeed possible to learn the parameters of a probabilistic logic program from interpretations.
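To illustrate the reduction the abstract describes, here is a brute-force sketch of weighted model counting (WMC) over a tiny weighted Boolean formula. The two-variable formula and its weights are invented for illustration; practical systems compile the formula (e.g. into d-DNNF) rather than enumerating assignments.

```python
# Minimal sketch of weighted model counting by enumeration.
# Variables, formula, and weights are hypothetical examples.
from itertools import product

# Literal weights: w[variable][truth value].
weights = {
    "burglary":   {True: 0.1, False: 0.9},
    "earthquake": {True: 0.2, False: 0.8},
}

def formula(assignment):
    # The query "alarm": true if burglary or earthquake holds.
    return assignment["burglary"] or assignment["earthquake"]

def wmc(variables, phi):
    # Sum, over all satisfying assignments, the product of literal weights.
    total = 0.0
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if phi(assignment):
            weight = 1.0
            for var, val in assignment.items():
                weight *= weights[var][val]
            total += weight
    return total

# P(alarm) = WMC(formula) = 1 - 0.9 * 0.8 = 0.28
print(wmc(list(weights), formula))
```

In this reduction, a conditional probability such as P(query | evidence) is then the ratio of two such counts: the WMC of the formula conjoined with the query, divided by the WMC of the evidence.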
First Steps Towards a Signing Avatar for Railway Travel Announcements in the Netherlands
This paper presents first steps towards a sign language avatar for communicating railway travel announcements in Dutch Sign Language. Taking an interdisciplinary approach, it demonstrates effective ways to employ co-design and focus group methods when developing sign language technology. It also presents several concrete findings obtained through the co-design and focus group sessions, which have not only led to improvements of our own prototype but may also inform the development of signing avatars for other languages and in other application domains.