Cross-lingual alignments of ELMo contextual embeddings
Building machine learning prediction models for a specific NLP task requires
sufficient training data, which can be difficult to obtain for less-resourced
languages. Cross-lingual embeddings map word embeddings from a less-resourced
language to a resource-rich language so that a prediction model trained on data
from the resource-rich language can also be used in the less-resourced
language. To produce cross-lingual mappings of recent contextual embeddings,
anchor points between the embedding spaces have to be words in the same
context. We address this issue with a novel method for creating cross-lingual
contextual alignment datasets. Based on that, we propose several cross-lingual
mapping methods for ELMo embeddings. The proposed linear mapping methods use
existing Vecmap and MUSE alignments on contextual ELMo embeddings. Novel
nonlinear ELMoGAN mapping methods are based on GANs and do not assume
isomorphic embedding spaces. We evaluate the proposed mapping methods on nine
languages, using four downstream tasks: named entity recognition (NER),
dependency parsing (DP), terminology alignment, and sentiment analysis. The
ELMoGAN methods perform very well on the NER and terminology alignment tasks,
with a lower cross-lingual loss for NER compared to the direct training on some
languages. In DP and sentiment analysis, linear contextual alignment variants
are more successful.
Comment: 30 pages, 5 figures
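The linear mapping step can be sketched with the closed-form orthogonal Procrustes solution, which is the kind of alignment tools like Vecmap and MUSE learn from anchor pairs. Below is a minimal sketch on synthetic data, assuming paired anchor embeddings have already been extracted; the function name and data are illustrative, not from the paper:

```python
import numpy as np

def procrustes_map(X, Y):
    """Learn an orthogonal map W minimizing ||X W - Y||_F.

    X: (n, d) anchor embeddings in the less-resourced language,
    Y: (n, d) embeddings of the same anchors in the resource-rich language.
    Closed-form solution via SVD (orthogonal Procrustes).
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Synthetic check: hide a random rotation and recover it from anchor pairs.
rng = np.random.default_rng(0)
W_true = np.linalg.qr(rng.normal(size=(4, 4)))[0]  # hidden orthogonal map
X = rng.normal(size=(100, 4))
Y = X @ W_true
W = procrustes_map(X, Y)
print(np.allclose(X @ W, Y, atol=1e-8))  # the rotation is recovered
```

In practice X and Y would hold contextual ELMo vectors of the anchor words in their matched contexts; the GAN-based ELMoGAN variants drop the orthogonality assumption precisely because real embedding spaces need not be isomorphic.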
FinEst BERT and CroSloEngual BERT: less is more in multilingual models
Large pretrained masked language models have become state-of-the-art
solutions for many NLP problems. Research has, however, mostly focused on the
English language. While massively multilingual models exist, studies have
shown that monolingual models produce much better results. We train two
trilingual BERT-like models, one for Finnish, Estonian, and English, the other
for Croatian, Slovenian, and English. We evaluate their performance on several
downstream tasks, NER, POS-tagging, and dependency parsing, using the
multilingual BERT and XLM-R as baselines. The newly created FinEst BERT and
CroSloEngual BERT improve the results on all tasks in most monolingual and
cross-lingual situations.
Comment: 10 pages, accepted at the TSD 2020 conference
Exploring the Relations Between Net Benefits of IT Projects and CIOs’ Perception of Quality of Software Development Disciplines
Software development enterprises are under constant pressure to improve their management techniques and development processes. These comprise several software development methodology (SDM) disciplines, such as requirements acquisition, design, coding, and testing, that must be continuously improved and individually tailored to suit specific software development projects. The paper proposes a methodology that enables the identification of SDM discipline quality categories and the evaluation of SDM disciplines’ net benefits. It advances the evaluation of software process quality from evaluating a single quality category to evaluating multiple quality categories, as proposed by the Kano model. An exploratory study was conducted to test the proposed methodology. Its results show that different types of Kano quality are present in individual SDM disciplines and that applications of individual SDM disciplines vary considerably in their relation to the net benefits of IT projects. Consequently, software process quality evaluation models should evaluate multiple categories of quality instead of just one and should not assume that the application of every individual SDM discipline has the same effect on the enterprise’s net benefits.
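The Kano model classifies a quality attribute from paired answers to a functional question ("How do you feel if the discipline is applied well?") and a dysfunctional one ("How do you feel if it is not?"). Here is a minimal sketch using the standard Kano evaluation table; the survey answers and function names are illustrative, not the paper's actual instrument:

```python
from collections import Counter

# Standard Kano evaluation table: row = answer to the functional question,
# column = answer to the dysfunctional question.
ANSWERS = ["like", "must-be", "neutral", "live-with", "dislike"]
KANO_TABLE = [
    # like   must-be  neutral  live-with  dislike
    ["Q",    "A",     "A",     "A",       "O"],  # like
    ["R",    "I",     "I",     "I",       "M"],  # must-be
    ["R",    "I",     "I",     "I",       "M"],  # neutral
    ["R",    "I",     "I",     "I",       "M"],  # live-with
    ["R",    "R",     "R",     "R",       "Q"],  # dislike
]

def kano_category(functional, dysfunctional):
    """Map one respondent's answer pair to a Kano quality category:
    A=attractive, O=one-dimensional, M=must-be, I=indifferent,
    R=reverse, Q=questionable."""
    return KANO_TABLE[ANSWERS.index(functional)][ANSWERS.index(dysfunctional)]

def dominant_category(pairs):
    """Majority vote over respondents for one SDM discipline."""
    return Counter(kano_category(f, d) for f, d in pairs).most_common(1)[0][0]

# e.g. three respondents rating the "testing" discipline (illustrative data)
votes = [("like", "dislike"), ("like", "dislike"), ("neutral", "dislike")]
print(dominant_category(votes))  # "O": one-dimensional quality
```

Evaluating each SDM discipline this way yields per-discipline quality categories, which can then be related to the net benefits of IT projects as the methodology proposes.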
Number of Instances for Reliable Feature Ranking in a Given Problem
Background: In practical use of machine learning models, users may add new features to an existing classification model, reflecting their (changed) empirical understanding of a field. New features can increase the classification accuracy of the model or improve its interpretability. Objectives: We introduce a guideline for determining the sample size needed to reliably estimate the impact of a new feature. Methods/Approach: Our approach is based on the feature evaluation measure ReliefF and on bootstrap-based estimation of confidence intervals for feature ranks. Results: We test our approach on real-world qualitative business-to-business sales forecasting data and two UCI data sets, one with missing values. The results show that new features with a high or a low rank can be detected using a relatively small number of instances, but features ranked near the border of useful features need larger samples to determine their impact. Conclusions: A combination of the feature evaluation measure ReliefF and bootstrap-based estimation of confidence intervals can be used to reliably estimate the impact of a new feature in a given problem.
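The combination of ReliefF and bootstrapped rank intervals can be sketched as follows: compute Relief-style feature weights on bootstrap resamples, convert the weights to ranks, and read off percentile confidence intervals per feature. This is a minimal numeric sketch (basic Relief with a single nearest hit/miss, rather than the full ReliefF, which averages over k neighbours and handles missing values); all names and the synthetic data are illustrative:

```python
import numpy as np

def relief_weights(X, y):
    """Basic Relief feature weights: for each instance, reward features that
    differ on the nearest miss and penalize those that differ on the
    nearest hit (numeric features, L1 distance)."""
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0) + 1e-12  # normalize per feature
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(X - X[i]).sum(axis=1)
        dist[i] = np.inf                      # exclude the instance itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += (np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])) / span
    return w / n

def rank_confidence(X, y, n_boot=200, alpha=0.05, seed=0):
    """Bootstrap percentile confidence intervals for feature ranks
    (rank 1 = most important)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    ranks = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)           # resample instances
        w = relief_weights(X[idx], y[idx])
        ranks.append(np.argsort(np.argsort(-w)) + 1)
    ranks = np.array(ranks)
    lo = np.percentile(ranks, 100 * alpha / 2, axis=0)
    hi = np.percentile(ranks, 100 * (1 - alpha / 2), axis=0)
    return lo, hi

# Synthetic check: feature 0 tracks the class, feature 1 is pure noise.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 120)
X = np.column_stack([y + 0.1 * rng.normal(size=120), rng.normal(size=120)])
lo, hi = rank_confidence(X, y, n_boot=50)
print(lo[0], hi[0])  # feature 0's rank interval should be tight at rank 1
```

A clearly informative feature gets a tight interval from few instances, while a borderline feature's interval stays wide, which is the sample-size signal the guideline builds on.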