844 research outputs found
Neural Machine Translation into Language Varieties
Both research and commercial machine translation have so far neglected the
importance of properly handling the spelling, lexical and grammar divergences
occurring among language varieties. Notable cases are standard national
varieties such as Brazilian and European Portuguese, and Canadian and European
French, which popular online machine translation services are not keeping
distinct. We show that an evident side effect of modeling such varieties as
unique classes is the generation of inconsistent translations. In this work, we
investigate the problem of training neural machine translation from English to
specific pairs of language varieties, assuming both labeled and unlabeled
parallel texts, and low-resource conditions. We report experiments from English
to two pairs of dialects, EuropeanBrazilian Portuguese and European-Canadian
French, and two pairs of standardized varieties, Croatian-Serbian and
Indonesian-Malay. We show significant BLEU score improvements over baseline
systems when translation into similar languages is learned as a multilingual
task with shared representations.Comment: Published at EMNLP 2018: third conference on machine translation (WMT
2018
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview
We present a structured overview of adaptation algorithms for neural
network-based speech recognition, considering both hybrid hidden Markov model /
neural network systems and end-to-end neural network systems, with a focus on
speaker adaptation, domain adaptation, and accent adaptation. The overview
characterizes adaptation algorithms as based on embeddings, model parameter
adaptation, or data augmentation. We present a meta-analysis of the performance
of speech recognition adaptation algorithms, based on relative error rate
reductions as reported in the literature.Comment: Submitted to IEEE Open Journal of Signal Processing. 30 pages, 27
figure
ArTST: Arabic Text and Speech Transformer
We present ArTST, a pre-trained Arabic text and speech transformer for
supporting open-source speech technologies for the Arabic language. The model
architecture follows the unified-modal framework, SpeechT5, that was recently
released for English, and is focused on Modern Standard Arabic (MSA), with
plans to extend the model for dialectal and code-switched Arabic in future
editions. We pre-trained the model from scratch on MSA speech and text data,
and fine-tuned it for the following tasks: Automatic Speech Recognition (ASR),
Text-To-Speech synthesis (TTS), and spoken dialect identification. In our
experiments comparing ArTST with SpeechT5, as well as with previously reported
results in these tasks, ArTST performs on a par with or exceeding the current
state-of-the-art in all three tasks. Moreover, we find that our pre-training is
conducive for generalization, which is particularly evident in the low-resource
TTS task. The pre-trained model as well as the fine-tuned ASR and TTS models
are released for research use.Comment: 11 pages, 1 figure, SIGARAB ArabicNLP 202
On the Robustness of Arabic Speech Dialect Identification
Arabic dialect identification (ADI) tools are an important part of the
large-scale data collection pipelines necessary for training speech recognition
models. As these pipelines require application of ADI tools to potentially
out-of-domain data, we aim to investigate how vulnerable the tools may be to
this domain shift. With self-supervised learning (SSL) models as a starting
point, we evaluate transfer learning and direct classification from SSL
features. We undertake our evaluation under rich conditions, with a goal to
develop ADI systems from pretrained models and ultimately evaluate performance
on newly collected data. In order to understand what factors contribute to
model decisions, we carry out a careful human study of a subset of our data.
Our analysis confirms that domain shift is a major challenge for ADI models. We
also find that while self-training does alleviate this challenges, it may be
insufficient for realistic conditions
Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech
Language barriers present a great challenge in our increasingly connected and
global world. Especially within the medical domain, e.g. hospital or emergency
room, communication difficulties and delays may lead to malpractice and
non-optimal patient care. In the HYKIST project, we consider patient-physician
communication, more specifically between a German-speaking physician and an
Arabic- or Vietnamese-speaking patient. Currently, a doctor can call the
Triaphon service to get assistance from an interpreter in order to help
facilitate communication. The HYKIST goal is to support the usually
non-professional bilingual interpreter with an automatic speech translation
system to improve patient care and help overcome language barriers. In this
work, we present our ASR system development efforts for this conversational
telephone speech translation task in the medical domain for two languages
pairs, data collection, various acoustic model architectures and
dialect-induced difficulties.Comment: ASR System Paper for HYKIST projec
- …