Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
The goal of this work is to design a machine translation (MT) system for a
low-resource family of dialects, collectively known as Swiss German, which are
widely spoken in Switzerland but seldom written. We collected a significant
number of parallel written resources to start with, up to a total of about 60k
words. Moreover, we identified several other promising data sources for Swiss
German. Then, we designed and compared three strategies for normalizing Swiss
German input in order to address the regional diversity. We found that
character-based neural MT was the best solution for text normalization. In
combination with phrase-based statistical MT, our solution reached 36% BLEU
score when translating from the Bernese dialect. This value, however, decreases
as the testing data becomes more remote from the training one, geographically
and topically. These resources and normalization techniques are a first step
towards full MT of Swiss German dialects.
Comment: 11th Language Resources and Evaluation Conference (LREC), 7-12 May 2018, Miyazaki (Japan)
MultiMWE: building a multi-lingual multi-word expression (MWE) parallel corpora
Multi-word expressions (MWEs) are a hot topic in natural language processing (NLP) research, including MWE detection, MWE decomposition, and the exploitation of MWEs in other NLP fields such as Machine Translation. However, the availability of bilingual or multi-lingual MWE corpora is very limited. The only bilingual MWE corpus that we are aware of is from the PARSEME (PARSing and Multi-word Expressions) EU project. This is a small collection of only 871 pairs of English-German MWEs. In this paper, we present multi-lingual and bilingual MWE corpora that we have extracted from root parallel corpora. After filtering, our collections contain 3,159,226 and 143,042 bilingual MWE pairs for German-English and Chinese-English respectively. We examine the quality of these extracted bilingual MWEs in MT experiments. Our initial experiments applying MWEs in MT show improved translation performance on MWE terms in qualitative analysis and better general evaluation scores in quantitative analysis, on both German-English and Chinese-English language pairs. We follow a standard experimental pipeline to create our MultiMWE corpora, which are available online. Researchers can use these free corpora in their own models or in a knowledge base as model features.
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
Filler model based confidence measures for spoken dialogue systems: a case study for Turkish
Because of the inadequate performance of speech recognition systems, an accurate confidence scoring mechanism should be employed to understand user requests correctly. To determine a confidence score for a hypothesis, certain confidence features are combined. The performance of filler-model based confidence features has been investigated. Five types of filler model networks were defined: triphone-network; phone-network; phone-class network; 5-state catch-all model; 3-state catch-all model. First, all models were evaluated in a Turkish speech recognition task in terms of their ability to correctly tag recognition hypotheses (as recognition-error or correct). The best performance was obtained from the triphone recognition network. Then, the performances of reliable combinations of these models were investigated and it was observed that certain combinations of filler models could significantly improve the accuracy of the confidence annotation.
A computational model for studying L1’s effect on L2 speech learning
Much evidence has shown that the first language (L1) plays an important role in the formation of the L2 phonological system during the second language (L2) learning process. This, combined with the fact that different L1s have distinct phonological patterns, indicates diverse L2 speech learning outcomes for speakers from different L1 backgrounds. This dissertation hypothesizes that phonological distances between accented speech and speakers' L1 speech are also correlated with perceived accentedness, and that the correlations are negative for some phonological properties. Moreover, contrastive phonological distinctions between L1s and L2 will manifest themselves in the accented speech produced by speakers from these L1s. To test the hypotheses, this study develops a computational model to analyze the properties of accented speech in both segmental (short-term speech measurements on short-segment or phoneme level) and suprasegmental (long-term speech measurements on word, long-segment, or sentence level) feature space. The benefit of using a computational model is that it enables quantitative analysis of L1's effect on accent in terms of different phonological properties. The core parts of this computational model are feature extraction schemes that extract pronunciation and prosody representations of accented speech based on existing techniques in the speech processing field. Correlation analysis on both segmental and suprasegmental feature space is conducted to look into the relationship between acoustic measurements related to L1s and perceived accentedness across several L1s. Multiple regression analysis is employed to investigate how the L1's effect impacts the perception of foreign accent, and how accented speech produced by speakers from different L1s behaves distinctly on segmental and suprasegmental feature spaces.
Results reveal the potential of the methodology in this study to provide quantitative analysis of accented speech and to extend current studies in L2 speech learning theory to large scale. Practically, this study further shows that the proposed computational model can benefit automatic accentedness evaluation systems by adding features related to speakers' L1s.
Dissertation/Thesis
Doctoral Dissertation Speech and Hearing Science 201
Towards automatic assessment of spontaneous spoken English
With increasing global demand for learning English as a second language, there has been considerable interest in
methods of automatic assessment of spoken language proficiency for use in interactive electronic learning tools as
well as for grading candidates for formal qualifications. This paper presents an automatic system to address the
assessment of spontaneous spoken language. Prompts or questions requiring spontaneous speech responses elicit
more natural speech which better reflects a learner’s proficiency level than read speech. In addition to the challenges
of highly variable non-native, learner, speech and noisy real-world recording conditions, this requires any automatic
system to handle disfluent, non-grammatical, spontaneous speech with the underlying text unknown. To handle these,
a strong deep learning based speech recognition system is applied in combination with a Gaussian Process (GP)
grader. A range of features derived from the audio using the recognition hypothesis are investigated for their efficacy
in the automatic grader. The proposed system is shown to predict grades at a similar level to the original examiner
graders on real candidate entries. Interpolation with the examiner grades further boosts performance. The ability to
reject poorly estimated grades is also important and measures are proposed to evaluate the performance of rejection
schemes. The GP variance is used to decide which automatic grades should be rejected. Back-off to an expert grader
for the least confident grades gives gains.
Cambridge Assessment English
Transfer Learning for Speech Recognition on a Budget
End-to-end training of automated speech recognition (ASR) systems requires
massive data and compute resources. We explore transfer learning based on model
adaptation as an approach for training ASR models under constrained GPU memory,
throughput and training data. We conduct several systematic experiments
adapting a Wav2Letter convolutional neural network originally trained for
English ASR to the German language. We show that this technique allows faster
training on consumer-grade resources while requiring less training data in
order to achieve the same accuracy, thereby lowering the cost of training ASR
models in other languages. Model introspection revealed that small adaptations
to the network's weights were sufficient for good performance, especially for
inner layers.
Comment: Accepted for 2nd ACL Workshop on Representation Learning for NLP