The acquisition of English L2 prosody by Italian native speakers: experimental data and pedagogical implications
This paper investigates Yes-No question intonation patterns in English L2, Italian L1, and
English L1. The aim is to test the hypothesis that L2 learners may show different
acquisition strategies for different dimensions of intonation, and particularly the
phonological and phonetic components. The study analyses the nuclear intonation
contours of 4 target English words and 4 comparable Italian words consisting of sonorant
segments, stressed on the penultimate or final syllable, and occurring in Yes-No questions
in sentence-final position (e.g., Will you attend the memorial?, Hai sentito la Melania?).
The words were contained in mini-dialogues of question-answer pairs, and read 5 times
by 4 Italian speakers (Padova area, North-East Italy) and 3 English female speakers
(London area, UK). The results show that: 1) different intonation patterns may be used to
realize the same grammatical function; 2) different developmental processes are at work,
including transfer of L1 categories and the acquisition of L2 phonological categories.
These results suggest that the phonetic dimension of L2 intonation may be more difficult
to learn than the phonological one.
Ordering the suggestions of a spellchecker without using context
Having located a misspelling, a spellchecker generally offers some suggestions for the intended word. Even without using context, a spellchecker can draw on various types of information in ordering its suggestions. A series of experiments is described, beginning with a basic corrector that implements a well-known algorithm for reversing single simple errors, and making successive enhancements to take account of substring matches, pronunciation, known error patterns, syllable structure and word frequency. The improvement in the ordering produced by each enhancement is measured on a large corpus of misspellings. The final version is tested on other corpora against a widely used commercial spellchecker and a research prototype.
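The single-error-reversal step and a frequency-based ordering can be sketched as follows (a minimal illustration of the well-known approach; the function names and toy frequency list are hypothetical, not the paper's actual corrector):

```python
# Generate candidate corrections by reversing one simple error
# (deletion, transposition, substitution, insertion), then order
# the in-lexicon candidates by corpus frequency.

def single_edit_candidates(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """All strings reachable by reversing one simple error in `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {l + r[1:] for l, r in splits if r}
    transposes = {l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1}
    substitutes = {l + c + r[1:] for l, r in splits if r for c in alphabet}
    inserts = {l + c + r for l, r in splits for c in alphabet}
    return deletes | transposes | substitutes | inserts

def rank_suggestions(word, lexicon_freq):
    """Keep candidates that are real words; order by descending frequency."""
    candidates = single_edit_candidates(word) & lexicon_freq.keys()
    return sorted(candidates, key=lambda w: -lexicon_freq[w])

freqs = {"the": 1000, "then": 120, "them": 90, "hen": 5}
print(rank_suggestions("thn", freqs))  # ['the', 'then']
```

The paper's enhancements (substring matches, pronunciation, error patterns, syllable structure) would each refine the sort key used in the final ordering step.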
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view, as well as from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
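As background, the classic translation-memory fuzzy match score is an edit distance normalised by segment length; a minimal sketch of that baseline (illustrative code, not SCATE's improved matcher):

```python
# Baseline fuzzy match: 1 - (edit distance / length of the longer segment).
# A TM hit with a score near 1.0 needs little post-editing.

def edit_distance(a, b):
    """Levenshtein distance via a rolling one-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def fuzzy_match(source, tm_segment):
    """Similarity in [0, 1] between a new source segment and a TM entry."""
    longest = max(len(source), len(tm_segment)) or 1
    return 1.0 - edit_distance(source, tm_segment) / longest

print(round(fuzzy_match("open the file", "open the files"), 2))  # 0.93
```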
Articulatory and bottleneck features for speaker-independent ASR of dysarthric speech
Rapid population aging has stimulated the development of assistive devices that provide personalized medical support to people suffering from various etiologies. One prominent clinical application is a computer-assisted speech training system that enables personalized speech therapy for patients with communicative disorders in their own home environment. Such a system relies on robust automatic speech recognition (ASR) technology to provide accurate articulation feedback. With the long-term aim of developing off-the-shelf ASR systems that can be incorporated in clinical contexts without prior speaker information, we compare the ASR performance of speaker-independent bottleneck and articulatory features on dysarthric speech, used in conjunction with dedicated neural network-based acoustic models that have been shown to be robust against spectrotemporal deviations. We report the ASR performance of these systems on two dysarthric speech datasets of different characteristics to quantify the achieved performance gains. Despite the remaining performance gap between dysarthric and normal speech, significant improvements are reported on both datasets using speaker-independent ASR architectures.
Comment: to appear in Computer Speech & Language - https://doi.org/10.1016/j.csl.2019.05.002 - arXiv admin note: substantial text overlap with arXiv:1807.1094
A deep learning approach to assessing non-native pronunciation of English using phone distances
The way a non-native speaker pronounces the phones of a language is an important predictor of their proficiency. In grading spontaneous speech, the pairwise distances between generative statistical models trained on each phone have been shown to be powerful features. This paper presents a deep learning alternative to model-based phone distances in the form of a tunable Siamese network feature extractor that extracts distance metrics directly from the audio frame sequence. Features are extracted at the phone instance level and combined into phone-level representations using an attention mechanism. Pairwise distances between phone features are then projected through a feed-forward layer to predict the score. The extraction stage is initialised on either a binary phone instance-pair classification task or to mimic the model-based features; then the whole system is fine-tuned end-to-end, optimising the learning of the distance metric for the score prediction task. This method is therefore more adaptable and more sensitive to phone instance-level phenomena. Its performance is compared against
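The weight-tied-branches idea behind a Siamese distance extractor can be sketched as follows (a toy linear encoder with hypothetical feature dimensions, standing in for the paper's tunable network and attention pooling):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 13))  # weights of the shared (toy) encoder

def encode(frames):
    """Shared branch: mean-pool the frame features, then project."""
    pooled = np.asarray(frames).mean(axis=0)  # (13,) pooled frame features
    return np.tanh(W @ pooled)                # (16,) embedding

def phone_distance(frames_a, frames_b):
    """Distance between embeddings from the two weight-tied branches."""
    return float(np.linalg.norm(encode(frames_a) - encode(frames_b)))

a = rng.standard_normal((20, 13))  # two hypothetical phone instances
b = rng.standard_normal((25, 13))
print(phone_distance(a, a))        # identical inputs give distance 0.0
print(phone_distance(a, b) > 0.0)  # distinct instances are separated
```

Because both branches share `W`, training the distance end-to-end shapes a single embedding space, which is what allows the extractor to be fine-tuned directly for score prediction.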
Reducing Audible Spectral Discontinuities
In this paper, a common problem in diphone synthesis is discussed, viz., the occurrence of audible discontinuities at diphone boundaries. Informal observations show that spectral mismatch is the most likely cause of this phenomenon. We first set out to find an objective spectral measure of discontinuity. To this end, several spectral distance measures are related to the results of a listening experiment. We then studied the feasibility of extending the diphone database with context-sensitive diphones to reduce the occurrence of audible discontinuities. The number of additional diphones is limited by clustering consonant contexts that have a similar effect on the surrounding vowels, on the basis of the best-performing distance measure. A listening experiment showed that the addition of these context-sensitive diphones significantly reduces the number of audible discontinuities.
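One simple instance of such a spectral distance measure can be sketched as follows (an assumption-laden illustration using Euclidean distance between cepstral frame vectors at the join; the paper compares several candidate measures against listener judgements):

```python
import numpy as np

def boundary_distance(left_unit_frames, right_unit_frames):
    """Spectral mismatch at a diphone join: Euclidean distance between the
    last frame of the left unit and the first frame of the right unit.
    Frames are hypothetical cepstral coefficient vectors."""
    last = np.asarray(left_unit_frames)[-1]
    first = np.asarray(right_unit_frames)[0]
    return float(np.linalg.norm(last - first))

left = np.array([[1.0, 2.0], [1.1, 2.1]])   # toy frames for two diphones
right = np.array([[1.1, 2.1], [1.0, 2.0]])
print(boundary_distance(left, right))  # 0.0: the frames at the join match
```

A measure like this could then drive the clustering of consonant contexts: contexts whose vowels yield small mutual distances share one context-sensitive diphone.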
Multi-feature Based Chinese-English Named Entity Extraction from Comparable Corpora
PACLIC 20 / Wuhan, China / 1-3 November, 200