32,388 research outputs found
Teaching Pronunciation from the Top Down
In this paper, a theoretical and pedagogical foundation for research efforts is provided. Pronunciation is examined from a contextual, "top-down" perspective from which segmental articulation assumes less importance than more general properties of speech such as rhythm and voice quality. Pronunciation is described as conveying many different types of messages to a hearer related to the information structure of a discourse, the speaker's attitude and mood, and other social and psychological features of the speaker or of the relationship between the speaker and hearer. Moreover, various aspects of pronunciation are shown to relate to specific gestures.
The aim is to present a more descriptively enlightening and pedagogically useful characterization of second language phonology than traditional treatments, in which phonology was identified with discrete articulations and in which suprasegmental features were relegated to the periphery of language per se, i.e., to the paralinguistic and in some cases the extralinguistic domains of communication. Suggestions for teaching pronunciation are set in a context of research and theory, and a focus on the non-segmental characteristics of speech is advocated. The discussion refers to the use of video and computer media in pronunciation training (see Pennington forthcoming for further discussion), as well as to the use of more traditional types of audiovisual aids. The paper concludes with a set of research questions on pronunciation instruction derived from this investigation.
Characterizing intonation deficit in motor speech disorders: an autosegmental-metrical analysis of spontaneous speech in hypokinetic dysarthria, ataxic dysarthria and foreign accent syndrome
The autosegmental-metrical (AM) framework represents an established methodology for intonational analysis in unimpaired speaker populations but has found little application in describing intonation in motor speech disorders (MSDs). This study compared the intonation patterns of unimpaired participants (CON) and those with Parkinson's disease (PD), ataxic dysarthria (AT), and foreign accent syndrome (FAS) to evaluate the approach's potential for distinguishing types of MSDs from each other and from unimpaired speech. Spontaneous speech from 8 PD, 8 AT, 4 FAS, and 10 CON speakers was analyzed in relation to inventory and prevalence of pitch patterns, accentuation, and phrasing. Acoustic-phonetic baseline measures (maximum phonation duration, speech rate, and F0 variability) were also taken. The analyses yielded differences between MSD and CON groups and between the clinical groups in regard to prevalence, accentuation, and phrasing. AT and FAS speakers used more rising and high pitch accents than PD and CON speakers. The AT group used the highest number of pitch accents per phrase, and all 3 MSD groups produced significantly shorter phrases than the CON group. The study succeeded in differentiating MSDs on the basis of intonational performance by using the AM approach, thus demonstrating its potential for charting intonational profiles in clinical populations.
Establishing diagnostic criteria: the role of clinical pragmatics
The study of pragmatic disorders is of interest to speech-language pathologists who have a professional responsibility to assess and treat communication impairments. However, it will be argued in this paper that these disorders have a significance beyond the clinical management of clients with communication impairments. Specifically, pragmatic disorders can now make a contribution to the diagnosis of a range of clinical conditions in which communication is adversely affected. These conditions include attention deficit hyperactivity disorder (ADHD), the autistic spectrum disorders, schizophrenia and the dementias. Pragmatic disorders are already among the criteria used to diagnose some of these conditions (e.g. ADHD), although they are not described in these terms. In other conditions (e.g. the dementias), pragmatic disorders have potential diagnostic value in the absence of reliable biomarkers of these conditions and in the presence of similar initial presenting symptoms. Using clinical data and the findings of empirical studies, the case is made for the inclusion and/or greater integration of pragmatic disorders in the formal classificatory systems that are used to diagnose a range of disorders. A previously unrecognised role for pragmatic impairments in the nosology and diagnosis of clinical disorders is thereby established.
Investigation of Frame Alignments for GMM-based Digit-prompted Speaker Verification
Frame alignments can be computed by different methods in GMM-based speaker
verification. By incorporating a phonetic Gaussian mixture model (PGMM), we are
able to compare the performance using alignments extracted from the deep neural
networks (DNN) and the conventional hidden Markov model (HMM) in digit-prompted
speaker verification. Based on the different characteristics of these two
alignments, we present a novel content verification method to improve the
system security without much computational overhead. Our experiments on the
RSR2015 Part-3 digit-prompted task show that the DNN-based alignment performs
on par with the HMM alignment. The results also demonstrate the effectiveness
of the proposed Kullback-Leibler (KL) divergence-based scoring to reject speech
with incorrect pass-phrases. Comment: accepted by APSIPA ASC 201
Finding the Most Uniform Changes in Vowel Polygon Caused by Psychological Stress
Vowel polygons, or more precisely their parameters, are chosen as the criterion for capturing differences between a speaker's normal state and the same speaker's speech under real psychological stress. All results were obtained experimentally with software created for vowel polygon analysis, applied to the ExamStress database. Six selected methods based on cross-correlation of different features were classified by the coefficient of variation, and for each individual vowel polygon an efficiency coefficient marking the most significant and uniform differences between stressed and normal speech was calculated. The best method for observing the resulting differences proved to be the one that takes the mean of the cross-correlation values obtained for the difference area value with the vector length and angle parameter couples. In general, the best results for stress detection are achieved by the /i/-/o/-/u/ and /a/-/i/-/o/ vowel triangles in formant planes containing the fifth formant F5 combined with other formants.
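Two quantities named in this abstract, the area of a vowel polygon and the coefficient of variation, can be illustrated with a short Python sketch. This is not the authors' software: the function names are hypothetical, the area is computed with the standard shoelace formula on vertices in an arbitrary two-formant plane, and the coefficient of variation uses the sample standard deviation, which is an assumption about the convention.

```python
import numpy as np


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    `vertices` is a sequence of (x, y) points listed in order around the
    polygon, e.g. one (formant_a, formant_b) pair per vowel.
    """
    v = np.asarray(vertices, dtype=float)
    x, y = v[:, 0], v[:, 1]
    # Sum of cross products of consecutive vertices, wrapping around.
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed convention)."""
    a = np.asarray(values, dtype=float)
    return a.std(ddof=1) / a.mean()


if __name__ == "__main__":
    # Placeholder coordinates for illustration only, not measured formants.
    normal = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
    stressed = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
    print("area difference:", polygon_area(normal) - polygon_area(stressed))
```

A lower coefficient of variation across speakers for a given feature would indicate a more uniform change, which is how the abstract appears to rank the methods.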
Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification
There are a number of studies about extraction of bottleneck (BN) features
from deep neural networks (DNNs) trained to discriminate speakers, pass-phrases
and triphone states for improving the performance of text-dependent speaker
verification (TD-SV). However, only moderate success has been achieved. A recent
study [1] presented a time contrastive learning (TCL) concept to explore the
non-stationarity of brain signals for classification of brain states. Speech
signals have a similar non-stationarity property, and TCL further has the
advantage of having no need for labeled data. We therefore present a TCL based
BN feature extraction method. The method uniformly partitions each speech
utterance in a training dataset into a predefined number of multi-frame
segments. Each segment in an utterance corresponds to one class, and class
labels are shared across utterances. DNNs are then trained to discriminate all
speech frames among the classes to exploit the temporal structure of speech. In
addition, we propose a segment-based unsupervised clustering algorithm to
re-assign class labels to the segments. TD-SV experiments were conducted on the
RedDots challenge database. The TCL-DNNs were trained using speech data of
fixed pass-phrases that were excluded from the TD-SV evaluation set, so the
learned features can be considered phrase-independent. We compare the
performance of the proposed TCL bottleneck (BN) feature with those of
short-time cepstral features and BN features extracted from DNNs discriminating
speakers, pass-phrases, speaker+pass-phrase, as well as monophones whose labels
and boundaries are generated by three different automatic speech recognition
(ASR) systems. Experimental results show that the proposed TCL-BN outperforms
cepstral features and speaker+pass-phrase discriminant BN features, and its
performance is on par with that of ASR-derived BN features. Moreover, ... Comment: Copyright (c) 2019 IEEE. Personal use of this material is permitted.
Permission from IEEE must be obtained for all other uses, in any current or
future media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted component of
this work in other work.
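The TCL labelling step described in this abstract (uniformly partition each utterance into a predefined number of multi-frame segments, with the segment index serving as a class label shared across utterances) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `tcl_segment_labels` is hypothetical, and the rounding rule used when the frame count is not divisible by the number of segments is an assumption, since the abstract does not specify it.

```python
import numpy as np


def tcl_segment_labels(num_frames: int, num_segments: int) -> np.ndarray:
    """Return one class label per frame for a single utterance.

    Frame t is assigned to segment floor(t * num_segments / num_frames),
    so segments are as equal in length as possible and the labels
    0 .. num_segments - 1 mean the same thing in every utterance.
    """
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    if num_frames < num_segments:
        raise ValueError("utterance is shorter than the number of segments")
    frame_index = np.arange(num_frames)
    return (frame_index * num_segments) // num_frames


if __name__ == "__main__":
    # Two utterances of different lengths share the same label set.
    for n in (8, 10):
        print(n, tcl_segment_labels(n, num_segments=4).tolist())
```

The resulting per-frame labels would then serve as training targets for a frame-level DNN classifier. The clustering-based re-assignment of labels that the abstract also proposes is not covered by this sketch.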
Determination: a universal dimension for inter-language comparison (preliminary version)
The basic idea I want to develop and to substantiate in this paper consists in replacing, where necessary, the traditional concept of linguistic category or linguistic relation understood as 'things', as reified hypostases, by the more dynamic concept of dimension. A dimension of language structure is not coterminous with one single category or relation but, instead, accommodates several of them. It corresponds to certain well circumscribed purposive functions of linguistic activity as well as to certain definite principles and techniques for satisfying these functions. The true universals of language are represented by these dimensions, principles, and techniques, which constitute the true basis for non-historical inter-language comparison. The categories and relations used in grammar are condensations (hypostases, as it were) of such dimensions, principles, and techniques. Elsewhere I have outlined the theory which I want to test here in a case study.
A Linguistic Specification of Aesthetic Judgments
This paper aims to delineate the class of aesthetic judgments linguistically. The main idea is that aesthetic judgments can be specified by a certain set of assertibility conditions, i.e., by norms that govern appropriate speech acts. This idea is spelled out in detail and defended against various objections. The suggestion leads to an interesting account of aesthetic judgments that is theoretically fruitful: it provides the basis for a non-circular and satisfying characterization of the whole domain of aesthetic research, and it marks an important linguistic difference between aesthetic judgments and judgments of personal taste.