
    THE CHILD AND THE WORLD: How Children Acquire Language

    HOW CHILDREN ACQUIRE LANGUAGE Over the last few decades, research into child language acquisition has been revolutionized by the use of ingenious new techniques which allow one to investigate what infants (that is, children not yet able to speak) can in fact perceive when exposed to a stream of speech sound, and the discriminations they can make between different speech sounds, different speech sound sequences and different words. However, on the central features of the mystery, the extraordinarily rapid acquisition of the lexicon and of complex syntactic structures, little solid progress has been made. The questions being researched are how infants acquire and produce the speech sounds (phonemes) of the community language; how infants find words in the stream of speech; and how they link words to perceived objects or actions, that is, discover meanings. In a recent general review in Nature of children's language acquisition, Patricia Kuhl also asked why we do not learn new languages as easily at 50 as at 5, and why computers have not cracked the human linguistic code. The motor theory of language function and origin makes possible a plausible account of child language acquisition generally, from which answers to these further questions can also be derived. Why computers have so far been unable to 'crack' the language problem becomes apparent in the light of the motor theory account: computers can have no natural relation between words and their meanings; they have no conceptual store to which the network of words is linked, nor do they have the innate aspects of language functioning represented by function words; computers have no direct links between speech sounds and movement patterns, and they do not have the instantly integrated neural patterning underlying thought, since they necessarily operate serially and hierarchically. Adults find the acquisition of a new language much more difficult than children do because they are already neurally committed to the link between the words of their first language and the elements in their conceptual store. A second language being acquired by an adult is in direct competition for neural space with the network structures established for the first language.

    Post-stroke pure apraxia of speech – A rare experience

    Apraxia of speech (AOS) is a motor speech disorder, most typically caused by stroke, which in its “pure” form (without other speech-language deficits) is very rare in clinical practice. Because some observable characteristics of AOS overlap with those of more common neurologic syndromes of verbal communication (i.e. aphasia, dysarthria), distinguishing them may be difficult. The present study describes AOS in a 49-year-old right-handed male after a left-hemispheric stroke. Analysis of his articulatory and prosodic abnormalities in the context of intact communicative abilities, as well as a description of symptom dynamics over time, provides valuable information for the clinical diagnosis of this specific disorder and the prognosis for its recovery. This in turn is the basis for the selection of appropriate rehabilitative interventions.

    Treatment of non-fluent aphasia through melody, rhythm and formulaic language

    Left-hemisphere stroke patients often suffer a profound loss of spontaneous speech, known as non-fluent aphasia. Yet many patients are still able to sing entire pieces of text fluently. This striking finding has inspired two main research questions. If the experimental design focuses on one point in time (cross section), one may ask whether or not singing facilitates speech production in aphasic patients. If the design focuses on changes over several points in time (longitudinal section), one may ask whether or not singing qualifies as a therapy to aid recovery from aphasia. The present work addresses both of these questions based on two separate experiments. A cross-sectional experiment investigated the relative effects of melody, rhythm, and lyric type on speech production in seventeen patients with non-fluent aphasia. The experiment controlled for vocal frequency variability, pitch accuracy, rhythmicity, syllable duration, phonetic complexity and other influences, such as learning effects and the acoustic setting. Contrary to earlier reports, the cross-sectional results suggest that singing may not benefit speech production in non-fluent aphasic patients over and above rhythmic speech. Previous divergent findings could very likely be due to effects of the acoustic setting, insufficient control for syllable duration, and language-specific stress patterns. However, the data reported here indicate that rhythmic pacing may be crucial, particularly for patients with lesions including the basal ganglia. Overall, basal ganglia lesions accounted for more than fifty percent of the variance related to rhythmicity. The findings suggest that benefits typically attributed to singing in the past may actually have their roots in rhythm. Moreover, the results demonstrate that lyric type may have a profound impact on speech production in non-fluent aphasic patients. Among the studied patients, lyric familiarity and formulaic language appeared to strongly mediate speech production, regardless of whether patients were singing or speaking rhythmically. Lyric familiarity and formulaic language may therefore help to explain effects that have, up until now, been presumed to result from singing. A longitudinal experiment investigated the relative long-term effects of melody and rhythm on the recovery of formulaic and non-formulaic speech. Fifteen patients with chronic non-fluent aphasia underwent either singing therapy, rhythmic therapy, or standard speech therapy. The experiment controlled for vocal frequency variability, phonatory quality, pitch accuracy, syllable duration, phonetic complexity and other influences, such as the acoustic setting and learning effects induced by the testing itself. The longitudinal results suggest that singing and rhythmic speech may be similarly effective in the treatment of non-fluent aphasia. Both singing and rhythmic therapy patients made good progress in the production of common, formulaic phrases, which are known to be supported by right corticostriatal brain areas. This progress occurred at an early stage of both therapies and was stable over time. Moreover, relatives of the patients reported that they were using a fixed number of formulaic phrases successfully in communicative contexts. Independent of whether patients had received singing or rhythmic therapy, they were able to switch easily between singing and rhythmic speech at any time. Conversely, patients receiving standard speech therapy made less progress in the production of formulaic phrases.
They did, however, improve their production of unrehearsed, non-formulaic utterances, in contrast to singing and rhythmic therapy patients, who did not. In light of these results, it may be worth considering the combined use of standard speech therapy and the training of formulaic phrases, whether sung or rhythmically spoken. This combination may yield better results for speech recovery than either therapy alone. Overall, treatment and lyric type accounted for about ninety percent of the variance related to speech recovery in the data reported here. The present work delivers three main results. First, it may not be singing itself that aids speech production and speech recovery in non-fluent aphasic patients, but rhythm and lyric type. Second, the findings may challenge the view that singing causes a transfer of language function from the left to the right hemisphere. Moving beyond this left-right hemisphere dichotomy, the current results are consistent with the idea that rhythmic pacing may partly bypass corticostriatal damage. Third, the data support the claim that non-formulaic utterances and formulaic phrases rely on different neural mechanisms, suggesting a two-path model of speech recovery. Standard speech therapy focusing on non-formulaic, propositional utterances may engage, in particular, left perilesional brain regions, while training of formulaic phrases may open new ways of tapping into right-hemisphere language resources, even without singing.

    Modelling the effects of speech rate variation for automatic speech recognition

    Wrede B. Modelling the effects of speech rate variation for automatic speech recognition. Bielefeld (Germany): Bielefeld University; 2002. In automatic speech recognition it is a widely observed phenomenon that variations in speech rate cause severe degradations of speech recognition performance. This is due to the fact that standard stochastic speech recognition systems specialise in average speech rate. Although many approaches to modelling speech rate variation have been made, an integrated approach in a substantial system has yet to be developed. General approaches to rate modelling are based on rate-dependent models which are trained with rate-specific subsets of the training data. During decoding, a signal-based rate estimation is performed, according to which the set of rate-dependent models is selected. While such approaches are able to reduce the word error rate significantly, they suffer from shortcomings such as the reduction of training data and the expensive training and decoding procedure. However, phonetic investigations show that there is a systematic relationship between speech rate and the acoustic characteristics of speech. In fast speech a tendency towards reduction can be observed, which can be described in more detail as a centralisation effect and an increase in coarticulation. Centralisation means that the formant frequencies of vowels tend to shift towards the vowel space centre, while increased coarticulation denotes the tendency of the spectral features of a vowel to shift towards those of its phonemic neighbour. The goal of this work is to investigate the possibility of incorporating knowledge of the systematic nature of the influence of speech rate variation on the acoustic features into speech rate modelling. In an acoustic-phonetic analysis of a large corpus of spontaneous speech it was shown that an increased degree of the two effects of centralisation and coarticulation can be found in fast speech. Several measures for these effects were developed and used in speech recognition experiments with rate-dependent models. A thorough investigation of rate-dependent models showed that with duration- and coarticulation-based measures significant increases in performance could be achieved. It was shown that by the use of different measures the models were adapted either to centralisation or to coarticulation. Further experiments showed that a more detailed modelling with more rate classes yields a further improvement. It was also observed that a general basis for the models is needed before rate adaptation can be performed. In a comparison with other sources of acoustic variation it was shown that the effects of speech rate are as severe as those of speaker variation and environmental noise. All these results show that for a more substantial system that models rate variations accurately it is necessary to focus on both durational and spectral effects. The systematic nature of the effects indicates that continuous modelling is possible.
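    To make the centralisation effect described above more concrete, the sketch below computes one possible centralisation score: the mean Euclidean distance of vowel tokens, given as (F1, F2) pairs, from a speaker's vowel space centre. Smaller values would indicate stronger centralisation, as reported for fast speech. The function name, the formant values and the use of Euclidean distance are illustrative assumptions, not the measures actually developed in the thesis.

    import numpy as np

    def centralisation_score(f1_f2_tokens, vowel_space_centre):
        # Mean Euclidean distance (Hz) of vowel tokens from the speaker's
        # vowel space centre; smaller values mean the tokens sit closer to
        # the centre, i.e. stronger centralisation.
        tokens = np.asarray(f1_f2_tokens, dtype=float)
        centre = np.asarray(vowel_space_centre, dtype=float)
        return float(np.linalg.norm(tokens - centre, axis=1).mean())

    # Hypothetical /a/ tokens from one speaker and an assumed vowel space centre
    tokens = [(750.0, 1300.0), (690.0, 1350.0), (710.0, 1280.0)]
    centre = (500.0, 1500.0)
    print(round(centralisation_score(tokens, centre), 1))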

    Human vocal attractiveness as signaled by body size projection

    Voice, as a secondary sexual characteristic, is known to affect the perceived attractiveness of human individuals. But the underlying mechanism of vocal attractiveness has remained unclear. Here, we presented human listeners with acoustically altered natural sentences and fully synthetic sentences with systematically manipulated pitch, formants and voice quality based on a principle of body size projection reported for animal calls and emotional human vocal expressions. The results show that male listeners preferred a female voice that signals a small body size, with relatively high pitch, wide formant dispersion and breathy voice, while female listeners preferred a male voice that signals a large body size with low pitch and narrow formant dispersion. Interestingly, however, male vocal attractiveness was also enhanced by breathiness, which presumably softened the aggressiveness associated with a large body size. These results, together with the additional finding that the same vocal dimensions also affect emotion judgment, indicate that humans still employ a vocal interaction strategy used in animal calls despite the development of complex language
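    The formant dispersion manipulated in these stimuli can be quantified, under the commonly used definition, as the average spacing between adjacent formants: wider spacing goes with a shorter vocal tract and thus projects a smaller body size, narrower spacing a larger one. The sketch below is a minimal illustration with invented formant values, not the stimulus-generation procedure used in the study.

    def formant_dispersion(formants_hz):
        # Average spacing (Hz) between adjacent formant frequencies.
        fs = sorted(formants_hz)
        gaps = [hi - lo for lo, hi in zip(fs, fs[1:])]
        return sum(gaps) / len(gaps)

    # Hypothetical F1-F4 values for a "small-body" and a "large-body" voice
    print(formant_dispersion([850, 2100, 3200, 4400]))   # wide dispersion (~1183 Hz)
    print(formant_dispersion([600, 1600, 2500, 3400]))   # narrow dispersion (~933 Hz)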

    The Production of Emotional Prosody in Varying Severities of Apraxia of Speech

    One mild AOS speaker, one moderate AOS speaker and one control speaker were asked to produce utterances with different emotional intent. In Experiment 1, the three subjects were asked to produce sentences with a happy, sad, or neutral intent through a repetition task. In Experiment 2, the three subjects were asked to produce sentences with either a happy or sad intent through a picture elicitation task. Paired t-tests comparing data from the acoustic analyses of each subject's utterances revealed significant differences in F0, duration, and intensity characteristics between the happy and sad sentences of the control speaker. There were no significant differences in the acoustic characteristics of the productions of the AOS speakers, suggesting that the AOS subjects were unable to volitionally produce the acoustic parameters that help convey emotion. Two more experiments were designed to determine whether naïve listeners could hear the acoustic cues that signal emotion in all three speakers. In Experiment 3, naïve listeners were asked to identify the sentences produced in Experiment 1 as happy, sad, or neutral. In Experiment 4, naïve listeners were asked to identify the sentences produced in Experiment 2 as either happy or sad. Chi-square findings revealed that the naïve listeners were able to identify the emotional differences of the control speaker and that the correct identification was not by chance. The naïve listeners could not distinguish between the emotional utterances of the mild or moderate AOS speakers. Higher percentages of correct identification in certain sentences over others were artifacts attributed to either chance (the naïve listeners were guessing) or a response strategy (when in doubt, the naïve listeners chose neutral or sad). The findings from Experiments 3 and 4 corroborate the acoustic findings from Experiments 1 and 2. In addition to the four structured experiments, spontaneous samples of happy, sad, and neutral utterances were collected and compared to the sentences produced in Experiments 1 and 2. Comparisons between the elicited and spontaneous sentences indicated that the moderate AOS subject was able to produce variations of F0 and duration similar to those that would be produced by normal speakers conveying emotion (Banse & Scherer, 1996; Lieberman & Michaels, 1962; Scherer, 1988). The mild AOS subject was unable to produce prosodic differences between happy and sad emotion. This study found that although these AOS subjects were unable to produce acoustic parameters during elicited speech that signal emotion, they were able to produce somewhat more variation in the acoustic properties of F0 and duration, especially in the moderate AOS speaker. However, any meaningful variation pattern that would convey emotion (such as that seen in the control subject) was not found. These findings suggest that the AOS subjects probably convey emotion non-verbally (e.g., facial expression, muscle tension, body language).
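    A minimal sketch of the kind of paired comparison described above: mean F0 per sentence for happy versus sad productions by the same speaker, compared with a paired t-test. The values are invented for illustration and do not reproduce the study's data.

    from scipy.stats import ttest_rel

    # Hypothetical mean F0 (Hz) per sentence, each sentence produced once with
    # happy and once with sad intent by the same speaker, so the observations
    # are paired by sentence.
    happy_f0 = [212.0, 225.0, 208.0, 231.0, 219.0, 224.0]
    sad_f0   = [188.0, 195.0, 183.0, 201.0, 190.0, 197.0]

    t_stat, p_value = ttest_rel(happy_f0, sad_f0)
    # A small p-value would suggest the happy and sad productions differ
    # reliably in mean F0, as was found for the control speaker.
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")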

    Explaining the PENTA model: a reply to Arvaniti and Ladd

    This paper presents an overview of the Parallel Encoding and Target Approximation (PENTA) model of speech prosody, in response to an extensive critique by Arvaniti & Ladd (2009). PENTA is a framework for conceptually and computationally linking communicative meanings to fine-grained prosodic details, based on an articulatory-functional view of speech. Target Approximation simulates the articulatory realisation of underlying pitch targets – the prosodic primitives in the framework. Parallel Encoding provides an operational scheme that enables simultaneous encoding of multiple communicative functions. We also outline how PENTA can be computationally tested with a set of software tools. With the help of one of the tools, we offer a PENTA-based hypothetical account of the Greek intonational patterns reported by Arvaniti & Ladd, showing how it is possible to predict the prosodic shapes of an utterance based on the lexical and postlexical meanings it conveys
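    To make the Target Approximation idea concrete, the sketch below simulates f0 movement toward successive static pitch targets, with each syllable starting from the final state of the previous one. It is a deliberately simplified first-order approximation written for illustration; the published quantitative Target Approximation implementation uses a third-order critically damped system and also handles dynamic (sloped) targets.

    import numpy as np

    def approach_target(f0_start, target, duration_s, rate=40.0, fs=200):
        # First-order exponential approach of f0 (Hz) toward a static pitch
        # target over one syllable, sampled at fs Hz.
        t = np.arange(0.0, duration_s, 1.0 / fs)
        return target + (f0_start - target) * np.exp(-rate * t)

    # A high target followed by a low target; the second syllable starts from
    # wherever the first one ended, as in Target Approximation.
    syl1 = approach_target(f0_start=180.0, target=220.0, duration_s=0.2)
    syl2 = approach_target(f0_start=syl1[-1], target=160.0, duration_s=0.2)
    contour = np.concatenate([syl1, syl2])
    print(contour[0], contour[len(syl1) - 1], contour[-1])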

    An exploration of the rhythm of Malay

    In recent years there has been a surge of interest in speech rhythm. However, we still lack a clear understanding of the nature of rhythm and of rhythmic differences across languages. Various metrics have been proposed as means of measuring rhythm on the phonetic level and making typological comparisons between languages (Ramus et al., 1999; Grabe & Low, 2002; Dellwo, 2006), but debate is ongoing about the extent to which these metrics capture the rhythmic basis of speech (Arvaniti, 2009; Fletcher, in press). Furthermore, cross-linguistic studies of rhythm have covered a relatively small number of languages, and research on previously unclassified languages is necessary to fully develop the typology of rhythm. This study examines the rhythmic features of Malay, for which, to date, relatively little work has been carried out on aspects of rhythm and timing. The material for the analysis comprised 10 sentences produced by 20 speakers of standard Malay (10 males and 10 females). The recordings were first analysed using the rhythm metrics proposed by Ramus et al. (1999) and Grabe & Low (2002). These metrics (∆C, %V, rPVI, nPVI) are based on durational measurements of vocalic and consonantal intervals. The results indicated that Malay clustered with other so-called syllable-timed languages like French and Spanish on the basis of all metrics. However, underlying the overall findings for these metrics there was a large degree of variability in values across speakers and sentences, with some speakers having values in the range typical of stress-timed languages like English. Further analysis was carried out in light of Fletcher’s (in press) argument that measurements based on duration do not wholly reflect speech rhythm, as there are many other factors that can influence the values of consonantal and vocalic intervals, and Arvaniti’s (2009) suggestion that other features of speech should also be considered in descriptions of rhythm to discover what contributes to listeners’ perception of regularity. Spectrographic analysis of the Malay recordings brought to light two parameters that displayed consistency and regularity for all speakers and sentences: the duration of individual vowels and the duration of intervals between intensity minima. This poster presents the results of these investigations and points to connections between the features which seem to be consistently regulated in the timing of Malay connected speech and aspects of Malay phonology. The results are discussed in light of the current debate on descriptions of rhythm.
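    For reference, the interval-based metrics named above can be computed directly from vocalic and consonantal interval durations. The sketch below follows the usual definitions of %V and ∆C (Ramus et al., 1999) and of rPVI and nPVI (Grabe & Low, 2002); the interval durations are invented for illustration and are not taken from the Malay corpus.

    import numpy as np

    def rhythm_metrics(vocalic, consonantal):
        # Interval-based rhythm metrics from durations in seconds:
        #   %V     - proportion of total duration that is vocalic
        #   deltaC - standard deviation of consonantal interval durations
        #   rPVI   - raw Pairwise Variability Index over consonantal intervals
        #   nPVI   - normalised Pairwise Variability Index over vocalic intervals
        v = np.asarray(vocalic, dtype=float)
        c = np.asarray(consonantal, dtype=float)
        percent_v = 100.0 * v.sum() / (v.sum() + c.sum())
        delta_c = c.std()
        rpvi = np.mean(np.abs(np.diff(c)))
        npvi = 100.0 * np.mean(np.abs(np.diff(v)) / ((v[:-1] + v[1:]) / 2.0))
        return percent_v, delta_c, rpvi, npvi

    # Hypothetical vocalic and consonantal interval durations (s) from one sentence
    vocalic = [0.08, 0.11, 0.09, 0.12, 0.10]
    consonantal = [0.07, 0.06, 0.08, 0.07]
    print(rhythm_metrics(vocalic, consonantal))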