2,029 research outputs found

    Prosodic modules for speech recognition and understanding in VERBMOBIL

    Within VERBMOBIL, a large project on spoken language research in Germany, two modules for detecting and recognizing prosodic events have been developed. One module operates on speech signal parameters and the word hypothesis graph, whereas the other module, designed for a novel, highly interactive architecture, uses only speech signal parameters as its input. Phrase boundaries, sentence modality, and accents are detected. In spontaneous dialogs, recognition rates reach up to 82.5% for accents and up to 91.7% for phrase boundaries.

    Strategies for focal accent detection in spontaneous speech

    In this paper, a new method for the detection of focus is developed. The speech data consist of German spontaneous speech from several speakers. At present, the algorithm uses only the fundamental frequency (F0) values. By computing a nonlinear reference line through significant anchor points in the F0 contour, the points of highest prominence are determined. The global recognition rate is 78.5% and the mean recognition rate is 66.6%.

    Emotional Speech Perception Unfolding in Time: The Role of the Basal Ganglia

    The basal ganglia (BG) have repeatedly been linked to emotional speech processing in studies involving patients with neurodegenerative and structural changes of the BG. However, the majority of previous studies did not consider that (i) emotional speech processing entails multiple processing steps, and (ii) the BG may engage in one rather than another of these processing steps. In the present study we investigate three different stages of emotional speech processing (emotional salience detection, meaning-related processing, and identification) in the same patient group to verify whether lesions to the BG affect these stages in a qualitatively different manner. Specifically, we explore early implicit emotional speech processing (probe verification) in an ERP experiment followed by an explicit behavioral emotional recognition task. In both experiments, participants listened to emotional sentences expressing one of four emotions (anger, fear, disgust, happiness) or neutral sentences. In line with previous evidence, patients and healthy controls show differentiation of emotional and neutral sentences in the P200 component (emotional salience detection) and a following negative-going brain wave (meaning-related processing). However, the behavioral recognition (identification stage) of emotional sentences was impaired in BG patients, but not in healthy controls. The current data provide further support that the BG are involved in late, explicit rather than early emotional speech processing stages.

    Focus accent, word length and position as cues to L1 and L2 word recognition

    The present study examines native and nonnative perceptual processing of semantic information conveyed by prosodic prominence. Five groups of German learners of English each listened to one of five experimental conditions. Three conditions differed in the place of the focus accent in the sentence, and two conditions used spliced stimuli. The experimental condition was presented first in the learners’ L1 (German) and then in a similar set in the L2 (English). The effect of the accent condition and of the length and position of the target in the sentence was evaluated in a probe recognition task. In both the L1 and L2 tasks there was no significant effect in any of the five focus conditions. Target position and target word length had an effect in the L1 task. Word length did not affect accuracy rates in the L2 task. For probe recognition in the L2, word length and the position of the target interacted with the focus condition.

    Prosody, focus, and focal structure: some remarks on methodology

    Prosody falls between several established fields, such as phonetics, phonology, syntax, and dialogue structure. It is therefore prone to misconceptions: its relevance is often overestimated and often underestimated. The traditional method in linguistics in general, and in phonology in particular, is the construction and evaluation of sometimes rather complex examples based on the intuition of the linguist. In experimental phonetics, this intuition is replaced by more or less naive, and thus non-expert, subjects and by inferential statistics, but the examples, i.e. the experimental material, are often rather complex as well. It is a truism that in both cases conclusions are drawn on an "as if" basis: as if a final proof had been found that phenomenon A really exists regularly in language B. In fact, it can only be proven that phenomenon A can sometimes be detected in the production of some speakers of a variety of language B. This dilemma matters if prosody has to be put into practice, e.g. in automatic speech and language processing. In this field, large speech databases are already available for English and will be available for other languages, such as German, in the near future. At least in the beginning, the problems that can, hopefully, be solved with the help of such databases might look trivial and thus uninteresting: a step backwards rather than forwards. "As if" statements (concerning, e.g., narrow vs. broad focus) and problems that are trivial at face value (concerning, e.g., the relationship between phrasing units and accentuation and the ontology of sentence accent) will be illustrated with my own material. I will argue that such trivial problems have to be dealt with in the beginning, and that they can constitute the very basis for the proper treatment of more far-reaching and complex problems.

    Lesion Loci of Impaired Affective Prosody: A Systematic Review of Evidence from Stroke

    Affective prosody, or the changes in rate, rhythm, pitch, and loudness that convey emotion, has long been implicated as a function of the right hemisphere (RH), yet there is a dearth of literature identifying the specific neural regions associated with its processing. The current systematic review aimed to evaluate the evidence on affective prosody localization in the RH. One hundred and ninety articles from 1970 to February 2020 investigating affective prosody comprehension and production in patients with focal brain damage were identified via database searches. Eleven articles met inclusion criteria, passed quality reviews, and were analyzed for affective prosody localization. Acute, subacute, and chronic lesions demonstrated similar profile characteristics. Localized right antero-superior (i.e., dorsal stream) regions contributed to affective prosody production impairments, whereas damage to more postero-lateral (i.e., ventral stream) regions resulted in affective prosody comprehension deficits. This review provides support that distinct RH regions are vital for affective prosody comprehension and production, aligning with literature reporting RH activation for affective prosody processing in healthy adults as well. The impact of study design on resulting interpretations is discussed.

    Playing With Fire Compounds: The Tonal Accents of Compounds in (North) Norwegian Preschoolers’ Role-Play Register

    Prosodic features are some of the most salient features of dialect variation in Norway. It is therefore no wonder that the switch in prosodic systems is what is first recognized by caretakers and scholars when Norwegian children code-switch to something resembling the dialect of the capital (henceforth Urban East Norwegian, UEN) in role-play. With a focus on the system of lexical tonal accents, this paper investigates the spontaneous speech of North Norwegian children engaging in peer social role-play. By investigating F0 contours extracted from a corpus of spontaneous peer play, and comparing them with elicited baseline reference contours, this paper makes the case that children fail to apply the target tonal accent consistent with UEN in compounds in role-play, although their production of tonal accents otherwise seems to be phonetically target-like for UEN. In other words, they perform in accordance with UEN phonetics, but not UEN morpho-phonology.

    Production, perception and online processing of prominence in the post-focal domain

    This dissertation presents a fundamentally new and in-depth investigation of the distribution of prominence in different focal structures in two varieties of Italian (the one spoken in Udine and the one spoken in Bari), by means of the implementation of a categorical analysis with the continuous prosodic parameters related to F0 and periodic energy. Results provide evidence that prominence in these varieties of Italian is conveyed by both a categorical three-way distinction and a gradual modulation: absence or presence of pitch movement in the distinction between background (post-focal position) and the focal conditions, and a gradual modification of energy and duration. The degree of prominence of words occurring in different focal structures was also investigated in perception. The reportedly different distribution of prominence found in questions for the variety of Italian spoken in Bari is shown to have an influence on the degree of perceived prominence. This influence is found in the comparison between the prominence ratings of Bari and Udine native speakers, as well as of Bari native speakers and German native speakers with Italian as L2. Furthermore, the present dissertation tests the real-time processing of the pitch excursion registered in the post-focal region of questions in the Bari variety. Findings confirmed that the fine-grained changes in prominence are processed in real time. Moreover, results indicate that top-down expectations play a crucial role in modulating general cognitive processes. Overall, this thesis supports the view of prosodic prominence as characterised by a bundle of cues, probabilistically distributed in the listener’s perceptual space, which form top-down expectations that play a role both in offline perception and in online processing. Signal-based factors also play a role in perception and online processing, but can be overridden by expectations.