115 research outputs found

    Temporal entrainment in overlapping speech

    Wlodarczak M. Temporal entrainment in overlapping speech. Bielefeld: Bielefeld University; 2014

    A Study of Accommodation of Prosodic and Temporal Features in Spoken Dialogues in View of Speech Technology Applications

    Inter-speaker accommodation is a well-known property of human speech and human interaction in general. Broadly, it refers to the behavioural patterns of two (or more) interactants and the effect of the (verbal and non-verbal) behaviour of each on that of the other(s). Implementation of this behaviour in spoken dialogue systems is desirable as an improvement on the naturalness of human-machine interaction. However, traditional qualitative descriptions of accommodation phenomena do not provide sufficient information for such an implementation. Therefore, a quantitative description of inter-speaker accommodation is required. This thesis proposes a methodology of monitoring accommodation during a human or human-computer dialogue, which utilizes a moving average filter over sequential frames for each speaker. These frames are time-aligned across the speakers, hence the name Time Aligned Moving Average (TAMA). Analysis of spontaneous human dialogue recordings by means of the TAMA methodology reveals ubiquitous accommodation of prosodic features (pitch, intensity and speech rate) across interlocutors, and allows for statistical (time series) modeling of the behaviour, in a way which is meaningful for implementation in spoken dialogue system (SDS) environments. In addition, a novel dialogue representation is proposed that provides an additional point of view to that of TAMA in monitoring accommodation of temporal features (inter-speaker pause length and overlap frequency). This representation is a percentage turn distribution of individual speaker contributions in a dialogue frame which circumvents strict attribution of speaker-turns, by considering both interlocutors as synchronously active. Both TAMA and turn distribution metrics indicate that correlation of average pause length and overlap frequency between speakers can be attributed to accommodation (a debated issue), and point to possible improvements in SDS “turn-taking” behaviour.
Although the findings of the prosodic and temporal analyses can directly inform SDS implementations, further work is required in order to describe inter-speaker accommodation sufficiently, as well as to develop an adequate testing platform for evaluating the magnitude of perceived improvement in human-machine interaction. Therefore, this thesis constitutes a first step towards a convincingly useful implementation of accommodation in spoken dialogue systems.

    English Index

    No abstract

    Proceedings of the VIIth GSCP International Conference

    The 7th International Conference of the Gruppo di Studi sulla Comunicazione Parlata, dedicated to the memory of Claire Blanche-Benveniste, chose Speech and Corpora as its main theme. The wide international origin of the 235 authors, from 21 countries and 95 institutions, led to papers on many different languages. The 89 papers of this volume reflect the themes of the conference: spoken corpora compilation and annotation, together with the related technological fields; the relation between prosody and pragmatics; speech pathologies; and various papers on phonetics, speech and linguistic analysis, pragmatics and sociolinguistics. Many papers are also dedicated to speech and second language studies. The online publication with FUP allows direct access to sound and video linked to papers (when downloaded).

    Gesture and Speech in Interaction - 4th edition (GESPIN 4)

    The fourth edition of Gesture and Speech in Interaction (GESPIN) was held in Nantes, France. With more than 40 papers, these proceedings show just what a flourishing field of enquiry gesture studies continues to be. The keynote speeches of the conference addressed three different aspects of multimodal interaction: gesture and grammar, gesture acquisition, and gesture and social interaction. In a talk entitled "Qualities of event construal in speech and gesture: Aspect and tense", Alan Cienki presented an ongoing research project on narratives in French, German and Russian, a project that focuses especially on the verbal and gestural expression of grammatical tense and aspect in narratives in the three languages. Jean-Marc Colletta's talk, entitled "Gesture and Language Development: towards a unified theoretical framework", described the joint acquisition and development of speech and early conventional and representational gestures. In "Grammar, deixis, and multimodality between code-manifestation and code-integration or why Kendon's Continuum should be transformed into a gestural circle", Ellen Fricke proposed a revisited grammar of noun phrases that integrates gestures as part of the semiotic and typological codes of individual languages. From a pragmatic and cognitive perspective, Judith Holler explored the use of gaze and hand gestures as means of organizing turns at talk as well as establishing common ground in a presentation entitled "On the pragmatics of multi-modal face-to-face communication: Gesture, speech and gaze in the coordination of mental states and social interaction". Among the talks and posters presented at the conference, the vast majority of topics related, quite naturally, to gesture and speech in interaction - understood both in terms of mapping of units in different semiotic modes and of the use of gesture and speech in social interaction.
Several presentations explored the effects of impairments (such as diseases or the natural ageing process) on gesture and speech. The communicative relevance of gesture and speech and audience-design in natural interactions, as well as in more controlled settings like television debates and reports, was another topic addressed during the conference. Some participants also presented research on first and second language learning, while others discussed the relationship between gesture and intonation. While most participants presented research on gesture and speech from an observer's perspective, be it in semiotics or pragmatics, some nevertheless focused on another important aspect: the cognitive processes involved in language production and perception. Last but not least, participants also presented talks and posters on the computational analysis of gestures, whether involving external devices (e.g. mocap, Kinect) or concerning the use of specially-designed computer software for the post-treatment of gestural data. Importantly, new links were made between semiotics and mocap data.

    Pragmatics, Prosody, and Social Skills of School-Age Children with Language-Learning Differences

    Social skills are an important aspect of child development that continues to have influences in adolescence and adulthood (Hart, Olsen, Robinson, & Mandleco, 1997). Interacting in a social world requires an integration of many abilities that include social skills and emotional understanding of oneself and other persons. Children who have difficulties with interpreting social cues (e.g., identifying basic emotions and responding to cues in speech) face immediate and progressive consequences in both academics and social living. Children with typical language skills interact successfully with peers and acknowledge social rules for different environments (e.g., playing at school vs. playing at home). In contrast, children with language impairments struggle to use social skills, which results in negative experiences in peer interactions (Horowitz, Jansson, Ljungberg, & Hedenbro, 2006). This study explored the social profiles of second grade children with a range of language abilities (e.g., children with low and high levels of language) as they interpret emotions in speech and narrative tasks. Multiple informants (i.e., parents, teachers, speech-language pathologist, and peers) evaluated social skills from different perspectives. A multi-interactional approach explained children’s social-emotional development from three theoretical perspectives: pragmatics, cognition, and emotional understanding. Forty-one second grade children completed a battery of tests that evaluated cognitive measures, language ability, and social skills. Each participant completed three experimental tasks (perception, imitation, and narrative) that examined how children process emotional cues in speech and narratives. A sociometric classification profiled children’s social skills and peer relationships. Results indicated that children with a range of language abilities (i.e., children with low and high levels of language skills) processed emotional cues in speech.
Four acoustic patterns were significantly related to how children differentiate emotions in speech. Additionally, language ability was a significant factor in the ability to infer emotions in narratives and judge social skills. Children with high language scores were more liked by peers and received better ratings on the teacher questionnaires. This study provides preliminary evidence that children with low and high levels of language abilities were able to interpret emotional cues in speech but differed in the ability to infer emotions in narratives.

    Gesture and prosody: cognitive and communicative effort in L1 and L2

    Gestures and speech are two interconnected features of human communication. Studies have shown that they develop together and that they both convey semantic and pragmatic meanings (Kendon, 2004; McNeill, 1992). Furthermore, gestures also seem to be linked to the prosodic features of speech, since they share synchronicity aspects and are often temporally aligned (McClave, 1991; Esteve-Gibert & Prieto, 2013). The co-production of gestures and speech thus seems to have a number of different functions at both the cognitive and the communicative levels. On the one hand, gestures seem to help the process of speaking: they have a scaffolding function in the lexical and rhythmical organization of speech and they help information packaging (Butterworth & Beattie, 1978; Esteve-Gibert et al., 2014; Krauss et al., 1996; Kita, 2000, 2010). This is supported by the evidence that speakers also gesticulate when they cannot see their addressees, for example when they are speaking over the phone (Cohen & Harrison, 1973; de Ruiter, 2003). On the other hand, gesturing is intended to communicate and is part of the speaker’s communicative effort (Kendon, 2004). Gestures also play an important role in L2 development and communicative strategies (Gullberg, 1998), and it is possible that L1 gestures influence gestures during L2 language development, and that gestural transfer co-occurs with linguistic transfer (Brown & Gullberg, 2008; Pika, Nicoladis, & Marentette, 2006). It has also been suggested that, since gestures play an important role in facilitating language access in speech production, bilingual/L2 speakers may gesture more than monolingual speakers. This would be due to the cognitive load deriving from speaking a language other than the native one (Kita, 2000; Krauss & Hadar, 1999).
Until now, gestures and prosody have been studied mostly in their temporal interaction and coordination (McClave, 1991; Loehr, 2004) or in the possible similarities between their pragmatic functions (Tuite, 1993). Little attention has been paid to the relationship between speakers’ global pitch range and use of gestures (and gesture categories) in conditions of high cognitive effort, or when they increase their communicative effort in their speech. In fact, there seems to be a lack of scientific evidence about the use of pitch and gesture variability as a communicative strategy of the speaker. The aim of this thesis is to investigate aspects of the role and functions of gestures and speech in L1 and L2 and how the cognitive and communicative efforts may influence speakers’ global pitch range and the use of gestures in a storytelling task. The investigation consists of two experiments. The first one (presented in chapter 5) aims to examine whether speakers’ increase in communicative effort produces changes in pitch variation and in the use of different categories of gestures. The second one (presented in chapter 6) examines the effects of speakers’ decrease in cognitive effort on the production of speech and gestures. In the first experiment, 8 Italian speakers, learners of English (L2) and students of a Public Speaking class, were asked to read and tell a fable in English to their classmates (Italian was used only as a control and recorded only once). One week later, the subjects were asked to repeat the task with the instruction to be as communicative as possible. The audiovisual material was analyzed with the software Praat (phonetic analysis) and Elan (gesture analysis).
For speech, the aim was to examine possible changes in the speakers’ fluency and pitch variation after they received the instruction to be communicative; for gestures, the aim was to examine the variations in both speakers’ overall gesturing and representational gestures in the communicative task. The hypothesis tested was that the communicative task causes an increase in fluency in the L2 (shown through features like a higher speech rate and a decrease of disfluencies and pauses), and that the communicative task leads to an increased use of representational gestures, since they might be considered helpful for the addressee to better understand what is said. The results of the first experiment led to the conclusion that if a person is asked to be communicative in the L2, they will probably increase both their pitch variation and the total number of gestures produced, with a significant increase in iconic and representational gestures. The results on pitch variation, though, do not exclude the possible effect of other variables, first of all task repetition. In fact, it is possible that by repeating the task, the speakers became more confident and could focus more on being communicative in telling the story. The second experiment tests the effect of task repetition on L2 speech and gesture. In this experiment (chapter 6), only the cognitive facilitation and the effect of the decrease in cognitive load were considered. This time, the subjects (10 Italian students of English L2) were asked to watch a short cartoon and to tell the story, both in Italian and English, in front of a small audience while being video-recorded. The subjects were asked to repeat the task one week later in the two languages with the same modalities. The analyses followed the same procedure as the previous experiment (chapter 5).
The hypotheses tested were that repetition itself could influence the communicativeness of the speakers at both the speech and the gesturing levels, with an increased fluency and a different use of gestures compared to the first narration attempt. This could be caused by the facilitation of the task in terms of cognitive load and better memorization. The results showed that with the decrease of the cognitive load, speakers reach a higher fluency in L2. In Italian, however, the speakers show no significant difference in fluency or liveliness (Hincks, 2005), and this might be due to the fact that repetition and memorization did not help their fluency as much as they did in the L2. As for gestures, the speakers employed a greater overall number of gestures when speaking English, their L2. The use of gestures, though, did not change in spite of the repetition, with a greater use of discursive gestures even if representational gestures seem to be used more in the L2 than in Italian. Overall, the investigation shows that the results of the two experiments can be integrated and offer a wider picture on the use of gesture and the employment of representationality with the conscious intent to be communicative. When people are asked to tell a story in a more communicative way, their response is to increase the representationality carried in their gestures and consequently use more representational gestures; this does not occur when the subjects only repeat the narration with a lighter cognitive load.

    The analysis of breathing and rhythm in speech

    Speech rhythm can be described as the temporal patterning by which speech events, such as vocalic onsets, occur. Despite efforts to quantify and model speech rhythm across languages, it remains a scientifically enigmatic aspect of prosody. For instance, one challenge lies in determining how to best quantify and analyse speech rhythm. Techniques range from manual phonetic annotation to the automatic extraction of acoustic features. It is currently unclear how closely these differing approaches correspond to one another. Moreover, the primary means of speech rhythm research has been the analysis of the acoustic signal only. Investigations of speech rhythm may instead benefit from a range of complementary measures, including physiological recordings, such as of respiratory effort. This thesis therefore combines acoustic recording with inductive plethysmography (breath belts) to capture temporal characteristics of speech and speech breathing rhythms. The first part examines the performance of existing phonetic and algorithmic techniques for acoustic prosodic analysis in a new corpus of rhythmically diverse English and Mandarin speech. The second part addresses the need for an automatic speech breathing annotation technique by developing a novel function that is robust to the noisy plethysmography typical of spontaneous, naturalistic speech production. These methods are then applied in the following section to the analysis of English speech and speech breathing in a second, larger corpus. Finally, behavioural experiments were conducted to investigate listeners' perception of speech breathing using a novel gap detection task. The thesis establishes the feasibility, as well as limits, of automatic methods in comparison to manual annotation. In the speech breathing corpus analysis, they help show that speakers maintain a normative, yet contextually adaptive breathing style during speech. 
The perception experiments in turn demonstrate that listeners are sensitive to the violation of these speech breathing norms, even if unconsciously so. The thesis concludes by underscoring breathing as a necessary, yet often overlooked, component in speech rhythm planning and production.