137 research outputs found

    Acomodación fonética durante las interacciones conversacionales: una visión general

    During conversational interactions such as tutoring, instruction-giving tasks, verbal negotiations, or just talking with friends, interlocutors’ behaviors experience a series of changes due to the characteristics of their counterpart and to the interaction itself. These changes are pervasively present in every social interaction, and most of them occur in the sounds and rhythms of our speech, which is known as acoustic-prosodic accommodation, or simply phonetic accommodation. The consequences, linguistic and social constraints, and underlying cognitive mechanisms of phonetic accommodation have been studied for at least 50 years, due to the importance of the phenomenon to several disciplines such as linguistics, psychology, and sociology. Based on the analysis and synthesis of the existing empirical research literature, in this paper we present a structured and comprehensive review of the qualities, functions, onto- and phylogenetic development, and modalities of phonetic accommodation.

    When to Say What and How: Adapting the Elaborateness and Indirectness of Spoken Dialogue Systems

    With the aim of designing a spoken dialogue system which has the ability to adapt to the user's communication idiosyncrasies, we investigate whether it is possible to carry over insights from the usage of communication styles in human-human interaction to human-computer interaction. In an extensive literature review, it is demonstrated that communication styles play an important role in human communication. Using a multi-lingual data set, we show that there is a significant correlation between the communication style of the system and the preceding communication style of the user. This is why two components that extend the standard architecture of spoken dialogue systems are presented: 1) a communication style classifier that automatically identifies the user communication style and 2) a communication style selection module that selects an appropriate system communication style. We consider the communication styles elaborateness and indirectness, as it has been shown that they influence the user's satisfaction and the user's perception of a dialogue. We present a neural classification approach based on supervised learning for each task. Neural networks are trained and evaluated with features that can be automatically derived during an ongoing interaction in every spoken dialogue system. It is shown that both components yield solid results and outperform the baseline in the form of a majority-class classifier.

    A Study of Accommodation of Prosodic and Temporal Features in Spoken Dialogues in View of Speech Technology Applications

    Inter-speaker accommodation is a well-known property of human speech and human interaction in general. Broadly it refers to the behavioural patterns of two (or more) interactants and the effect of the (verbal and non-verbal) behaviour of each to that of the other(s). Implementation of this behavior in spoken dialogue systems is desirable as an improvement on the naturalness of human-machine interaction. However, traditional qualitative descriptions of accommodation phenomena do not provide sufficient information for such an implementation. Therefore, a quantitative description of inter-speaker accommodation is required. This thesis proposes a methodology of monitoring accommodation during a human or human-computer dialogue, which utilizes a moving average filter over sequential frames for each speaker. These frames are time-aligned across the speakers, hence the name Time Aligned Moving Average (TAMA). Analysis of spontaneous human dialogue recordings by means of the TAMA methodology reveals ubiquitous accommodation of prosodic features (pitch, intensity and speech rate) across interlocutors, and allows for statistical (time series) modeling of the behaviour, in a way which is meaningful for implementation in spoken dialogue system (SDS) environments. In addition, a novel dialogue representation is proposed that provides an additional point of view to that of TAMA in monitoring accommodation of temporal features (inter-speaker pause length and overlap frequency). This representation is a percentage turn distribution of individual speaker contributions in a dialogue frame which circumvents strict attribution of speaker-turns, by considering both interlocutors as synchronously active. Both TAMA and turn distribution metrics indicate that correlation of average pause length and overlap frequency between speakers can be attributed to accommodation (a debated issue), and point to possible improvements in SDS “turn-taking” behaviour. Although the findings of the prosodic and temporal analyses can directly inform SDS implementations, further work is required in order to describe inter-speaker accommodation sufficiently, as well as to develop an adequate testing platform for evaluating the magnitude of perceived improvement in human-machine interaction. Therefore, this thesis constitutes a first step towards a convincingly useful implementation of accommodation in spoken dialogue systems.

    Phonetic accommodation of human interlocutors in the context of human-computer interaction

    Phonetic accommodation refers to the phenomenon that interlocutors adapt their way of speaking to each other within an interaction. This can have a positive influence on the communication quality. As we increasingly use spoken language to interact with computers these days, the phenomenon of phonetic accommodation is also investigated in the context of human-computer interaction: on the one hand, to find out whether speakers adapt to a computer agent in a similar way as they do to a human interlocutor, on the other hand, to implement accommodation behavior in spoken dialog systems and explore how this affects their users. To date, the focus has been mainly on the global acoustic-prosodic level. The present work demonstrates that speakers interacting with a computer agent also identify locally anchored phonetic phenomena such as segmental allophonic variation and local prosodic features as accommodation targets and converge on them. To this end, we conducted two experiments. First, we applied the shadowing method, where the participants repeated short sentences from natural and synthetic model speakers. In the second experiment, we used the Wizard-of-Oz method, in which an intelligent spoken dialog system is simulated, to enable a dynamic exchange between the participants and a computer agent, the virtual language learning tutor Mirabella. The target language of our experiments was German. Phonetic convergence occurred in both experiments when natural voices were used as well as when synthetic voices were used as stimuli. Moreover, both native and non-native speakers of the target language converged to Mirabella. Thus, accommodation could be relevant, for example, in the context of computer-assisted language learning. Individual variation in accommodation behavior can be attributed in part to speaker-specific characteristics, one of which is assumed to be the personality structure. We included the Big Five personality traits as well as the concept of mental boundaries in the analysis of our data. Different personality traits influenced accommodation to different types of phonetic features. Mental boundaries have not been studied before in the context of phonetic accommodation. We created a validated German adaptation of a questionnaire that assesses the strength of mental boundaries. The latter can be used in future studies involving mental boundaries in native speakers of German. Deutsche Forschungsgemeinschaft (DFG), project number 278805297: "Phonetische Konvergenz in der Mensch-Maschine-Kommunikation"

    Do (and say) as I say: Linguistic adaptation in human-computer dialogs

    © Theodora Koulouri, Stanislao Lauria, and Robert D. Macredie. This article has been made available through the Brunel Open Access Publishing Fund. There is strong research evidence showing that people naturally align to each other’s vocabulary, sentence structure, and acoustic features in dialog, yet little is known about how the alignment mechanism operates in the interaction between users and computer systems, let alone how it may be exploited to improve the efficiency of the interaction. This article provides an account of lexical alignment in human–computer dialogs, based on empirical data collected in a simulated human–computer interaction scenario. The results indicate that alignment is present, resulting in the gradual reduction and stabilization of the vocabulary-in-use, and that it is also reciprocal. Further, the results suggest that when system and user errors occur, the development of alignment is temporarily disrupted and users tend to introduce novel words to the dialog. The results also indicate that alignment in human–computer interaction may have a strong strategic component and is used as a resource to compensate for less optimal (visually impoverished) interaction conditions. Moreover, lower alignment is associated with less successful interaction, as measured by user perceptions. The article distills the results of the study into design recommendations for human–computer dialog systems and uses them to outline a model of dialog management that supports and exploits alignment through mechanisms for in-use adaptation of the system’s grammar and lexicon.

    Investigating Automatic Measurements of Prosodic Accommodation and Its Dynamics in Social Interaction

    Spoken dialogue systems are increasingly being used to facilitate and enhance human communication. While these interactive systems can process the linguistic aspects of human communication, they are not yet capable of processing the complex dynamics involved in social interaction, such as the adaptation on the part of interlocutors. Providing interactive systems with the capacity to process and exhibit this accommodation could, however, improve their efficiency and make machines more socially competent interactants. At present, no automatic system is available to process prosodic accommodation, nor do any clear measures exist that quantify its dynamic manifestation. While it can be observed to be a monotonically manifest property, it is our hypothesis that it evolves dynamically with functional social aspects. In this paper, we propose an automatic system for its measurement and the capture of its dynamic manifestation. We investigate the evolution of prosodic accommodation in 41 Japanese dyadic telephone conversations and discuss its manifestation in relation to its functions in social interaction. Overall, our study shows that prosodic accommodation changes dynamically over the course of a conversation and across conversations, and that these dynamics are informative about the naturalness of the conversation flow, the speakers’ degree of involvement and their affinity in the conversation.

    The audio/visual mismatch and the uncanny valley: an investigation using a mismatch in the human realism of facial and vocal aspects of stimuli

    Indiana University-Purdue University Indianapolis (IUPUI). Empirical research on the uncanny valley has primarily been concerned with visual elements. The current study is intended to show how manipulating auditory variables of the stimuli affects participants’ ratings. The focus of research is to investigate whether an uncanny valley effect occurs when humans are exposed to stimuli that have an incongruity between auditory and visual aspects. Participants were exposed to sets of stimuli which are both congruent and incongruent in their levels of audio/visual humanness. Explicit measures were used to explore whether a mismatch in the human realism of facial and vocal aspects produces an uncanny valley effect and to attempt to explain a possible cause of this effect. Results indicate that an uncanny valley effect occurs when humans are exposed to stimuli that have an incongruity between auditory and visual aspects.

    Producing Acoustic-Prosodic Entrainment in a Robotic Learning Companion to Build Learner Rapport

    With advances in automatic speech recognition, spoken dialogue systems are assuming increasingly social roles. There is a growing need for these systems to be socially responsive, capable of building rapport with users. In human-human interactions, rapport is critical to patient-doctor communication, conflict resolution, educational interactions, and social engagement. Rapport between people promotes successful collaboration, motivation, and task success. Dialogue systems which can build rapport with their user may produce similar effects, personalizing interactions to create better outcomes. This dissertation focuses on how dialogue systems can build rapport utilizing acoustic-prosodic entrainment. Acoustic-prosodic entrainment occurs when individuals adapt their acoustic-prosodic features of speech, such as tone of voice or loudness, to one another over the course of a conversation. Correlated with liking and task success, a dialogue system which entrains may enhance rapport. Entrainment, however, is very challenging to model. People entrain on different features in many ways and how to design entrainment to build rapport is unclear. The first goal of this dissertation is to explore how acoustic-prosodic entrainment can be modeled to build rapport. Towards this goal, this work presents a series of studies comparing, evaluating, and iterating on the design of entrainment, motivated and informed by human-human dialogue. These models of entrainment are implemented in the dialogue system of a robotic learning companion. Learning companions are educational agents that engage students socially to increase motivation and facilitate learning. As a learning companion’s ability to be socially responsive increases, so do vital learning outcomes. A second goal of this dissertation is to explore the effects of entrainment on concrete outcomes such as learning in interactions with robotic learning companions.
This dissertation results in contributions both technical and theoretical. Technical contributions include a robust and modular dialogue system capable of producing prosodic entrainment and other socially responsive behavior. The system is one of the first of its kind, and the results demonstrate that an entraining, social learning companion can build rapport and increase learning. This dissertation provides support for exploring phenomena like entrainment to enhance factors such as rapport and learning and provides a platform with which to explore these phenomena in future work. Dissertation/Thesis; Doctoral Dissertation, Computer Science, 201