3,999 research outputs found

    Modelling personality features by changing prosody in synthetic speech

    This study explores how features of brand personalities can be modelled with the prosodic parameters pitch level, pitch range, articulation rate, and loudness. Experiments with parametric diphone synthesis showed that listeners rated the prosodically changed versions better than a baseline version for the dimension
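
    As a rough illustration of the kind of manipulation described above, the sketch below represents the four prosodic parameters as multipliers on a neutral baseline and applies them to an F0 contour. The ProsodyProfile class, the example factor values, and scale_f0_contour are hypothetical names for illustration only, not the study's actual diphone-synthesis configuration.

```python
# Hypothetical sketch: the four prosodic parameters as multipliers on a
# neutral baseline. Class name, factor values, and the contour helper are
# illustrative assumptions, not the study's diphone-synthesis setup.
from dataclasses import dataclass

@dataclass
class ProsodyProfile:
    pitch_level: float        # multiplier on the baseline mean F0
    pitch_range: float        # multiplier on F0 excursions around the mean
    articulation_rate: float  # multiplier on syllables per second
    loudness: float           # multiplier on output amplitude

BASELINE = ProsodyProfile(1.0, 1.0, 1.0, 1.0)

# A profile meant to sound livelier than the baseline: higher pitch level,
# wider pitch range, faster articulation, louder output.
lively = ProsodyProfile(pitch_level=1.15, pitch_range=1.30,
                        articulation_rate=1.10, loudness=1.20)

def scale_f0_contour(f0_hz, profile, baseline_mean_hz):
    """Shift and stretch an F0 contour (in Hz) according to a profile."""
    level = baseline_mean_hz * profile.pitch_level
    return [level + (f - baseline_mean_hz) * profile.pitch_range for f in f0_hz]
```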

    Auditory communication in domestic dogs: vocal signalling in the extended social environment of a companion animal

    Domestic dogs produce a range of vocalisations, including barks, growls, and whimpers, which are shared with other canid species. The source–filter model of vocal production can be used as a theoretical and applied framework to explain how and why the acoustic properties of some vocalisations are constrained by physical characteristics of the caller, whereas others are more dynamic, influenced by transient states such as arousal or motivation. This chapter thus reviews how and why particular call types are produced to transmit specific types of information, and how such information may be perceived by receivers. As domestication is thought to have caused a divergence in the vocal behaviour of dogs as compared to the ancestral wolf, evidence of both dog–human and human–dog communication is considered. Overall, it is clear that domestic dogs have the potential to acoustically broadcast a range of information, which is available to conspecific and human receivers. Moreover, dogs are highly attentive to human speech and are able to extract speaker identity, emotional state, and even some types of semantic information
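
    To make the source–filter idea concrete, here is a minimal, illustrative sketch in which a glottal pulse train (the source, whose rate sets the fundamental frequency) is passed through resonators standing in for vocal-tract formants (the filter). All numeric values are arbitrary placeholders, not measurements of dog vocalisations.

```python
# Minimal source-filter sketch: a glottal pulse train (source) filtered by
# two resonators standing in for vocal-tract formants (filter). All values
# are illustrative placeholders, not measurements of dog calls.
import numpy as np
from scipy.signal import lfilter

sr = 16000                           # sample rate (Hz)
f0 = 220.0                           # source: fundamental frequency of the larynx
formants = [(700, 80), (1200, 90)]   # filter: (centre Hz, bandwidth Hz) resonances

# Source: impulse train whose repetition rate sets the perceived pitch.
n = int(sr * 0.5)
source = np.zeros(n)
source[::int(sr / f0)] = 1.0

# Filter: cascade of two-pole resonators approximating formants; longer vocal
# tracts (larger callers) shift these resonances downward.
signal = source
for freq, bw in formants:
    r = np.exp(-np.pi * bw / sr)
    theta = 2 * np.pi * freq / sr
    signal = lfilter([1.0], [1.0, -2 * r * np.cos(theta), r ** 2], signal)

# Static cues (formant spacing, baseline f0) reflect caller size; dynamic cues
# (f0 movement, amplitude) track transient states such as arousal.
```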

    Persuasion prosody in prosecutor’s speech: Ukrainian and English

    This paper presents research on the prosodic means that convey persuasion modality in a prosecutor’s speech in court. The material under study consists of English and Ukrainian prosecutors’ speeches (total duration: 16 hours). The analysis of the experimental material reveals common and specific characteristics of the prosody components (melody, loudness, tempo, timbre, and sentence stress) in English and Ukrainian. The pragmatics of prosodic semantics and the correlations between its parameters have been established. In both English and Ukrainian, an utterance becomes emphatic through the following prosodic means of persuasion in a prosecutor’s speech: 1) changes of tempo; 2) changes of voice pitch; 3) replacement of the rising tone with the falling one and vice versa; 4) use of complex tones; 5) use of an interrupted ascending or descending scale; 6) change of sentence stress type; 7) division of a sense group into two or more parts. These findings lead to the conclusion that, regarding the typological similarity of prosody in the compared languages, the parameters of the pitch component of intonation are the most informative for differentiating attitudinal meanings. The specific interaction between prosodic and grammatical means in expressing persuasion in Ukrainian and English prosecutors’ speech is determined by the degree of difference between the grammatical and lexical systems of the compared languages

    An End-to-End Conversational Style Matching Agent

    We present an end-to-end voice-based conversational agent that is able to engage in naturalistic multi-turn dialogue and align with the interlocutor's conversational style. The system uses a series of deep neural network components for speech recognition, dialogue generation, prosodic analysis, and speech synthesis to generate language and prosodic expression with qualities that match those of the user. We conducted a user study (N=30) in which participants talked with the agent for 15 to 20 minutes, resulting in over 8 hours of natural interaction data. Users with high-consideration conversational styles reported the agent to be more trustworthy when it matched their conversational style, whereas users with high-involvement conversational styles were indifferent. Finally, we provide design guidelines for multi-turn dialogue interactions using conversational style adaptation
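
    A schematic sketch of the component chain described above (speech recognition, dialogue generation, prosodic analysis, speech synthesis) with a simple style-blending step is given below. The class, the injected components, and the 50/50 blending weights are assumptions for illustration, not the system's actual models or matching rule.

```python
# Schematic sketch of the ASR -> dialogue -> prosody analysis -> TTS chain
# with naive style blending; components and weights are placeholders, not
# the paper's models.
from dataclasses import dataclass

@dataclass
class ProsodyStats:
    mean_pitch_hz: float
    speech_rate_sps: float   # syllables per second

class StyleMatchingAgent:
    def __init__(self, asr, dialogue_model, prosody_analyzer, tts):
        self.asr = asr
        self.dialogue_model = dialogue_model
        self.prosody_analyzer = prosody_analyzer
        self.tts = tts

    def respond(self, user_audio):
        text = self.asr.transcribe(user_audio)                  # speech recognition
        reply = self.dialogue_model.generate(text)               # dialogue generation
        user_style = self.prosody_analyzer.analyze(user_audio)   # prosodic analysis
        # Nudge the agent's prosody toward the user's measured style rather
        # than copying it outright (an assumed blending scheme).
        target = ProsodyStats(
            mean_pitch_hz=0.5 * user_style.mean_pitch_hz + 0.5 * 180.0,
            speech_rate_sps=0.5 * user_style.speech_rate_sps + 0.5 * 4.0,
        )
        return self.tts.synthesize(reply, prosody=target)        # speech synthesis
```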

    Does speech prosody matter in health communication? Evidence from native and non-native English speaking medical students in a simulated clinical interaction

    The impact of the UK’s multilingual and multicultural society can be seen in its healthcare services and has contributed to shaping communication skills training as a core part of the UK undergraduate medical curriculum. NHS complaints involving perceived staff attitudes have remained high, despite extensive communication skills training. Furthermore, foreign doctors have received a higher proportion of complaints than UK doctors. Finally, how linguistic and social factors shape the conveyance and perception of attitudes related to professionalism in medical communication remains poorly understood. The ultimate aim of this study was to ascertain whether speech prosody contributes to the perception of professionalism in medical communication. Research questions addressed the role of speech prosody in conveying professional attitudes in medical communication, the prosodic differences between native and non-native English speaking medical students in a simulated clinical interaction, and the influence of prosodic features on listeners’ perceptions of professional attitudes. A set of acoustic parameters representing the speech prosody of native and non-native medical students in the simulated clinical setting was analysed. A perceptual experiment was then carried out to investigate the factors affecting perceived professionalism in extracts of the analysed simulated clinical interaction. The examined acoustic parameters were found to be sensitive to English language background and to the task within the simulated consultation. Interestingly, the attitudinal information associated with some of these acoustic parameters was perceived by listeners and was reflected in higher professional scale scores in the perceptual experiment, even after adjusting for English language background. Training level and consultation task also emerged as factors affecting professional scale scores. These initial findings confirm that speech prosody contributes to the perception of professionalism in medical communication. Incorporating how messages are delivered to patients into current models of communication skills training may have positive outcomes
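
    For readers wanting a concrete starting point, the sketch below extracts the kind of acoustic parameters such a study typically analyses (mean F0, F0 range, overall intensity) from a recording. librosa is used purely for illustration; the study's own toolchain and parameter set are not specified here.

```python
# Illustrative prosodic feature extraction (mean F0, F0 range, intensity)
# using librosa; not necessarily the study's toolchain or parameter set.
import numpy as np
import librosa

def prosodic_summary(wav_path):
    y, sr = librosa.load(wav_path, sr=None)
    # F0 tracking over a wide pitch search range.
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )
    voiced = f0[~np.isnan(f0)]
    rms = librosa.feature.rms(y=y)[0]          # frame-wise energy as an intensity proxy
    return {
        "mean_f0_hz": float(np.mean(voiced)),
        "f0_range_hz": float(np.percentile(voiced, 95) - np.percentile(voiced, 5)),
        "mean_rms": float(np.mean(rms)),
    }
```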

    The Effect of a Voice Treatment on Facial Expression in Parkinson’s Disease: Clinical and Demographic Predictors

    Parkinson’s disease (PD) is a neurodegenerative disease associated with a wide range of motoric, cognitive, and behavioral symptoms. Impairments in facial mobility and emotional expressivity are common and can impair communication, in turn affecting daily functioning and quality of life. Previous research suggests that the Lee Silverman Voice Treatment© (LSVT LOUD; Ramig et al., 2001, 2011) increases vocal loudness and facial expressivity in individuals with PD relative to untreated PD and healthy controls. This study extends the literature by examining the effects of LSVT and an articulation-based control treatment (ARTIC) on multiple aspects of facial expressivity (emotional frequency [EF], emotional variability [EV], emotional intensity [EI], and social engagement [SE]) as well as non-emotional facial mobility (FM). Further, we examined whether demographic, clinical, cognitive, and affective variables predict improvement in facial expressivity and mobility via LSVT. Participants included 40 individuals with idiopathic PD (67.5% male) and 14 demographically matched healthy controls (60% male). The PD participants were randomly assigned to one of the following conditions: the LSVT LOUD treatment group (n = 13), a control therapy (Articulation Treatment [ARTIC]; n = 14), or an untreated control condition (n = 13). All posers (PDs and HCs) were videotaped before and after treatment (for the LSVT and ARTIC PD groups) or at baseline and after a 4-5 week waiting period (for the untreated PDs [UPDs] and HCs) while producing emotional (happy, sad, and angry) monologues from the New York Emotion Battery (Borod et al., 1998; Borod, Welkowitz, & Obler, 1992). The monologues were randomized, divided into 15-second segments, and evaluated by 18 naïve raters on four aspects of facial emotional expression and on facial mobility. Separate training sessions were held for each of the five facial rating variables (FM, EF, EV, EI, and SE), and interrater reliability was largely in the high range. Findings revealed that PD posers displayed lower facial expressivity than HCs on three of the five variables; however, these effects were moderated by gender and emotion. In terms of gender, women were more expressive than men on all facial expression variables. Treatment results showed that individuals in the LSVT group improved significantly from pre- to post-treatment in facial expressivity on four of the five variables examined (FM, EF, EV, and EI); however, for EV this effect was moderated by gender, with significant increases from pre- to post-treatment for men but not for women in the LSVT group. No significant pre- to post-treatment differences were observed for ARTIC, or from baseline to 4-5 weeks later for the UPD and HC groups. In terms of predictive findings, demographic, clinical, cognitive, and affective variables did not predict facial improvement in LSVT participants, likely due to low power. This study has multiple clinical and research implications. First, we examined facial expression through a multifactorial approach involving mobility, expressivity, and social judgment of others, which has not been done in other studies with PD and which may provide a better understanding of the specific facial impairments in PD. Clinically, our treatment findings for LSVT are important to the rehabilitation therapy literature, because there are very few empirically validated treatments targeting facial emotional expressivity and facial mobility in individuals with PD
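
    As a simplified illustration of the pre- to post-treatment comparison described above (not the study's analysis code, which would more likely use mixed-model ANOVAs across groups, raters, and genders), the sketch below runs a paired t-test on simulated rater-averaged scores for a treated group of 13.

```python
# Illustrative pre/post comparison on simulated data; not the study's analysis.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical rater-averaged expressivity scores for n = 13 treated participants.
pre = rng.normal(3.0, 0.5, 13)
post = pre + rng.normal(0.4, 0.3, 13)   # assumed improvement after treatment

t, p = stats.ttest_rel(post, pre)       # paired comparison within the treated group
print(f"mean pre->post change: {np.mean(post - pre):.2f}, t = {t:.2f}, p = {p:.3f}")
```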

    Robust Modeling of Epistemic Mental States

    This work identifies and advances some research challenges in the analysis of facial features and their temporal dynamics in relation to epistemic mental states in dyadic conversations. The epistemic states considered are Agreement, Concentration, Thoughtful, Certain, and Interest. In this paper, we perform a number of statistical analyses and simulations to identify the relationship between facial features and epistemic states. Non-linear relations are found to be more prevalent, and temporal features derived from the original facial features show a strong correlation with intensity changes. We then propose a novel prediction framework that takes facial features and their non-linear relation scores as input and predicts different epistemic states in videos. The prediction of epistemic states is boosted when the classification of emotion-changing regions, such as rising, falling, or steady-state, is incorporated with the temporal features. The proposed predictive models can predict the epistemic states with significantly improved accuracy: the correlation coefficient (CoERR) for Agreement is 0.827, for Concentration 0.901, for Thoughtful 0.794, for Certain 0.854, and for Interest 0.913. Comment: Accepted for publication in Multimedia Tools and Applications, Special Issue: Socio-Affective Technologies
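
    An illustrative reconstruction of the pipeline idea follows: derive temporal-change features from per-frame facial features, add a rising/falling/steady region label, regress a state intensity, and score predictions with a correlation coefficient. The data, features, and model below are toy stand-ins, not the authors' framework.

```python
# Toy reconstruction of the pipeline idea; data and model are stand-ins,
# not the authors' framework.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
frames, n_feat = 2000, 10
X_raw = rng.normal(size=(frames, n_feat))               # e.g. per-frame facial features
y = X_raw[:, 0] ** 2 + 0.1 * rng.normal(size=frames)    # nonlinear target (toy intensity)

# Temporal-change features plus a rising / falling / steady region label.
delta = np.vstack([np.zeros(n_feat), np.diff(X_raw, axis=0)])
region = np.sign(delta[:, :1])
X = np.hstack([X_raw, delta, region])

# Train on the first half, predict the second half, score with a correlation
# coefficient in the spirit of the paper's CoERR-style evaluation.
split = frames // 2
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:split], y[:split])
pred = model.predict(X[split:])
print("correlation:", np.corrcoef(pred, y[split:])[0, 1])
```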