3,659 research outputs found

    A survey on perceived speaker traits: personality, likability, pathology, and the first challenge

    The INTERSPEECH 2012 Speaker Trait Challenge aimed at a unified test-bed for perceived speaker traits – the first challenge of this kind: personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state-of-the-art in these three fields of research and describe the three sub-challenges in terms of the challenge conditions, the baseline results provided by the organisers, and a new openSMILE feature set, which has been used for computing the baselines and which has been provided to the participants. Furthermore, we summarise the approaches and the results presented by the participants to show the various techniques that are currently applied to solve these classification tasks

    Listeners’ perceptions of the certainty and honesty of a speaker are associated with a common prosodic signature

    The success of human cooperation crucially depends on mechanisms enabling individuals to detect unreliability in their conspecifics. Yet, how such epistemic vigilance is achieved from naturalistic sensory inputs remains unclear. Here we show that listeners’ perceptions of the certainty and honesty of other speakers from their speech are based on a common prosodic signature. Using a data-driven method, we separately decode the prosodic features driving listeners’ perceptions of a speaker’s certainty and honesty across pitch, duration and loudness. We find that these two kinds of judgments rely on a common prosodic signature that is perceived independently from individuals’ conceptual knowledge and native language. Finally, we show that listeners extract this prosodic signature automatically, and that this impacts the way they memorize spoken words. These findings shed light on a unique auditory adaptation that enables human listeners to quickly detect and react to unreliability during linguistic interactions

    Feature Learning from Spectrograms for Assessment of Personality Traits

    Several methods have recently been proposed to analyze speech and automatically infer the personality of the speaker. These methods often rely on prosodic and other hand-crafted speech processing features extracted with off-the-shelf toolboxes. To achieve high accuracy, numerous features are typically extracted using complex and highly parameterized algorithms. In this paper, a new method based on feature learning and spectrogram analysis is proposed to simplify the feature extraction process while maintaining a high level of accuracy. The proposed method learns a dictionary of discriminant features from patches extracted from the spectrogram representations of training speech segments. Each speech segment is then encoded using the dictionary, and the resulting feature set is used to classify personality traits. Experiments indicate that the proposed method achieves state-of-the-art results with a significant reduction in complexity when compared to the most recent reference methods. The number of features and the difficulties linked to the feature extraction process are greatly reduced, as only one type of descriptor is used, for which the 6 parameters can be tuned automatically. In contrast, the simplest reference method uses 4 types of descriptors to which 6 functionals are applied, resulting in over 20 parameters to be tuned. Comment: 12 pages, 3 figures

    Phonological issues in the production of prosody by francophone and sinophone learners of english as a second language

    A non-native accent can lead to misunderstanding or to the perception of varying degrees of foreign accent. Prosody, which is now recognized as an important element of the impression of foreignness, is relatively little addressed in research on foreign language acquisition. This contrasts with the growing interest in prosody as an element of the native language. In this thesis, phonological research is evaluated for its relevance to research on foreign language prosody. Two aspects of phonological theory are studied: typology and phonological organization. This choice is justified by the general presumption that prosodic foreignness is created either by a difference in typology between the native language (L1) and the foreign language (L2) or by a transfer of prosodic features from the L1. The critique of research in phonological typology concludes that, at this stage, no model of prosodic classification is applicable to L2 acquisition. In particular, the study shows that certain typologies, notably Pike's theory of stress-timing/syllable-timing, should be excluded because they hinder progress in research on the acquisition and production of foreign language prosody. The second aspect of phonological theory studied in this thesis is phonological organization. The premise is that underlying differences in prosodic organization, rather than surface phonological differences, are transferred from L1 to L2. In-depth analyses of North American English, French and Standard Chinese reveal important phonological differences between North American English and the other two languages. Four experiments evaluate some of these differences. The prosody of English produced by native speakers of French is analyzed in rhythmically simple sentences and in rhythmically more complex sentences. The results show that lexical stress is less problematic than supra-lexical prosodic accentuation. In particular, it is shown that the fundamental frequency (F0) rises at the beginning and end of the accentual phrase (AP), which are typical of French, are a source of error in the prosody of English as a second language. It is shown, however, that this error, although noticed by native speakers of English, does not affect their perception of stress placement. The prosody of English produced by native speakers of Chinese is analyzed in terms of tone transfer and F0 peak alignment. The results indicate that Chinese speakers use Chinese tones when they produce English pitch accents; more specifically, the majority of speakers use tone 2 (rising tone) when they produce a rising pitch accent. The last experiment reveals that native speakers of Chinese align the pitch accent with its corresponding stressed syllable more strictly than native speakers of North American English do. The results of this thesis provide insight into how performance in foreign language prosody progresses. The conclusions have implications for the pedagogical content and format of pronunciation teaching.
    AUTHOR KEYWORDS: Phonology, Phonetics, Prosodic phonology, Prosody, Rhythm, ESL, Quebec French, French of France, Chinese

    Phonetic accommodation of human interlocutors in the context of human-computer interaction

    Phonetic accommodation refers to the phenomenon that interlocutors adapt their way of speaking to each other within an interaction. This can have a positive influence on the communication quality. As we increasingly use spoken language to interact with computers these days, the phenomenon of phonetic accommodation is also investigated in the context of human-computer interaction: on the one hand, to find out whether speakers adapt to a computer agent in a similar way as they do to a human interlocutor, on the other hand, to implement accommodation behavior in spoken dialog systems and explore how this affects their users. To date, the focus has been mainly on the global acoustic-prosodic level. The present work demonstrates that speakers interacting with a computer agent also identify locally anchored phonetic phenomena such as segmental allophonic variation and local prosodic features as accommodation targets and converge on them. To this end, we conducted two experiments. First, we applied the shadowing method, where the participants repeated short sentences from natural and synthetic model speakers. In the second experiment, we used the Wizard-of-Oz method, in which an intelligent spoken dialog system is simulated, to enable a dynamic exchange between the participants and a computer agent — the virtual language learning tutor Mirabella. The target language of our experiments was German. Phonetic convergence occurred in both experiments when natural voices were used as well as when synthetic voices were used as stimuli. Moreover, both native and non-native speakers of the target language converged to Mirabella. Thus, accommodation could be relevant, for example, in the context of computer-assisted language learning. Individual variation in accommodation behavior can be attributed in part to speaker-specific characteristics, one of which is assumed to be the personality structure. 
We included the Big Five personality traits as well as the concept of mental boundaries in the analysis of our data. Different personality traits influenced accommodation to different types of phonetic features. Mental boundaries have not been studied before in the context of phonetic accommodation. We created a validated German adaptation of a questionnaire that assesses the strength of mental boundaries. This adaptation can be used in future studies involving mental boundaries in native speakers of German. Deutsche Forschungsgemeinschaft (DFG), project number 278805297: "Phonetische Konvergenz in der Mensch-Maschine-Kommunikation" (Phonetic convergence in human-machine communication)

    Genre-specific persuasion in oral presentations : adaptation to the audience through multimodal persuasive strategies

    Product pitches, research dissemination talks and conference presentations are three oral genres that share important characteristics. Previous literature has described them as multimodal and persuasive oral genres and has shown that speakers resort to multimodal persuasive strategies to achieve their communicative goals. However, they are used in different contexts, which is likely to affect their use of multimodal persuasion, and raises questions as to how genre-specific persuasion is. The aim of this paper is to explore how speakers adapt their multimodal persuasive efforts to the communicative situation established in each genre, and how this is reflected multimodally. This study combines multimodal discourse analysis and ethnographic methods. The results suggest that speakers multimodally convey a different relationship with the audience in each genre