
    Feature Learning from Spectrograms for Assessment of Personality Traits

    Several methods have recently been proposed to analyze speech and automatically infer the personality of the speaker. These methods often rely on prosodic and other hand-crafted speech-processing features extracted with off-the-shelf toolboxes. To achieve high accuracy, numerous features are typically extracted using complex and highly parameterized algorithms. In this paper, a new method based on feature learning and spectrogram analysis is proposed to simplify the feature extraction process while maintaining a high level of accuracy. The proposed method learns a dictionary of discriminant features from patches extracted from the spectrogram representations of training speech segments. Each speech segment is then encoded using the dictionary, and the resulting feature set is used to classify personality traits. Experiments indicate that the proposed method achieves state-of-the-art results with a significant reduction in complexity compared to the most recent reference methods. The number of features and the difficulties linked to the feature extraction process are greatly reduced, as only one type of descriptor is used, for which the six parameters can be tuned automatically. In contrast, the simplest reference method uses four types of descriptors to which six functionals are applied, resulting in over 20 parameters to be tuned. (12 pages, 3 figures)
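    As a rough illustration of the pipeline described above, the sketch below learns a small dictionary from spectrogram patches with scikit-learn and encodes each segment as the average sparse code of its patches. The patch size, dictionary size, mean-pooling encoder, and linear SVM are illustrative assumptions, not the authors' exact configuration.

        import numpy as np
        from sklearn.feature_extraction.image import extract_patches_2d
        from sklearn.decomposition import MiniBatchDictionaryLearning
        from sklearn.svm import LinearSVC

        PATCH = (8, 8)  # assumed patch size in (frequency bins, time frames)

        def sample_patches(spec, max_patches=500, seed=0):
            # Sample 2-D patches from a (freq x time) spectrogram and flatten them.
            p = extract_patches_2d(spec, PATCH, max_patches=max_patches, random_state=seed)
            return p.reshape(len(p), -1)

        def encode(dico, spec):
            # Encode one speech segment as the mean sparse code over its patches.
            return dico.transform(sample_patches(spec)).mean(axis=0)

        def fit(train_specs, y, n_atoms=64):
            # train_specs: list of 2-D spectrogram arrays; y: per-segment trait labels.
            dico = MiniBatchDictionaryLearning(n_components=n_atoms,
                                               transform_algorithm='lasso_lars')
            dico.fit(np.vstack([sample_patches(s) for s in train_specs]))
            X = np.array([encode(dico, s) for s in train_specs])
            return dico, LinearSVC().fit(X, y)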

    How do you say ‘hello’? Personality impressions from brief novel voices

    On hearing a novel voice, listeners readily form personality impressions of that speaker. Accurate or not, these impressions are known to affect subsequent interactions; yet the underlying psychological and acoustical bases remain poorly understood. Furthermore, studies have hitherto focused on extended speech, as opposed to analysing the instantaneous impressions we form on first exposure. In this paper, through a mass online rating experiment, 320 participants rated 64 sub-second vocal utterances of the word ‘hello’ on one of 10 personality traits. We show that: (1) personality judgements of brief utterances from unfamiliar speakers are consistent across listeners; (2) a two-dimensional ‘social voice space’, with axes mapping Valence (Trust, Likeability) and Dominance, each driven by differing combinations of vocal acoustics, adequately summarises ratings in both male and female voices; and (3) a positive combination of Valence and Dominance results in increased perceived male vocal Attractiveness, whereas perceived female vocal Attractiveness is largely controlled by increasing Valence. Results are discussed in relation to the rapid evaluation of personality and, in turn, the intent of others, as being driven by survival mechanisms via approach or avoidance behaviours. These findings provide empirical bases for predicting personality impressions from acoustical analyses of short utterances and for generating desired personality impressions in artificial voices.
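    To make the dimensionality-reduction step concrete, here is a minimal sketch assuming a 64-voice by 10-trait matrix of mean ratings, as in the study: standardise the ratings and apply PCA, then inspect how much variance two components capture. The random placeholder data, and reading the two components as Valence and Dominance, are assumptions for illustration only.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.preprocessing import StandardScaler

        # Placeholder ratings: one mean rating per voice (64) per trait (10).
        ratings = np.random.default_rng(0).normal(size=(64, 10))

        z = StandardScaler().fit_transform(ratings)
        pca = PCA(n_components=2).fit(z)
        space = pca.transform(z)              # (64, 2) coordinates in the 'social voice space'
        print(pca.explained_variance_ratio_)  # variance captured by the two candidate axes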

    A survey on perceived speaker traits: personality, likability, pathology, and the first challenge

    The INTERSPEECH 2012 Speaker Trait Challenge aimed at providing a unified test-bed for perceived speaker traits – the first challenge of this kind: personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state of the art in these three fields of research and describe the three sub-challenges in terms of the challenge conditions, the baseline results provided by the organisers, and a new openSMILE feature set, which has been used for computing the baselines and which has been provided to the participants. Furthermore, we summarise the approaches and the results presented by the participants to show the various techniques that are currently applied to solve these classification tasks.
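    A challenge-style baseline of this kind can be sketched with the modern openSMILE Python wrapper. The wrapper ships ComParE_2016 and (e)GeMAPS rather than the 2012 challenge set, so ComParE_2016 stands in here as an assumption; file paths, labels, and the linear SVM are placeholders.

        import opensmile
        import pandas as pd
        from sklearn.svm import LinearSVC

        # One row of functionals (statistics over low-level descriptors) per file.
        smile = opensmile.Smile(
            feature_set=opensmile.FeatureSet.ComParE_2016,
            feature_level=opensmile.FeatureLevel.Functionals,
        )

        def featurize(wav_paths):
            return pd.concat([smile.process_file(p) for p in wav_paths])

        # X = featurize(train_wavs); clf = LinearSVC(C=1e-3).fit(X, y_train)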

    Big Data analytics to assess personality based on voice analysis

    Bachelor's thesis in Telecommunication Technologies and Services Engineering. When humans speak, the acoustic signal they produce encodes not only the linguistic message they wish to communicate, but also several other types of information about themselves and their states, offering glimpses of their personalities that listeners can pick up. As it is now common to film job candidates' interviews, the aim of this thesis is to explore possible correlations between speech features extracted from interviews and personality characteristics established by experts, and to try to predict a candidate's Big Five personality traits: Conscientiousness, Agreeableness, Neuroticism, Openness to Experience and Extraversion. The features were extracted from an original database of 44 video recordings of women acquired in 2020, plus 78 acquired in 2019 and earlier for a previous study. Even though many significant correlations were found in each year's dataset, many of them proved inconsistent across the two studies. Only Extraversion, and Openness to a more limited extent, showed a good number of clear correlations. Essentially, Extraversion was found to be related to variation in the slope of the pitch (usually at the end of sentences), which suggests that a more "singing" voice could be associated with a higher score. In addition, spectral entropy and roll-off measurements indicate that larger changes in the spectrum (which may also be related to more "singing" voices) could likewise be associated with greater Extraversion. Regarding the predictive models intended to estimate personality traits from the speech features, results were very limited in terms of accuracy and RMSE, and likewise in the scatter plots used to evaluate the regression models and the confusion matrices used for the classification models. Nevertheless, several results suggest some predictive capability, and Extraversion and Openness again proved the most predictable traits. Better outcomes were achieved when predictions were based on one specific feature rather than on all of them or a reduced group: this was the case for Openness, estimated through linear and logistic regression from the time spent above 90% of the variation range of the deltas of the spectral-module entropy, and for Extraversion, which correlates well with features describing variation in the decreasing slope of F0 and variations in the spectrum. Several machine learning algorithms were used for the predictions, including linear regression, logistic regression and random forests.
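    The kind of per-clip descriptors discussed above can be sketched with librosa. The overall F0 slope and spectral roll-off below are loose stand-ins for the thesis' exact measurements, and the per-clip expert trait scores are assumed to be given.

        import numpy as np
        import librosa
        from scipy.stats import pearsonr

        def clip_features(path):
            # F0 variability, overall F0 trend, and mean spectral roll-off per clip.
            y, sr = librosa.load(path, sr=16000)
            f0, voiced, _ = librosa.pyin(y, fmin=65, fmax=400, sr=sr)
            f0 = f0[voiced]                                   # keep voiced frames only
            slope = np.polyfit(np.arange(len(f0)), f0, 1)[0]  # linear F0 trend
            rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr).mean()
            return np.array([np.std(f0), slope, rolloff])

        def correlate(feats, trait_scores):
            # feats: (n_clips, 3); trait_scores: expert ratings, e.g. Extraversion.
            return [pearsonr(feats[:, j], trait_scores) for j in range(feats.shape[1])]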

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the keenly felt need to share know-how, objectives and results between areas that had until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years its initial topics have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Firenze, Italy.

    First impressions: A survey on vision-based apparent personality trait analysis

    Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have by far been the most frequently considered cues of information for analyzing personality. Recently, however, there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, we review aspects of subjectivity in data labeling and evaluation, as well as current datasets and the challenges organized to push research in the field forward.

    Emotional Prosody Processing in the Schizophrenia Spectrum.

    Emotional prosody processing impairment is proposed to be a main contributing factor in the formation of auditory verbal hallucinations in patients with schizophrenia. To evaluate this assumption, five experiments in healthy, highly schizotypal and schizophrenia populations are presented. The first part of the thesis seeks to reveal the neural underpinnings of emotional prosody comprehension (EPC) in a non-clinical population, as well as the modulation of prosodic abilities by hallucination traits. By revealing the brain representation of EPC, an overlap at the neural level between EPC and auditory verbal hallucinations (AVH) was strongly suggested. By assessing the influence of hallucinatory traits on EPC abilities, a continuum in the schizophrenia spectrum was established in which the highly schizotypal population mirrors the neurocognitive profile of schizophrenia patients. Moreover, by studying the relation between AVH and EPC in a non-clinical population, potential confounding effects of medication on the findings were minimized. The second part of the thesis assessed two EPC-related abilities in schizophrenia patients with and without hallucinations. First, voice identity recognition, a skill which relies on the analysis of some of the same acoustical features as EPC, was evaluated in patients and controls. Finally, the last study presented in the thesis assessed the influence that implicit processing of emotional prosody has on selective attention in patients and controls. Both patient studies demonstrate that voice identity recognition deficits, as well as abnormal modulation of selective attention by implicit emotional prosody, are related to hallucinations exclusively and not to schizophrenia in general. In the final discussion, a model in which EPC deficits are a crucial factor in the formation of AVH is evaluated. The experimental findings presented in the previous chapters strongly suggest that the perception of prosodic features is impaired in patients with AVH, resulting in aberrant perception of irrelevant auditory objects with emotionally salient prosody, which capture the attention of the hearer and whose source (speaker identity) cannot be recognized. Such impairments may be due to structural and functional abnormalities in a network whose central element is the superior temporal gyrus.

    Ubiquitous emotion-aware computing

    Emotions are a crucial element for personal and ubiquitous computing. What to sense and how to sense it, however, remain a challenge. This study explores the rare combination of speech, electrocardiogram, and a revised Self-Assessment Manikin to assess people’s emotions. 40 people watched 30 International Affective Picture System pictures in either an office or a living-room environment. Additionally, their personality traits neuroticism and extroversion and demographic information (i.e., gender, nationality, and level of education) were recorded. The resulting data were analyzed using both basic emotion categories and the valence-arousal model, which enabled a comparison between the two representations. The combination of heart rate variability and three speech measures (i.e., variability of the fundamental frequency of pitch (F0), intensity, and energy) explained 90% (p < .001) of the variance in the participants’ experienced valence-arousal, with 88% for valence and 99% for arousal (ps < .001). The six basic emotions could also be discriminated (p < .001), although the explained variance was much lower: 18–20%. Environment (or context), the personality trait neuroticism, and gender proved useful when a nuanced assessment of people’s emotions was needed. Taken together, this study provides a significant leap toward robust, generic, and ubiquitous emotion-aware computing.
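    A minimal sketch of the reported model family, assuming one feature row per participant (HRV plus the three speech measures) and synthetic targets in place of the study's data:

        import numpy as np
        from sklearn.linear_model import LinearRegression

        # Columns: [HRV, std(F0), intensity, energy]; one row per participant (40).
        X = np.random.default_rng(1).normal(size=(40, 4))
        # Synthetic valence target: a linear combination of the features plus noise.
        valence = X @ np.array([0.5, 0.3, 0.1, 0.2]) \
                  + 0.1 * np.random.default_rng(2).normal(size=40)

        model = LinearRegression().fit(X, valence)
        print(model.score(X, valence))  # R^2, analogous to the explained variance reported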