2,732 research outputs found
Feature Learning from Spectrograms for Assessment of Personality Traits
Several methods have recently been proposed to analyze speech and
automatically infer the personality of the speaker. These methods often rely on
prosodic and other hand-crafted speech-processing features extracted with
off-the-shelf toolboxes. To achieve high accuracy, numerous features are
typically extracted using complex and highly parameterized algorithms. In this
paper, a new method based on feature learning and spectrogram analysis is
proposed to simplify the feature extraction process while maintaining a high
level of accuracy. The proposed method learns a dictionary of discriminant
features from patches extracted in the spectrogram representations of training
speech segments. Each speech segment is then encoded using the dictionary, and
the resulting feature set is used to perform classification of personality
traits. Experiments indicate that the proposed method achieves state-of-the-art
results with a significant reduction in complexity when compared to the most
recent reference methods. The number of features, and difficulties linked to
the feature extraction process are greatly reduced as only one type of
descriptors is used, for which the 6 parameters can be tuned automatically. In
contrast, the simplest reference method uses 4 types of descriptors to which 6
functionals are applied, resulting in over 20 parameters to be tuned.Comment: 12 pages, 3 figure
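The patch-dictionary pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the patch size, dictionary size, and the k-means learner/histogram encoder are all assumptions.

```python
# Sketch: learn a dictionary of spectrogram patches with k-means,
# then encode a speech segment as a histogram of nearest-atom assignments.
import numpy as np

rng = np.random.default_rng(0)

def extract_patches(spec, size=8, n=200):
    """Sample n square patches from a (freq x time) spectrogram."""
    f, t = spec.shape
    ys = rng.integers(0, f - size, n)
    xs = rng.integers(0, t - size, n)
    return np.stack([spec[y:y+size, x:x+size].ravel() for y, x in zip(ys, xs)])

def kmeans(X, k=16, iters=20):
    """Plain k-means; the centroids form the dictionary of atoms."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return centers

def encode(spec, centers, size=8, n=100):
    """Encode a segment as a normalised histogram over dictionary atoms."""
    P = extract_patches(spec, size, n)
    labels = np.argmin(((P[:, None] - centers) ** 2).sum(-1), axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

# A toy random "spectrogram" stands in for a real speech segment.
spec = rng.random((64, 128))
D = kmeans(extract_patches(spec), k=16)
code = encode(spec, D)
print(code.shape)  # one fixed-length feature vector per segment
```

The resulting per-segment histogram would then feed an ordinary classifier for the personality traits.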
How do you say ‘hello’? Personality impressions from brief novel voices
On hearing a novel voice, listeners readily form personality impressions of that speaker. Accurate or not, these impressions are known to affect subsequent interactions; yet the underlying psychological and acoustical bases remain poorly understood. Furthermore, studies have hitherto focused on extended speech, as opposed to the instantaneous impressions we form on first exposure. In this paper, through a mass online rating experiment, 320 participants rated 64 sub-second vocal utterances of the word ‘hello’ on one of 10 personality traits. We show that: (1) personality judgements of brief utterances from unfamiliar speakers are consistent across listeners; (2) a two-dimensional ‘social voice space’, with axes mapping Valence (Trust, Likeability) and Dominance, each driven by differing combinations of vocal acoustics, adequately summarises ratings in both male and female voices; and (3) a positive combination of Valence and Dominance results in increased perceived male vocal Attractiveness, whereas perceived female vocal Attractiveness is largely controlled by increasing Valence. Results are discussed in relation to the rapid evaluation of personality and, in turn, the intent of others, as being driven by survival mechanisms via approach or avoidance behaviours. These findings provide empirical bases for predicting personality impressions from acoustical analyses of short utterances and for generating desired personality impressions in artificial voices.
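A low-dimensional rating space like the one described above can be recovered with PCA on a voices-by-traits matrix. The sketch below uses synthetic ratings, not the study's data; the sample sizes and the two hidden axes are assumptions for illustration only.

```python
# Sketch: recover a two-dimensional "social voice space" from a
# voices-by-traits rating matrix via PCA (SVD on centred data).
import numpy as np

rng = np.random.default_rng(1)

# Toy ratings: 64 voices rated on 10 personality traits,
# generated from two hidden dimensions plus noise.
n_voices, n_traits = 64, 10
latent = rng.normal(size=(n_voices, 2))      # hidden valence/dominance-like axes
loadings = rng.normal(size=(2, n_traits))    # how each axis drives each trait
ratings = latent @ loadings + 0.1 * rng.normal(size=(n_voices, n_traits))

X = ratings - ratings.mean(0)                # centre each trait column
U, S, Vt = np.linalg.svd(X, full_matrices=False)
space = U[:, :2] * S[:2]                     # 2-D coordinates for each voice

explained = (S[:2] ** 2).sum() / (S ** 2).sum()
print(space.shape, round(explained, 2))
```

If ratings really are organised along two dimensions, the first two components capture most of the variance, which is the sense in which the 2-D space "adequately summarises" the ratings.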
A survey on perceived speaker traits: personality, likability, pathology, and the first challenge
The INTERSPEECH 2012 Speaker Trait Challenge aimed at providing a unified test-bed for perceived speaker traits – the first challenge of this kind: personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state-of-the-art in these three fields of research and describe the three sub-challenges in terms of the challenge conditions, the baseline results provided by the organisers, and a new openSMILE feature set, which has been used for computing the baselines and which has been provided to the participants. Furthermore, we summarise the approaches and the results presented by the participants to show the various techniques that are currently applied to solve these classification tasks.
Big Data analytics to assess personality based on voice analysis
Bachelor's Thesis in Telecommunication Technologies and Services Engineering.
When humans speak, the acoustic signal they produce encodes not only the
linguistic message they wish to communicate, but also other information about
the speakers and their states, offering glimpses of their personality that
listeners can pick up. As filming job candidates' interviews is now a common
practice, the aim of this thesis is to explore possible correlations between
speech features extracted from interviews and personality characteristics
established by experts, and to try to predict the Big Five personality traits
of a candidate: Conscientiousness, Agreeableness, Neuroticism, Openness to
Experience, and Extraversion. The features were extracted from an original
database of video recordings of women: 44 acquired in 2020, and 78 acquired in
2019 and earlier for a previous study.
Although many significant correlations were found in each year's dataset, many
of them proved inconsistent across the two studies. Only extraversion, and to
a more limited extent openness, showed a substantial number of clear
correlations. Essentially, extraversion was found to be related to the
variation in the slope of the pitch (usually at the end of sentences), which
suggests that a more "singing" voice could be associated with a higher score.
In addition, spectral entropy and roll-off measurements also indicate that
larger changes in the spectrum (which may likewise be related to more
"singing" voices) could be associated with greater extraversion.
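Two of the features named above, spectral entropy and spectral roll-off, can be computed from a magnitude spectrum along the following lines. This is a hedged sketch: the 85% roll-off fraction and the FFT settings are common defaults, not the thesis's exact configuration.

```python
# Sketch: spectral entropy and spectral roll-off of a signal.
import numpy as np

def spectral_entropy(mag):
    """Shannon entropy (bits) of the normalised power spectrum."""
    p = mag ** 2
    p = p / p.sum()
    p = p[p > 0]                       # avoid log(0)
    return float(-(p * np.log2(p)).sum())

def spectral_rolloff(mag, freqs, frac=0.85):
    """Frequency below which `frac` of the spectral energy lies."""
    cum = np.cumsum(mag ** 2)
    idx = np.searchsorted(cum, frac * cum[-1])
    return float(freqs[idx])

# Toy signal: two sinusoids, one second at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
mag = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / sr)

ent = spectral_entropy(mag)
roll = spectral_rolloff(mag, freqs)
print(round(ent, 2), roll)
```

Tracking how these two quantities change from frame to frame (their deltas) is one way to quantify the "larger changes in the spectrum" mentioned above.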
Regarding the predictive modelling algorithms used to estimate personality
traits from the speech features obtained for the study, results were very
limited in terms of accuracy and RMSE, as confirmed by scatter plots for the
regression models and confusion matrices for the classification evaluation.
Nevertheless, several results suggest some predictive capability, and
extraversion and openness again proved to be the most predictable personality
traits. Better outcomes were achieved when predictions were based on one
specific feature rather than on all of them or on a reduced group; this was
the case for openness, estimated through linear and logistic regression from
the time spent above 90% of the variation range of the deltas of the spectral
entropy, and for extraversion, which correlates well with features relating to
variation in the decreasing slope of F0 and to variations in the spectrum.
Several machine learning algorithms were used for the predictions, such as
linear regression, logistic regression, and random forests.
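The single-feature prediction idea above can be sketched with ordinary least squares. The data here are synthetic, and the feature and trait names are illustrative placeholders, not the thesis's actual measurements.

```python
# Sketch: regress one personality trait on one acoustic feature (OLS),
# then report the slope, RMSE and explained variance.
import numpy as np

rng = np.random.default_rng(2)

n = 44                                   # same order as the 2020 cohort
f0_slope_var = rng.normal(size=n)        # stand-in single acoustic feature
extraversion = 3 + 0.8 * f0_slope_var + 0.3 * rng.normal(size=n)

# Fit y = a * x + b via least squares (design matrix with intercept column).
X = np.column_stack([f0_slope_var, np.ones(n)])
(a, b), *_ = np.linalg.lstsq(X, extraversion, rcond=None)

pred = a * f0_slope_var + b
rmse = float(np.sqrt(np.mean((pred - extraversion) ** 2)))
r2 = 1 - ((extraversion - pred) ** 2).sum() / \
        ((extraversion - extraversion.mean()) ** 2).sum()
print(round(a, 2), round(rmse, 2), round(r2, 2))
```

With a real dataset, comparing this single-feature fit against a fit on the full feature set (or a reduced group) reproduces the comparison described above.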
Models and Analysis of Vocal Emissions for Biomedical Applications
The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) was established in 1999 out of the strongly felt need to share know-how, objectives, and results between areas that until then had seemed quite distinct, such as bioengineering, medicine, and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years, the initial topics have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Florence, Italy.
First impressions: A survey on vision-based apparent personality trait analysis
© 2019 IEEE. Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have by far been the most widely considered cues of information for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that this sort of method could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push the research in the field, are reviewed.
Emotional Prosody Processing in the Schizophrenia Spectrum.
THESIS ABSTRACT
Emotional prosody processing impairment is proposed to be a main contributing factor in the formation of auditory verbal hallucinations in patients with schizophrenia. In order to evaluate this assumption, five experiments in healthy, highly schizotypal, and schizophrenia populations are presented. The first part of the thesis seeks to reveal the neural underpinnings of emotional prosody comprehension (EPC) in a non-clinical population, as well as the modulation of prosodic abilities by hallucination traits. By revealing the brain representation of EPC, an overlap at the neural level between EPC and auditory verbal hallucinations (AVH) was strongly suggested. By assessing the influence of hallucinatory traits on EPC abilities, a continuum in the schizophrenia spectrum, in which the highly schizotypal population mirrors the neurocognitive profile of schizophrenia patients, was established. Moreover, by studying the relation between AVH and EPC in a non-clinical population, potential confounding effects of medication influencing the findings were minimized. The second part of the thesis assessed two EPC-related abilities in schizophrenia patients with and without hallucinations. Firstly, voice identity recognition, a skill which relies on the analysis of some of the same acoustical features as EPC, was evaluated in patients and controls. Finally, the last study presented in the current thesis assessed the influence that implicit processing of emotional prosody has on selective attention in patients and controls. Both patient studies demonstrate that voice identity recognition deficits, as well as abnormal modulation of selective attention by implicit emotional prosody, are related to hallucinations exclusively and not to schizophrenia in general. In the final discussion, a model in which EPC deficits are a crucial factor in the formation of AVH is evaluated.
Experimental findings presented in the previous chapters strongly suggest that the perception of prosodic features is impaired in patients with AVH, resulting in aberrant perception of irrelevant auditory objects with salient emotional prosody, which capture the attention of the hearer and whose sources (speaker identity) cannot be recognized. Such impairments may be due to structural and functional abnormalities in a network that comprises the superior temporal gyrus as a central element.
Ubiquitous emotion-aware computing
Emotions are a crucial element for personal and ubiquitous computing. What to sense and how to sense it, however, remain a challenge. This study explores the rare combination of speech, electrocardiogram, and a revised Self-Assessment Manikin to assess people’s emotions. 40 people watched 30 International Affective Picture System pictures in either an office or a living-room environment. Additionally, their personality traits neuroticism and extroversion and demographic information (i.e., gender, nationality, and level of education) were recorded. The resulting data were analyzed using both basic emotion categories and the valence–arousal model, which enabled a comparison between the two representations. The combination of heart rate variability and three speech measures (i.e., variability of the fundamental frequency of pitch (F0), intensity, and energy) explained 90% (p < .001) of the participants’ experienced valence–arousal, with 88% for valence and 99% for arousal (ps < .001). The six basic emotions could also be discriminated (p < .001), although the explained variance was much lower: 18–20%. Environment (or context), the personality trait neuroticism, and gender proved to be useful when a nuanced assessment of people’s emotions was needed. Taken together, this study provides a significant leap toward robust, generic, and ubiquitous emotion-aware computing.
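The feature-fusion result above, several physiological and speech measures jointly explaining valence and arousal, can be illustrated with a multi-output least-squares sketch. All data here are synthetic stand-ins; neither the features nor the fitted model correspond to the study's.

```python
# Sketch: predict valence and arousal jointly from four fused measures
# (stand-ins for HRV, F0 variability, intensity, energy) and report the
# explained variance (R^2) per dimension.
import numpy as np

rng = np.random.default_rng(3)

n = 40                                           # 40 participants in the study
features = rng.normal(size=(n, 4))               # z-scored toy measures
W_true = rng.normal(size=(4, 2))                 # hidden feature -> affect map
valence_arousal = features @ W_true + 0.1 * rng.normal(size=(n, 2))

X = np.column_stack([features, np.ones(n)])      # add intercept column
W, *_ = np.linalg.lstsq(X, valence_arousal, rcond=None)
pred = X @ W

ss_res = ((valence_arousal - pred) ** 2).sum(0)
ss_tot = ((valence_arousal - valence_arousal.mean(0)) ** 2).sum(0)
r2 = 1 - ss_res / ss_tot                         # one R^2 per affect dimension
print(np.round(r2, 2))
```

Reporting R² separately per dimension mirrors how the abstract quotes distinct explained-variance figures for valence and for arousal.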
Phonetic Imitation from an Individual-Difference Perspective: Subjective Attitude, Personality and “Autistic” Traits
Numerous studies have documented the phenomenon of phonetic imitation: the process by which the production patterns of an individual become more similar on some phonetic or acoustic dimension to those of her interlocutor. Though social factors have been suggested as a motivator for imitation, few studies have established a tight connection between language-external factors and a speaker’s likelihood to imitate. The present study investigated the phenomenon of phonetic imitation using a within-subject design embedded in an individual-differences framework. Participants were administered a phonetic imitation task, which included two speech production tasks separated by a perceptual learning task, and a battery of measures assessing traits associated with Autism-Spectrum Condition, working memory, and personality. To examine the effects of subjective attitude on phonetic imitation, participants were randomly assigned to four experimental conditions, where the perceived sexual orientation of the narrator (homosexual vs. heterosexual) and the outcome (positive vs. negative) of the story depicted in the exposure materials differed. The extent of phonetic imitation by an individual is significantly modulated by the story outcome, as well as by the participant’s subjective attitude toward the model talker, the participant’s personality trait of openness, and the autistic-like trait associated with attention switching.