    Capturing emotions in voice: A comparative analysis of methodologies in psychology and digital signal processing

    People use their voices to communicate not only verbally but also emotionally. This article presents theories and methodologies that concern emotional vocalizations at the intersection of psychology and digital signal processing. Specifically, it demonstrates the encoding (production) and decoding (recognition) of emotional sounds, including the review and comparison of strategies in database design, parameterization, and classification. Whereas psychology predominantly focuses on the subjective recognition of emotional vocalizations, digital signal processing relies on automated and thus more objective vocal affect measures. The article aims to compare these two approaches and suggest methods of combining them to achieve a more complex insight into the vocal communication of emotions.

    Speech Emotion Recognition Based on Voice Fundamental Frequency

    The human voice is one of the basic means of communication, through which one can also easily convey an emotional state. This paper presents experiments on emotion recognition in human speech based on the fundamental frequency. The AGH Emotional Speech Corpus was used. This database consists of audio samples of seven emotions acted by 12 different speakers (6 female and 6 male). We explored phrases of all the emotions – all together and in various combinations. Fast Fourier Transformation and magnitude spectrum analysis were applied to extract the fundamental tone from the speech audio samples. After extracting several statistical features of the fundamental frequency, we studied whether they carry information on the emotional state of the speaker by applying different AI methods. The outcome data were analysed with the following classifiers from the WEKA collection of data mining algorithms: K-Nearest Neighbours with local induction, Random Forest, Bagging, JRip, and Random Subspace Method. The results prove that the fundamental frequency is a prospective choice for further experiments.
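
As a rough illustration of the kind of pipeline this abstract describes (FFT, magnitude spectrum, then statistics of the fundamental frequency), here is a minimal Python sketch. It is not the authors' implementation: the function names, the 60–400 Hz search band, the frame length, the hop size, and the choice of statistics are all assumptions made for the example, and the sketch does no voiced/unvoiced detection.

```python
import numpy as np

def estimate_f0(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Frequency (Hz) of the strongest magnitude-spectrum peak in [fmin, fmax].

    Assumption: the fundamental is the dominant component inside the band.
    """
    windowed = frame * np.hanning(len(frame))          # reduce spectral leakage
    magnitude = np.abs(np.fft.rfft(windowed))          # magnitude spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    band = (freqs >= fmin) & (freqs <= fmax)           # plausible speech F0 range
    return float(freqs[band][np.argmax(magnitude[band])])

def f0_statistics(signal, sample_rate, frame_len=2048, hop=512):
    """Frame-wise F0 estimates summarised as simple statistical features."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    f0 = np.array([estimate_f0(signal[s:s + frame_len], sample_rate)
                   for s in starts])
    return {
        "mean": float(f0.mean()),
        "std": float(f0.std()),
        "min": float(f0.min()),
        "max": float(f0.max()),
    }

# Usage on a synthetic 200 Hz tone (one second at 16 kHz).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 200.0 * t)
features = f0_statistics(tone, sample_rate)
```

With a 2048-sample frame at 16 kHz the frequency resolution is about 7.8 Hz, so the estimate for the synthetic tone lands on the FFT bin nearest 200 Hz rather than on 200 Hz exactly. A feature dictionary like this could then be passed to any classifier.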

    Reading rate in filmic audio description

    The study discussed in this article was carried out as a pilot study to assess the process, resources and data management scheme (Thabane et al., 2010) to be used in a large-scale experiment on filmic audio description (AD) reading rate. As part of this study we defined the reading rate in the filmic AD context. We described the characteristic features of Polish filmic AD scripts and recordings and examined the reading rate of Polish AD for three Polish fiction films: a comedy, a drama, and an action film. We calculated the average length of breath pauses and the maximum, minimum and average reading rate measured in characters per second (CPS) and words per minute (WPM) – two measures commonly used in audiovisual translation. The main finding of this study is the validation of the research procedure for testing the AD reading rate. We also computed the average reading rate for Polish filmic AD (179 WPM) and discovered that it changes depending on the film genre (167 WPM for drama, 182 for comedy and 189 for action). When it comes to breath pauses in Polish AD, we calculated their average length at 190 ms – a value much lower than expected for breath pauses in Polish. The results of our study are discussed in the context of research on speech tempo.
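
The two reading-rate measures named in this abstract can be computed as in the short Python sketch below. This is an illustration only: the function name is hypothetical, and the sketch assumes that characters include spaces and punctuation, that words are whitespace-separated tokens, and that the duration is the full spoken time of the segment. The study may define these differently.

```python
def reading_rate(text, duration_s):
    """Return (CPS, WPM) for a script segment read aloud in duration_s seconds.

    Assumptions: every character counts, including spaces and punctuation,
    and a word is any whitespace-separated token.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    characters = len(text)
    words = len(text.split())
    cps = characters / duration_s            # characters per second
    wpm = words * 60.0 / duration_s          # words per minute
    return cps, wpm

# Usage: a five-word segment read in two seconds.
cps, wpm = reading_rate("She opens the door slowly", 2.0)
```

Maximum, minimum and average rates for a film would follow from applying the same function to each AD segment and aggregating the results.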

    Speech Analysis as a Tool for Detection and Monitoring of Medical Conditions : A review

    The goal of this article is to present and compare recent approaches which use speech and voice analysis as biomarkers for screening tests and monitoring of some diseases. The article takes into account metabolic, respiratory, cardiovascular, endocrine, and nervous system disorders. A selection of articles was performed to identify studies that assess voice features quantitatively in selected disorders by acoustic and linguistic voice analysis. Information was extracted from each paper in order to compare various aspects of datasets, speech parameters, methods of applied analysis and obtained results. 110 research papers were reviewed and 47 databases were summarized. Speech analysis is a promising method for early diagnosis of certain disorders. Advanced computer voice analysis with machine learning algorithms combined with the widespread availability of smartphones allows diagnostic analysis to be conducted during the patient’s visit to the doctor or at the patient’s home during a telephone conversation. Speech analysis is a simple, low-cost, non-invasive and easy-to-provide method of medical diagnosis. These are remarkable advantages, but there are also disadvantages. The effectiveness of disease diagnoses varies from 65% up to 99%. For that reason it should be treated as a medical screening test and should be an indication of the need for classic medical tests.

    How Behavioral, Photographic, and Interactional Realism Influence the Sense of Co-Presence in VR. An Investigation with Psychophysiological Measurement

    The feeling of co-presence in VR depends on the realism of virtual agents. Our study explores how three dimensions of realism (visual appearance, behavior, and interactability) affect co-presence and Orienting Response (OR), measured using heart rate (HR) and skin conductance response (SCR). Moreover, we test whether HR and SCR can be used as measures of psychological concepts that describe virtual interactions like co-presence. Forty-five participants passively viewed virtual characters while their HR and SCR were recorded. Afterwards participants assessed the experience of interacting with the virtual agents. The interactability of the virtual characters increased co-presence, and so did heightened appearance realism, but only when the level of behavioral realism was high. High visual and behavioral realism led to an increase in SCR, while visual realism alone evoked deeper HR deceleration. Nonetheless, neither SCR nor HR correlated with any psychological concepts that describe virtual interactions. In conclusion, realism can increase both the co-presence and magnitude of the OR, yet physiological indices cannot reliably gauge the experience of interactions with virtual characters.