47 research outputs found

Perception of Alcoholic Intoxication in Speech

Author: Schiel Florian
Publication date: 01/01/2011

The ALC sub-challenge of the Interspeech Speaker State Challenge (ISSC) aims at the automatic classification of speech signals into intoxicated and sober speech. In this context we conducted a perception experiment on data derived from the same corpus to analyze human performance on the same task. The results show that humans still outperform comparable baseline results of ISSC. Female and male listeners perform on the same level, but there is strong evidence that intoxication is easier to recognize in female voices than in male voices. Prosodic features contribute to the decision of human listeners but do not seem to be dominant. In analogy to Doddington's zoo of speaker verification we find some evidence for the existence of lambs and goats but no wolves. Index Terms: alcoholic intoxication, speech perception, forced choice, intonation, Alcohol Language Corpus

CiteSeerX

Open Access LMU

RANSAC-based training data selection for speaker state recognition

Author: Bozkurt E.
Erdem Tanju
Erdem Ç. E.
Erzin E.
Publication venue: The International Speech Communications Association
Publication date: 01/01/2011

We present a Random Sampling Consensus (RANSAC) based training approach for the problem of speaker state recognition from spontaneous speech. Our system is trained and tested with the INTERSPEECH 2011 Speaker State Challenge corpora, which include the Intoxication and the Sleepiness Sub-challenges, where each sub-challenge defines a two-class classification task. We aim to perform a RANSAC-based training data selection coupled with Support Vector Machine (SVM) based classification to prune possible outliers in the training data. Our experimental evaluations indicate that RANSAC-based training data selection provides 66.32% and 65.38% unweighted average (UA) recall on the development and test sets for the Sleepiness Sub-challenge, respectively, and a slight improvement on the Intoxication Sub-challenge performance. TÜBİTAK; Türk Teleko
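
Note on the metric: the unweighted average (UA) recall quoted in the abstract above is the mean of the per-class recalls, so each class counts equally however many samples it has. The following is a minimal sketch of that computation only. It is not the authors' code, and the labels and predictions are invented for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally regardless of size."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    recalls = [hits[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced two-class example (not data from the challenge).
y_true = ["sleepy"] * 2 + ["alert"] * 8
y_pred = ["sleepy", "alert"] + ["alert"] * 8

ua_recall = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(ua_recall, accuracy)
```

In this toy case the minority class is recalled only half the time, so UA recall is lower than plain accuracy, which is why challenges with imbalanced classes tend to report it.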

eResearch@Ozyegin

Prediction of sleepiness ratings from voice by man and machine

Author: Beke A
Huckvale M
Ikushima M
Publication venue: 'The International Fiscal Association of Korea'
Publication date: 29/10/2020

This paper looks in more detail at the Interspeech 2019 computational paralinguistics challenge on the prediction of sleepiness ratings from speech. In this challenge, teams were asked to train a regression model to predict sleepiness from samples of the Düsseldorf Sleepy Language Corpus (DSLC). This challenge was notable because the performance of all entrants was uniformly poor, with even the winning system only achieving a correlation of r=0.37. We look at whether the task itself is achievable, and whether the corpus is suited to training a machine learning system for the task. We perform a listening experiment using samples from the corpus and show that a group of human listeners can achieve a correlation of r=0.7 on this task, although this is mainly by classifying the recordings into one of three sleepiness groups. We show that the corpus, because of its construction, confounds variation with sleepiness and variation with speaker identity, and this was the reason that machine learning systems failed to perform well. We conclude that sleepiness rating prediction from voice is not an impossible task, but that good performance requires more information about sleepy speech and its variability across listeners than is available in the DSLC corpus.
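
Note on the metric: the r values quoted in the abstract above are correlations between predicted and reference sleepiness ratings. Assuming these are Pearson correlations (the abstract does not say which coefficient is used), a minimal sketch of the computation looks like this. The function name and the ratings are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical reference and predicted ratings (not taken from the DSLC).
reference = [1, 3, 5, 7, 9]
predicted = [2, 2, 6, 6, 8]
print(round(pearson_r(reference, predicted), 3))
```

A value of 1 means the predictions rise in exact linear step with the reference ratings, and 0 means no linear relationship.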

Crossref

UCL Discovery

Annotation and detection of conflict escalation in political debates

Author: Kim Samuel
Valente Fabio
Vinciarelli Alessandro
Publication date: 01/01/2013

Conflict escalation in multi-party conversations refers to an increase in the intensity of conflict during conversations. Here we study annotation and detection of conflict escalation in broadcast political debates towards a machine-mediated conflict management system. In this regard, we label conflict escalation using crowd-sourced annotations and predict it with automatically extracted conversational and prosodic features. In particular, to annotate the conflict escalation we deploy two different strategies, i.e., indirect inference and direct assessment. In the direct assessment method, annotators watch and compare two consecutive clips during the annotation process; in the indirect inference method, each clip is independently annotated with respect to the level of conflict, and the level of conflict escalation is then inferred by comparing the annotations of two consecutive clips. Empirical results with 792 pairs of consecutive clips in classifying three types of conflict escalation, i.e., escalation, de-escalation, and constant, show that labels from direct assessment yield higher classification performance (45.3% unweighted accuracy (UA)) than those from indirect inference (39.7% UA), although the annotations from both methods are highly correlated (r = 0.74 in continuous values and 63% agreement in ternary classes).

Enlighten

The prediction of fatigue using speech as a biosignal

Author: A Vogel
G Åhsberg
J Krajewski
K Kaida
M Gillberg
M Rosekind
N Chawla
R Holte
T Åkerstedt
Publication venue: Third International Conference Statistical Speech and Language Processing
Publication date: 24/11/2015

Automatic systems for estimating operator fatigue have application in safety-critical environments. We develop and evaluate a system to detect fatigue from speech recordings collected from speakers kept awake over a 60-hour period. A binary classification system (fatigued/not-fatigued) based on time spent awake showed good discrimination, with 80% unweighted accuracy using raw features, and 90% with speaker-normalized features. We describe the data collection, feature analysis, machine learning and cross-validation used in the study. Results are promising for real-world applications in domains such as aerospace, transportation and mining where operators are in regular verbal communication as part of their normal working activities.
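
Note on speaker normalization: the abstract above does not say how the features were normalized per speaker. One common approach, assumed here purely for illustration, is to z-score each feature within each speaker so that differences between voices are removed before classification. The function name and the feature values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def speaker_normalize(values, speakers):
    """Z-score one feature within each speaker (an assumed normalization scheme)."""
    by_speaker = defaultdict(list)
    for value, speaker in zip(values, speakers):
        by_speaker[speaker].append(value)
    # Per-speaker mean and population standard deviation.
    stats = {s: (mean(vs), pstdev(vs)) for s, vs in by_speaker.items()}
    normalized = []
    for value, speaker in zip(values, speakers):
        mu, sigma = stats[speaker]
        # A speaker with no variation in the feature gets 0.0 everywhere.
        normalized.append((value - mu) / sigma if sigma > 0 else 0.0)
    return normalized


# Hypothetical pitch values in Hz for two speakers with different baselines.
pitch = [100, 120, 200, 240]
speaker_ids = ["a", "a", "b", "b"]
print(speaker_normalize(pitch, speaker_ids))
```

After normalization both speakers occupy the same scale, so a classifier sees each recording relative to that speaker's own baseline.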

Crossref

UCL Discovery

Furnariidae species recognition using speech-related features and machine learning

Author: Albornoz Enrique
León Evelina
Sarquis Juan A.
Vignolo Leandro
Publication date: 22/11/2016

The automatic classification of calling bird species is important to achieve more exhaustive environmental monitoring and to manage natural resources. Bird vocalizations make it possible to recognise new species, their natural history and macro-systematic relations, while automatic systems can speed up and improve the whole process. In this work, we use state-of-the-art features designed for speech and speaker state recognition to classify 25 species of the Furnariidae family. Since Furnariidae species inhabit the Litoral Paranaense region of Argentina (South America), this work could promote further research on the topic and the implementation of in-situ monitoring systems. Our analysis includes two widely-known classification techniques: random forest and support vector machines. The results are promising, near 86%, and were validated in a cross-validation scheme. Sociedad Argentina de Informática e Investigación Operativa (SADIO)