Search CORE

19,596 research outputs found

Learning sound representations using trainable COPE feature extractors

Author: Petkov Nicolai
Strisciuglio Nicola
Vento Mario
Publication venue: 'Elsevier BV'
Publication date: 22/03/2019
Field of study

Sound analysis research has mainly been focused on speech and music processing. The deployed methodologies are not suitable for analysis of sounds with varying background noise, in many cases with very low signal-to-noise ratio (SNR). In this paper, we present a method for the detection of patterns of interest in audio signals. We propose novel trainable feature extractors, which we call COPE (Combination of Peaks of Energy). The structure of a COPE feature extractor is determined using a single prototype sound pattern in an automatic configuration process, which is a type of representation learning. We construct a set of COPE feature extractors, configured on a number of training patterns. Then we take their responses to build feature vectors that we use in combination with a classifier to detect and classify patterns of interest in audio signals. We carried out experiments on four public data sets: MIVIA audio events, MIVIA road events, ESC-10 and TU Dortmund data sets. The results that we achieved (recognition rate equal to 91.71% on the MIVIA audio events, 94% on the MIVIA road events, 81.25% on the ESC-10 and 94.27% on the TU Dortmund) demonstrate the effectiveness of the proposed method and are higher than the ones obtained by other existing approaches. The COPE feature extractors have high robustness to variations of SNR. Real-time performance is achieved even when the value of a large number of features is computed.Comment: Accepted for publication in Pattern Recognitio

arXiv.org e-Print Archive

Proceedings - University of Groningen

Dissertations of the University of Groningen

Continuous Interaction with a Virtual Human

Author: A Gravano
A Kendon
A Nijholt
AC Norwine
AH Anderson
AW Black
Bart van Straalen
C Goodwin
C Goodwin
CC Lee
D Heylen
D Heylen
D Neiberg
D Neiberg
D Reidsma
Daniel Neiberg
Dennis Reidsma
DT Fujimoto
E Kurtic
E Schegloff
F Eyben
G Skantze
H Sacks
H Welbergen van
H Welbergen van
Herwin van Welbergen
HH Clark
HH Clark
HH Clark
I Kok de
Iwan de Kok
J Allwood
J Edlund
J Gustafson
JB Bavelas
JB Bavelas
JC Carletta
Khiet Truong
KR Thórisson
M Heldner
M Maat ter
M Schröder
M Schröder
M Schröder
M Thiebaux
MB Walker
MF McKinneya
N Ward
N Ward
P French
PT Brady
S Benus
S Duncan Jr
S Goldwater
S Kopp
S Kopp
Sathish Chandra Pammi
T Toda
V Manusov
Publication venue: University of Amsterdam
Publication date: 01/01/2010
Field of study

Attentive Speaking and Active Listening require that a Virtual Human be capable of simultaneous perception/interpretation and production of communicative behavior. A Virtual Human should be able to signal its attitude and attention while it is listening to its interaction partner, and be able to attend to its interaction partner while it is speaking – and modify its communicative behavior on-the-fly based on what it perceives from its partner. This report presents the results of a four week summer project that was part of eNTERFACE’10. The project resulted in progress on several aspects of continuous interaction such as scheduling and interrupting multimodal behavior, automatic classification of listener responses, generation of response eliciting behavior, and models for appropriate reactions to listener responses. A pilot user study was conducted with ten participants. In addition, the project yielded a number of deliverables that are released for public access

Springer - Publisher Connector

Publications at Bielefeld University

University of Twente Research Information