Speaker segmentation and clustering

Author: Ajmera
Almpanidis
Barras
Bimbot
Campbell
Cettolo
Constantine Kotropoulos
Delacourt
Deller
Fiscus
Gales
Garofolo
Godfrey
Graff
Hansen
Harb
Hess
Huang
Jain
Kim
Know
Lapidot
Lu
Manjunath
Margarita Kotti
Meignier
Oppenheim
Pellom
Reynolds
Sondhi
Tranter
Vassiliki Moschou
Ververidis
Wang
Wu
Zhou
Zhu
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

This survey focuses on two challenging speech processing topics: speaker segmentation and speaker clustering. Speaker segmentation aims at finding speaker change points in an audio stream, whereas speaker clustering aims at grouping speech segments based on speaker characteristics. Model-based, metric-based, and hybrid speaker segmentation algorithms are reviewed. Concerning speaker clustering, deterministic and probabilistic algorithms are examined. A comparative assessment of the reviewed algorithms is undertaken, their advantages and disadvantages are indicated, insight into the algorithms is offered, and deductions as well as recommendations are given. Rich transcription and movie analysis are candidate applications that benefit from combined speaker segmentation and clustering. © 2007 Elsevier B.V. All rights reserved.

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository

Source localization and denoising: a perspective from the TDOA space

Author: Antonacci Fabio
Bestagini Paolo
Canclini Antonio
Compagnoni Marco
Sarti Augusto
Tubaro Stefano
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/03/2016

In this manuscript, we formulate the problem of denoising Time Differences of Arrival (TDOAs) in the TDOA space, i.e., the Euclidean space spanned by TDOA measurements. The method consists of pre-processing the TDOAs with the purpose of reducing the measurement noise. The complete set of TDOAs (i.e., TDOAs computed at all microphone pairs) is known to form a redundant set, which lies on a linear subspace in the TDOA space. Noise, however, prevents TDOAs from lying exactly on this subspace. We therefore show that TDOA denoising can be seen as a projection operation that suppresses the component of the noise that is orthogonal to that linear subspace. We then generalize the projection operator to cases where the set of TDOAs is incomplete. We analytically show that this operator improves the localization accuracy, and we further confirm this via simulation.
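The projection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a complete TDOA set, the sign convention tau_ij = t_i - t_j for arrival times t_i, and equal noise variance on every pair (so a plain orthogonal projection is used, with no weighting). The function names `project_tdoas` and `squared_error` and all numeric values are hypothetical. Under these assumptions, projecting onto the subspace of consistent TDOAs reduces to averaging: estimate each arrival time (up to a common offset) as the mean of the TDOAs involving that microphone, then rebuild the pairwise differences.

```python
import itertools
import random


def project_tdoas(tdoas, n_mics):
    """Orthogonally project a complete TDOA set onto the consistent subspace.

    tdoas maps each pair (i, j) with i < j to tau_ij = t_i - t_j (assumed
    convention). For the complete pair set the least-squares arrival times,
    up to a common offset, are t_hat = D^T tau / n_mics, where D is the
    pairwise difference matrix.
    """
    t_hat = [0.0] * n_mics
    for (i, j), tau in tdoas.items():
        t_hat[i] += tau / n_mics
        t_hat[j] -= tau / n_mics
    # Rebuild the pairwise differences from the estimated arrival times.
    return {(i, j): t_hat[i] - t_hat[j] for (i, j) in tdoas}


def squared_error(a, b):
    """Sum of squared differences between two TDOA dictionaries."""
    return sum((a[p] - b[p]) ** 2 for p in a)


random.seed(0)
n_mics = 4
pairs = list(itertools.combinations(range(n_mics), 2))

# Hypothetical arrival times (seconds) and the consistent TDOAs they imply.
arrival = [random.uniform(0.0, 0.01) for _ in range(n_mics)]
clean = {(i, j): arrival[i] - arrival[j] for (i, j) in pairs}

# Independent measurement noise pushes the TDOAs off the subspace.
noisy = {p: tau + random.gauss(0.0, 1e-4) for p, tau in clean.items()}
denoised = project_tdoas(noisy, n_mics)

print("squared error before:", squared_error(noisy, clean))
print("squared error after: ", squared_error(denoised, clean))
```

Because the clean TDOAs already lie on the subspace, the projection leaves them unchanged and removes only the noise component orthogonal to it, so the error after projection is never larger than before. Handling an incomplete pair set, as the abstract mentions, would need a different operator and is not covered by this sketch.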

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Vision-Based Production of Personalized Video

Author: Chatzis S.
Doulamis A.
Doulamis N.
Kosmopoulos D.I.
Makris A.
Middleton S.E.
Publication date: 01/01/2008

In this paper we present a novel vision-based system for the automated production of personalized video souvenirs for visitors in leisure and cultural heritage venues. Visitors are visually identified and tracked through a camera network. The system produces a personalized DVD souvenir at the end of a visitor’s stay, allowing visitors to relive their experiences. We analyze how we identify visitors by fusing facial and body features, how we track visitors, how the tracker recovers from failures due to occlusions, as well as how we annotate and compile the final product. Our experiments demonstrate the feasibility of the proposed approach.

CiteSeerX

Southampton (e-Prints Soton)

DSpace at NTUA

Acoustic Scene Classification

Author: Barchiesi D
Giannoulis D
Plumbley MD
Stowell D
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/11/2014

This work was supported by the Centre for Digital Music Platform (grant EP/K009559/1) and a Leadership Fellowship (EP/G007144/1), both from the United Kingdom Engineering and Physical Sciences Research Council.

arXiv.org e-Print Archive

Crossref

University of Surrey

Queen Mary Research Online

Surrey Research Insight