6 research outputs found

    Source ambiguity resolution of overlapped sounds in a multi-microphone room environment

    Get PDF
When several acoustic sources are simultaneously active in a meeting-room scenario, and both the positions of the sources and the identities of the time-overlapped sound classes have been estimated, the problem of assigning each source position to one of the sound classes still remains. This problem arises in the real-time system implemented in our smart-room, where it is assumed that up to two acoustic events may overlap in time and the source positions are relatively well separated in space. The position assignment system proposed in this work is based on the fusion of model-based log-likelihood ratios obtained after carrying out several different partial source separations in parallel. To perform the separation, frequency-invariant null-steering beamformers, which can work with a small number of microphones, are used. The experimental results using all six microphone arrays deployed in the room show a high assignment rate in our particular scenario.
    Peer Reviewed. Postprint (published version)
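The null-steering idea in this abstract can be illustrated with a simple per-frequency constrained design: unit gain toward the target direction and a hard null toward the interferer. This is a minimal sketch, not the frequency-invariant design used in the paper; the function name and parameters are illustrative assumptions.

```python
import numpy as np

def null_steer_weights(freqs, mic_pos, theta_target, theta_null, c=343.0):
    """Per-frequency weights for a linear array: unit gain toward
    theta_target and a null toward theta_null (angles measured from the
    array axis).  mic_pos: (M,) microphone coordinates in metres along
    the axis.  Returns an (F, M) complex weight matrix.  Hypothetical
    sketch only, not the paper's frequency-invariant design."""
    W = []
    for f in freqs:
        k = 2.0 * np.pi * f / c                              # wavenumber
        d_t = np.exp(-1j * k * mic_pos * np.cos(theta_target))
        d_n = np.exp(-1j * k * mic_pos * np.cos(theta_null))
        # Solve d_t^H w = 1 and d_n^H w = 0 (minimum-norm if M > 2).
        A = np.vstack([d_t.conj(), d_n.conj()])
        w, *_ = np.linalg.lstsq(A, np.array([1.0, 0.0]), rcond=None)
        W.append(w)
    return np.array(W)
```

Applying `W[f]` to the per-bin STFT snapshots of the microphone signals suppresses the interfering direction exactly at each frequency while passing the target; with only two microphones the two constraints determine the weights uniquely, which matches the abstract's point that a small number of microphones suffices.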

    Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays

    Get PDF
In the analysis of acoustic scenes, the occurring sounds often have to be detected in time, recognized, and localized in space. Usually, each of these tasks is done separately. In this paper, a model-based approach that carries them out jointly for the case of multiple simultaneous sources is presented and tested. The recognized event classes and their respective room positions are obtained with a single system that maximizes the combination of a large set of scores, each resulting from a different acoustic event model and a different beamformer output signal, which comes from one of several arbitrarily located small microphone arrays. Using a two-step method, experimental work is reported for a specific scenario consisting of meeting-room acoustic events, either isolated or overlapped with speech. Tests carried out with two datasets show the advantage of the proposed approach with respect to some usual techniques, and that the inclusion of estimated priors brings a further performance improvement.
    Comment: Computational acoustic scene analysis, microphone array signal processing, acoustic event detection
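The joint decision described in this abstract, maximizing a combination of per-model, per-beamformer scores over event classes and candidate room positions, can be sketched as a single argmax over a (class, position) grid. The tensor layout, helper name, and prior handling below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def joint_decode(scores, beam_to_pos, n_pos, log_prior=None):
    """scores: (n_beams, n_classes) log-likelihoods, one row per
    beamformer output; beam_to_pos[b] is the candidate position the
    b-th beamformer is steered at.  Returns the jointly most likely
    (class, position) pair.  Illustrative sketch only."""
    n_beams, n_cls = scores.shape
    combined = np.zeros((n_cls, n_pos))
    hit = np.zeros(n_pos, dtype=bool)
    for b in range(n_beams):
        p = beam_to_pos[b]
        combined[:, p] += scores[b]        # sum log-scores per position
        hit[p] = True
    combined[:, ~hit] = -np.inf            # positions with no beamformer
    if log_prior is not None:
        combined = combined + log_prior    # optional (class, position) priors
    c, p = np.unravel_index(np.argmax(combined), combined.shape)
    return int(c), int(p)
```

Passing a `log_prior` term mirrors the abstract's observation that including estimated priors over the source positions improves performance: the prior simply shifts the combined log-scores before the argmax.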

    Acoustic event detection and localization using distributed microphone arrays

    Get PDF
Automatic acoustic scene analysis is a complex task that involves several functionalities: detection (time), localization (space), separation, recognition, etc. This thesis focuses on both acoustic event detection (AED) and acoustic source localization (ASL) when several sources may be simultaneously present in a room. In particular, the experimental work is carried out in a meeting-room scenario. Unlike previous works, which either employed models of all possible sound combinations or additionally used video signals, this thesis tackles the time-overlapping sound problem by exploiting the signal diversity that results from the use of multiple microphone-array beamformers. The core of this thesis is a rather computationally efficient approach that consists of three processing stages. In the first stage, a set of (null-)steering beamformers is used to carry out diverse partial signal separations, using multiple arbitrarily located linear microphone arrays, each composed of a small number of microphones. In the second stage, each of the beamformer outputs goes through a classification step, which uses models for all the targeted sound classes (HMM-GMM, in the experiments). In the third stage, the classifier scores, whether intra- or inter-array, are combined using a probabilistic criterion (such as MAP) or a machine-learning fusion technique (the fuzzy integral (FI), in the experiments). This processing scheme is applied in the thesis to a set of problems of increasing complexity, defined by the assumptions made regarding the identities (plus time endpoints) and/or positions of the sounds. In fact, the thesis report starts with the problem of unambiguously mapping the identities to the positions, continues with AED (positions assumed) and ASL (identities assumed), and ends with the integration of AED and ASL in a single system, which does not need any assumption about identities or positions.
The evaluation experiments are carried out in a meeting-room scenario where two sources are temporally overlapped; one of them is always speech and the other is an acoustic event from a pre-defined set. Two different databases are used: one produced by merging signals actually recorded in the UPC's department smart-room, and another consisting of overlapping sound signals directly recorded in the same room in a rather spontaneous way. From the experimental results with a single array, it can be observed that the proposed detection system performs better than either the model-based system or a blind-source-separation-based system. Moreover, the product-rule-based combination and the FI-based fusion of the scores resulting from the multiple arrays improve the accuracies further. On the other hand, the posterior position assignment is performed with a very small error rate. Regarding ASL, and assuming an accurate AED system output, the 1-source localization performance of the proposed system is slightly better than that of the widely used SRP-PHAT system working in an event-based mode, and it performs significantly better than the latter in the more complex 2-source scenario. Finally, though the joint system suffers a slight degradation in classification accuracy with respect to the case where the source positions are known, it shows the advantage of carrying out the two tasks, recognition and localization, with a single system, and it allows the inclusion of information about the prior probabilities of the source positions. It is also worth noticing that, although the acoustic scenario used for experimentation is rather limited, the approach and its formalism were developed for a general case, where the number and identities of the sources are not constrained.
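The product-rule combination mentioned in the thesis abstract reduces, in the log domain, to summing the per-array class log-likelihoods before taking the argmax. A minimal sketch of that third fusion stage follows; the HMM-GMM scoring and the fuzzy-integral alternative are not reproduced here, and all names are illustrative.

```python
import numpy as np

def product_rule_fuse(log_scores):
    """log_scores: (n_arrays, n_classes) class log-likelihoods, one row
    per microphone-array classifier.  The product rule multiplies the
    per-array likelihoods, i.e. sums their logs, then picks the best
    class.  Sketch of the fusion stage only; names are illustrative."""
    fused = np.sum(log_scores, axis=0)
    return int(np.argmax(fused)), fused

# Three arrays, two classes: each row is one array's log-likelihoods.
ll = np.array([[-10.0, -12.0],
               [-11.0,  -9.5],
               [-10.5, -13.0]])
best, fused = product_rule_fuse(ll)   # fused = [-31.5, -34.5] -> class 0
```

Summing logs rather than multiplying raw likelihoods keeps the combination numerically stable, which matters when many arrays and long events are fused.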
