The "bag-of-frames" approach (BOF), which encodes audio signals as the
long-term statistical distribution of short-term spectral features, has been
commonly regarded as an effective and sufficient way to represent environmental
sound recordings (soundscapes) since its introduction in an influential 2007
article. The present paper describes a conceptual replication of this seminal
article using several new soundscape datasets, with results strongly questioning
the adequacy of the BOF approach for the task. We show that the good accuracy
originally reported with BOF likely results from a particularly favorable
dataset with low within-class variability, and that for more realistic
datasets, BOF in fact does not perform significantly better than a mere
one-point average of the signal's features. Soundscape modeling, therefore,
may not be the closed case it was once thought to be. Progress, we argue,
could lie in revisiting the problem of accounting for individual acoustical
events within each soundscape.
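
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes log band energies as a stand-in for the short-term spectral features and a single Gaussian (mean and covariance) as a stand-in for the long-term statistical distribution; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def frame_features(signal, frame_len=1024, hop=512, n_bands=8):
    """Short-term features: log energy in n_bands equal-width frequency
    bands (an assumed stand-in for the spectral features used in practice)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        bands = np.array_split(power, n_bands)
        # Small constant avoids log(0) on silent bands.
        feats[i] = np.log([band.sum() + 1e-12 for band in bands])
    return feats

def bag_of_frames(feats):
    """Long-term distribution of the frames, simplified here to a single
    Gaussian (mean, covariance). The order of the frames is discarded."""
    return feats.mean(axis=0), np.cov(feats, rowvar=False)

def one_point_average(feats):
    """Baseline from the abstract: one average feature vector per signal."""
    return feats.mean(axis=0)

# Synthetic noise stands in for a soundscape recording.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

feats = frame_features(signal)
mean, cov = bag_of_frames(feats)
avg = one_point_average(feats)
```

In this simplified form, the one-point average is exactly the mean of the bag-of-frames model, so the only extra information the BOF representation carries is the spread of the frames around that mean. Neither representation retains the timing of individual events.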

Aucouturier, Jean-Julien

Defreville, Boris

Lafay, Grégoire

Lagrange, Mathieu

English

arXiv


HAL-Univ-Nantes

The bag-of-frames approach: a not so sufficient model for urban soundscapes

arXiv.org e-Print Archive

