Sound content description is one of the aims of the MPEG-7 initiative. Although MPEG-7 focuses on the indexing and retrieval of audio, other content-based sound processing applications are waiting to be developed once we have a robust set of descriptors, structures for relating them to one another, and means of expressing semantic concerns about sound. Spectral modeling techniques provide one usable framework for extracting and organizing sound content descriptions. In this paper we introduce one particular approach to spectral modeling, present some sound descriptors that can be derived from it in order to develop sound descriptions, and discuss the features of a structure for organizing this information (a so-called "Description Scheme"). All of our current descriptors can be considered low- or mid-level; we therefore do not cover the high-level description of music (musical forms and styles, roles of characters in a movie, etc.), which is also relevant to MPEG-7. The proposed descriptors are the result of a sound analysis based on a spectral modeling technique, and for all of them we have devised automatic extraction procedures. The Description Scheme we present is intended to be generic: based on a hierarchical (and in some places recursive) structure, it can describe sound at multiple levels of detail, addressing both syntactic (structural) and semantic (content) ways of describing sound.