
    Audio Signal Processing Using Time-Frequency Approaches: Coding, Classification, Fingerprinting, and Watermarking

    Audio signals are information-rich nonstationary signals that play an important role in our day-to-day communication, perception of the environment, and entertainment. Because of their nonstationary nature, time-only or frequency-only approaches are inadequate for analyzing these signals. A joint time-frequency (TF) approach is a better choice for processing them efficiently. In this digital era, compression, intelligent indexing for content-based retrieval, classification, and protection of digital audio content are a few of the areas that encompass a majority of audio signal processing applications. In this paper, we present a comprehensive array of TF methodologies that successfully address applications in all of the above-mentioned areas. A TF-based audio coding scheme with a novel psychoacoustic model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking are presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.
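
To illustrate why a joint time-frequency view suits nonstationary audio, here is a minimal sketch of a short-time Fourier transform applied to a synthetic chirp. It is not the method of the paper above; the function name `stft_magnitude`, the frame length, the hop size, and the test signal are all assumptions chosen for illustration.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude STFT: rows are time frames, columns are frequency bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))

# Hypothetical test signal: a 1-second linear chirp from 500 Hz to 3000 Hz.
fs = 8000
t = np.arange(fs) / fs
f0, f1 = 500.0, 3000.0
sweep_rate = (f1 - f0) / 1.0  # Hz per second
chirp = np.sin(2 * np.pi * (f0 * t + 0.5 * sweep_rate * t ** 2))

spec = stft_magnitude(chirp)
freqs = np.fft.rfftfreq(256, d=1 / fs)
# Dominant frequency in each frame; it rises over time with the chirp.
peak_track = freqs[spec.argmax(axis=1)]
```

A single Fourier transform of the whole chirp would only show that energy is spread between 500 Hz and 3000 Hz; `peak_track` additionally shows when each frequency occurs, which is the information a time-only or frequency-only analysis loses.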

    Correlated microtiming deviations in jazz and rock music

    Musical rhythms performed by humans typically show temporal fluctuations. While these have been characterized in simple rhythmic tasks, the nature of temporal fluctuations when several musicians perform music jointly in all its natural complexity remains an open question. To study such fluctuations in over 100 original jazz and rock/pop recordings played with and without a metronome, we developed a semi-automated workflow allowing the extraction of cymbal beat onsets with millisecond precision. Analyzing the inter-beat interval (IBI) time series revealed evidence for two long-range correlated processes characterized by power laws in the IBI power spectral densities. One process dominates on short timescales (t < 8 beats) and reflects microtiming variability in the generation of single beats. The other dominates on longer timescales and reflects slow tempo variations. Whereas the latter did not show differences between musical genres (jazz vs. rock/pop), the process on short timescales showed higher variability for jazz recordings, indicating that jazz makes stronger use of microtiming fluctuations within a measure than rock/pop. Our results elucidate principles of rhythmic performance and can inspire algorithms for artificial music generation. By studying microtiming fluctuations in original music recordings, we bridge the gap between minimalistic tapping paradigms and expressive rhythmic performances.
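
As a rough illustration of what a power law in an IBI power spectral density means, the sketch below estimates a single exponent β (PSD ∝ 1/f^β) from a log-log fit to a periodogram. This is not the authors' workflow: it fits one slope rather than the two regimes described above, and the function names, the synthetic data generator, and all parameter values are assumptions made for the example.

```python
import numpy as np

def power_law_exponent(ibi):
    """Estimate beta in PSD ~ 1/f**beta from a log-log fit to the periodogram."""
    x = np.asarray(ibi, dtype=float)
    x = x - x.mean()                          # remove the mean interval (tempo)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x))           # in cycles per beat
    keep = freqs > 0                          # skip the zero-frequency bin
    slope, _ = np.polyfit(np.log10(freqs[keep]), np.log10(psd[keep]), 1)
    return -slope

def synthetic_ibi(n, beta, mean_ibi=0.5, scale=0.01, seed=0):
    """Hypothetical IBI series: white noise shaped to a 1/f**beta spectrum."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    spectrum[0] = 0.0                         # no drift in the mean interval
    spectrum[1:] *= freqs[1:] ** (-beta / 2)  # amplitude ~ f**(-beta/2)
    noise = np.fft.irfft(spectrum, n)
    return mean_ibi + scale * noise / noise.std()

# Assumed example: 4096 beats around 0.5 s with 1/f-type fluctuations.
ibi = synthetic_ibi(4096, beta=1.0)
beta_hat = power_law_exponent(ibi)
```

With real recordings, the fit would be restricted to a chosen frequency band, for instance separately above and below the crossover near 8 beats, rather than applied to the whole spectrum as done here.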

    The assessment and development of methods in (spatial) sound ecology

    As vital ecosystems across the globe face uncharted pressure from climate change and industrial land use, understanding the processes driving ecosystem viability has never been more critical. Nuanced ecosystem understanding comes from well-collected field data and a wealth of associated interpretations. In recent years, the most popular methods of ecosystem monitoring have shifted from often damaging and labour-intensive manual data collection to automated methods of data collection and analysis. Sound ecology describes the school of research that uses information transmitted through sound to infer properties about an area's species, biodiversity, and health. In this thesis, we explore and develop state-of-the-art automated monitoring with sound, specifically relating to data storage practice and spatial acoustic recording and data analysis. In the first chapter, we explore the necessity and methods of ecosystem monitoring, focusing on acoustic monitoring, later exploring how and why sound is recorded and the current state-of-the-art in acoustic monitoring. Chapter one concludes with us setting out the aims and overall content of the following chapters. We begin the second chapter by exploring methods used to mitigate data storage expense, a widespread issue as automated methods quickly amass vast amounts of data which can be expensive and impractical to manage. Importantly, I explain how these data management practices are often used without their consequences being known, something I then address. Specifically, I present evidence that the most used data reduction methods (namely compression and temporal subsetting) have a surprisingly small impact on the information content of recorded sound compared to the method of analysis. This work also adds to the increasing evidence that deep learning-based methods of environmental sound quantification are more powerful and more robust to experimental variation than more traditional acoustic indices.
In the latter chapters, I focus on using multichannel acoustic recording for sound-source localisation. Knowing where a sound originated has a range of ecological uses, including counting individuals, locating threats, and monitoring habitat use. While an exciting application of acoustic technology, spatial acoustics has had minimal uptake owing to the expense, impracticality and inaccessibility of equipment. In my third chapter, I introduce MAARU (Multichannel Acoustic Autonomous Recording Unit), a low-cost, easy-to-use and accessible solution to this problem. I explain the software and hardware necessary for spatial recording and show how MAARU can be used to localise the direction of a sound to within ±10°. In the fourth chapter, I explore how MAARU devices deployed in the field can be used for enhanced ecosystem monitoring by spatially clustering individuals by calling directions for more accurate abundance approximations and crude species-specific habitat usage monitoring. Most literature on spatial acoustics cites the need for many accurately synced recording devices over an area. This chapter provides the first evidence of advances made with just one recorder. Finally, I conclude this thesis by restating my aims and discussing my success in achieving them. Specifically, in the thesis’ conclusion, I reiterate the contributions made to the field as a direct result of this work and outline some possible development avenues.

    The binned bispectrum estimator: template-based and non-parametric CMB non-Gaussianity searches

    We describe the details of the binned bispectrum estimator as used for the official 2013 and 2015 analyses of the temperature and polarization CMB maps from the ESA Planck satellite. The defining aspect of this estimator is the determination of a map bispectrum (3-point correlator) that has been binned in harmonic space. For a parametric determination of the non-Gaussianity in the map (the so-called fNL parameters), one takes the inner product of this binned bispectrum with theoretically motivated templates. However, as a complementary approach one can also smooth the binned bispectrum using a variable smoothing scale in order to suppress noise and make coherent features stand out above the noise. This allows one to look in a model-independent way for any statistically significant bispectral signal. This approach is useful for characterizing the bispectral shape of the galactic foreground emission, for which a theoretical prediction of the bispectral anisotropy is lacking, and for detecting a serendipitous primordial signal, for which a theoretical template has not yet been put forth. Both the template-based and the non-parametric approaches are described in this paper. Comment: LaTeX, 42 pages with 10 figures and JCAP macros. v2: corrected small mistake in section 5.3, changed colour scale of slice figures, other minor changes and additions, matches published version.

    A Parametric Sound Object Model for Sound Texture Synthesis

    This thesis deals with the analysis and synthesis of sound textures based on parametric sound objects. An overview is provided about the acoustic and perceptual principles of textural acoustic scenes, and technical challenges for analysis and synthesis are considered. Four essential processing steps for sound texture analysis are identified, and existing sound texture systems are reviewed, using the four-step model as a guideline. A theoretical framework for analysis and synthesis is proposed. A parametric sound object synthesis (PSOS) model is introduced, which is able to describe individual recorded sounds through a fixed set of parameters. The model, which applies to harmonic and noisy sounds, is an extension of spectral modeling and uses spline curves to approximate spectral envelopes, as well as the evolution of parameters over time. In contrast to standard spectral modeling techniques, this representation uses the concept of objects instead of concatenated frames, and it provides a direct mapping between sounds of different length. Methods for automatic and manual conversion are shown. An evaluation is presented in which the ability of the model to encode a wide range of different sounds has been examined. Although there are aspects of sounds that the model cannot accurately capture, such as polyphony and certain types of fast modulation, the results indicate that high quality synthesis can be achieved for many different acoustic phenomena, including instruments and animal vocalizations. In contrast to many other forms of sound encoding, the parametric model facilitates various techniques of machine learning and intelligent processing, including sound clustering and principal component analysis. Strengths and weaknesses of the proposed method are reviewed, and possibilities for future development are discussed.

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, a representative set of use-case descriptions with a related discussion of the requirements for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.