3,160 research outputs found
A Detailed Investigation into Low-Level Feature Detection in Spectrogram Images
Being the first stage of analysis within an image, low-level feature detection is a crucial step in the image analysis process and, as such, deserves suitable attention. This paper presents a systematic investigation into low-level feature detection in spectrogram images. The result of which is the identification of frequency tracks. Analysis of the literature identifies different strategies for accomplishing low-level feature detection. Nevertheless, the advantages and disadvantages of each are not explicitly investigated. Three model-based detection strategies are outlined, each extracting an increasing amount of information from the spectrogram, and, through ROC analysis, it is shown that at increasing levels of extraction the detection rates increase. Nevertheless, further investigation suggests that model-based detection has a limitation—it is not computationally feasible to fully evaluate the model of even a simple sinusoidal track. Therefore, alternative approaches, such as dimensionality reduction, are investigated to reduce the complex search space. It is shown that, if carefully selected, these techniques can approach the detection rates of model-based strategies that perform the same level of information extraction. The implementations used to derive the results presented within this paper are available online from http://stdetect.googlecode.com
A Framework for Bioacoustic Vocalization Analysis Using Hidden Markov Models
Using Hidden Markov Models (HMMs) as a recognition framework for automatic classification of animal vocalizations has a number of benefits, including the ability to handle duration variability through nonlinear time alignment, the ability to incorporate complex language or recognition constraints, and easy extendibility to continuous recognition and detection domains. In this work, we apply HMMs to several different species and bioacoustic tasks using generalized spectral features that can be easily adjusted across species and HMM network topologies suited to each task. This experimental work includes a simple call type classification task using one HMM per vocalization for repertoire analysis of Asian elephants, a language-constrained song recognition task using syllable models as base units for ortolan bunting vocalizations, and a stress stimulus differentiation task in poultry vocalizations using a non-sequential model via a one-state HMM with Gaussian mixtures. Results show strong performance across all tasks and illustrate the flexibility of the HMM framework for a variety of species, vocalization types, and analysis tasks
A Framework for Bioacoustic Vocalization Analysis Using Hidden Markov Models
Using Hidden Markov Models (HMMs) as a recognition framework for automatic classification of animal vocalizations has a number of benefits, including the ability to handle duration variability through nonlinear time alignment, the ability to incorporate complex language or recognition constraints, and easy extendibility to continuous recognition and detection domains. In this work, we apply HMMs to several different species and bioacoustic tasks using generalized spectral features that can be easily adjusted across species and HMM network topologies suited to each task. This experimental work includes a simple call type classification task using one HMM per vocalization for repertoire analysis of Asian elephants, a language-constrained song recognition task using syllable models as base units for ortolan bunting vocalizations, and a stress stimulus differentiation task in poultry vocalizations using a non-sequential model via a one-state HMM with Gaussian mixtures. Results show strong performance across all tasks and illustrate the flexibility of the HMM framework for a variety of species, vocalization types, and analysis tasks
Local Visual Microphones: Improved Sound Extraction from Silent Video
Sound waves cause small vibrations in nearby objects. A few techniques exist
in the literature that can extract sound from video. In this paper we study
local vibration patterns at different image locations. We show that different
locations in the image vibrate differently. We carefully aggregate local
vibrations and produce a sound quality that improves state-of-the-art. We show
that local vibrations could have a time delay because sound waves take time to
travel through the air. We use this phenomenon to estimate sound direction. We
also present a novel algorithm that speeds up sound extraction by two to three
orders of magnitude and reaches real-time performance in a 20KHz video.Comment: Accepted to BMVC 201
Recommended from our members
An end-to-end framework for real-time automatic sleep stage classification.
Sleep staging is a fundamental but time consuming process in any sleep laboratory. To greatly speed up sleep staging without compromising accuracy, we developed a novel framework for performing real-time automatic sleep stage classification. The client-server architecture adopted here provides an end-to-end solution for anonymizing and efficiently transporting polysomnography data from the client to the server and for receiving sleep stages in an interoperable fashion. The framework intelligently partitions the sleep staging task between the client and server in a way that multiple low-end clients can work with one server, and can be deployed both locally as well as over the cloud. The framework was tested on four datasets comprising ≈1700 polysomnography records (≈12000 hr of recordings) collected from adolescents, young, and old adults, involving healthy persons as well as those with medical conditions. We used two independent validation datasets: one comprising patients from a sleep disorders clinic and the other incorporating patients with Parkinson's disease. Using this system, an entire night's sleep was staged with an accuracy on par with expert human scorers but much faster (≈5 s compared with 30-60 min). To illustrate the utility of such real-time sleep staging, we used it to facilitate the automatic delivery of acoustic stimuli at targeted phase of slow-sleep oscillations to enhance slow-wave sleep
- …