Random Regression Forests for Acoustic Event Detection and Classification

Author: Alfred Mertins
Huy Phan
Marco Maaß
Radoslaw Mazur
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/11/2014

Despite the success of the automatic speech recognition framework in its own application field, its adaptation to the problem of acoustic event detection has resulted in limited success. In this paper, instead of treating the problem similarly to the segmentation and classification tasks in speech recognition, we pose it as a regression task and propose an approach based on random forest regression. Furthermore, event localization in time can be efficiently handled as a joint problem. We first decompose the training audio signals into multiple interleaved superframes which are annotated with the corresponding event class labels and their displacements to the temporal onsets and offsets of the events. For a specific event category, a random-forest regression model is learned using the displacement information. Given an unseen superframe, the learned regressor will output the continuous estimates of the onset and offset locations of the events. To deal with multiple event categories, prior to the category-specific regression phase, a superframe-wise recognition phase is performed to reject the background superframes and to classify the event superframes into different event categories. While jointly posing event detection and localization as a regression problem is novel, the superior performance on two databases, ITC-Irst and UPC-TALP, demonstrates the efficiency and potential of the proposed approach.
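
The superframe annotation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Superframe`, `annotate_superframes`, and `estimate_boundaries`, the superframe length and hop values, the rule of labelling a superframe by its centre, and the simple averaging of votes are all assumptions made for illustration. In the paper, the displacements for unseen audio would come from a learned random-forest regressor; here the ground-truth displacements stand in for those predictions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Superframe:
    start: float                 # superframe start time (seconds)
    end: float                   # superframe end time (seconds)
    label: str                   # event class label, or "background"
    d_onset: Optional[float]     # distance from superframe centre back to the event onset
    d_offset: Optional[float]    # distance from superframe centre forward to the event offset


def annotate_superframes(duration: float,
                         events: List[Tuple[str, float, float]],
                         length: float = 0.5,
                         hop: float = 0.25) -> List[Superframe]:
    """Cut [0, duration] into overlapping superframes and annotate each one.

    `events` holds (label, onset, offset) tuples in seconds. `length` and
    `hop` are hypothetical values, not taken from the paper.
    """
    frames = []
    n = 0
    while n * hop + length <= duration + 1e-9:
        start = n * hop
        end = start + length
        centre = start + length / 2
        label, d_on, d_off = "background", None, None
        for ev_label, onset, offset in events:
            # Assumption: a superframe belongs to an event if its centre lies inside it.
            if onset <= centre <= offset:
                label, d_on, d_off = ev_label, centre - onset, offset - centre
                break
        frames.append(Superframe(start, end, label, d_on, d_off))
        n += 1
    return frames


def estimate_boundaries(frames: List[Superframe]) -> Optional[Tuple[float, float]]:
    """Turn per-superframe displacements back into one (onset, offset) estimate.

    Each event superframe votes for an onset and an offset; the votes are
    averaged. Averaging is an illustrative choice only.
    """
    votes = []
    for f in frames:
        if f.label == "background":
            continue
        centre = (f.start + f.end) / 2
        votes.append((centre - f.d_onset, centre + f.d_offset))
    if not votes:
        return None
    onset = sum(v[0] for v in votes) / len(votes)
    offset = sum(v[1] for v in votes) / len(votes)
    return onset, offset


frames = annotate_superframes(2.0, [("door_knock", 0.5, 1.0)])
boundaries = estimate_boundaries(frames)
```

In a fuller pipeline, the `(d_onset, d_offset)` pairs would serve as regression targets for one model per event category, and a separate classifier would first reject background superframes, as the abstract describes.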

CiteSeerX

Kent Academic Repository

Queen Mary Research Online

Acoustic event detection for multiple overlapping similar sources

Author: Clayton David
Stowell Dan
Publication date: 09/07/2015

Many current paradigms for acoustic event detection (AED) are not adapted to the organic variability of natural sounds, and/or they assume a limit on the number of simultaneous sources: often only one source, or one source of each type, may be active. These aspects are highly undesirable for applications such as bird population monitoring. We introduce a simple method modelling the onsets, durations and offsets of acoustic events to avoid intrinsic limits on polyphony or on inter-event temporal patterns. We evaluate the method in a case study with over 3000 zebra finch calls. In comparison against an HMM-based method we find it more accurate at recovering acoustic events, and more robust for estimating calling rates. Comment: Accepted for WASPAA 201

arXiv.org e-Print Archive

Crossref

Classification of Southern Ocean krill and icefish echoes using random forests

Author: Fallon Niall G.
Fernandes Paul G.
Fielding Sophie
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/09/2016

Acknowledgements: The authors thank the crews, fishers, and scientists who conducted the various surveys from which data were obtained. This work was supported by the Government of South Georgia and South Sandwich Islands. Additional logistical support was provided by The South Atlantic Environmental Research Institute, with thanks to Paul Brickle. PF receives funding from the MASTS pooling initiative (The Marine Alliance for Science and Technology for Scotland), and their support is gratefully acknowledged. MASTS is funded by the Scottish Funding Council (grant reference HR09011) and contributing institutions. SF is funded by the Natural Environment Research Council, and data were provided from the British Antarctic Survey Ecosystems Long-term Monitoring and Surveys programme as part of the BAS Polar Science for Planet Earth Programme. The authors also thank the anonymous referees for their helpful suggestions on an earlier version of this manuscript. Peer reviewed. Postprint

Aberdeen University Research

Heriot Watt Pure

NERC Open Research Archive

Eventness: Object Detection on Spectrograms for Temporal Localization of Audio Events

Author: Das Samarjit
Li Juncheng
Pham Phuong
Szurley Joseph
Publication date: 19/02/2018

In this paper, we introduce the concept of Eventness for audio event detection, which can, in part, be thought of as an analogue to Objectness from computer vision. The key observation behind the eventness concept is that audio events reveal themselves as 2-dimensional time-frequency patterns with specific textures and geometric structures in spectrograms. These time-frequency patterns can then be viewed analogously to objects occurring in natural images (with the exception that scaling and rotation invariance properties do not apply). With this key observation in mind, we pose the problem of detecting monophonic or polyphonic audio events as an equivalent visual object(s) detection problem under partial occlusion and clutter in spectrograms. We adapt a state-of-the-art visual object detection model to evaluate the audio event detection task on publicly available datasets. The proposed network has comparable results with a state-of-the-art baseline and is more robust on minority events. Provided large-scale datasets, we hope that our proposed conceptual model of eventness will be beneficial to the audio signal processing community towards improving performance of audio event detection. Comment: 5 pages, 3 figures, accepted to ICASSP 201

arXiv.org e-Print Archive

Crossref

Characterization of Ambient Noise

Author: Ramirez Rachel C.
Publication venue: AFIT Scholar
Publication date: 22/03/2018

An Air Force sponsor is interested in improving an acoustic detection model by providing better estimates on how to characterize the background noise of various environments. This would inform decision makers on the probability of acoustic detection of different systems of interest given different levels of noise. Data mining and statistical learning techniques are applied to a National Park Service acoustic summary data set to find overall trends over varying environments. Linear regression, conditional inference trees, and random forest techniques are discussed. Findings indicate that only sixteen geospatial variables at different resolutions are necessary to characterize the first ten ⅓ octave band frequencies of the L90 band using just the linear regression. The accuracy of the regression model is within 2 to 6 decibels and depends on the frequency of interest. This research is the first of its kind to apply multiple linear regression and a conditional inference tree to the National Park Service acoustic dataset for insights on predicting noise levels with dramatically fewer variables than needed in random forest algorithms. Recommended next steps are to supplement the National Park Service dataset with more geographic information system variables in common global databases, not unique to the United States.

AFIT Scholar (Air Force Institute of Technology)