Robust feature representation for classification of bird song syllables
A novel feature set for low-dimensional signal representation, designed for classification or clustering of non-stationary signals with complex variation in time and frequency, is presented. The feature representation of a signal is given by the first left and right singular vectors of its ambiguity spectrum matrix. If the ambiguity matrix is of low rank, most signal information in the time direction is captured by the first right singular vector, while the signal's key frequency information is encoded by the first left singular vector. The resemblance of two signals is investigated by means of a suitable similarity assessment of the signals' respective singular vector pairs. Application of multitapers for the calculation of the ambiguity spectrum gives increased robustness to jitter and background noise, and a consequent improvement in performance, compared to estimation based on the ordinary single Hanning window spectrogram. The suggested feature-based signal compression is applied to a syllable-based analysis of a song from the bird species Great Reed Warbler and evaluated by comparison to manual auditive and/or visual signal classification. The results show that the proposed approach outperforms well-known approaches based on mel-frequency cepstral coefficients and spectrogram cross-correlation.
"Seeing Sound": Audio Classification with the Wigner-Wille Distribution and Convolutional Neural Networks
With big data becoming increasingly available, IoT hardware becoming widely adopted, and AI capabilities becoming more powerful, organizations are continuously investing in sensing. Data coming from sensor networks are currently combined with sensor fusion and AI algorithms to drive innovation in fields such as self-driving cars. Data from these sensors can be utilized in numerous use cases, including alerts in safety systems of urban settings for events such as gunshots and explosions. Moreover, diverse types of sensors, such as sound sensors, can be utilized in low-light conditions or at locations where a camera is not available. This paper investigates the potential of utilizing sound-sensor data in an urban context. Technically, we propose a novel approach to classifying sound data using the Wigner-Ville distribution and Convolutional Neural Networks. In this paper, we report on the performance of the approach on open-source datasets. The concept and work presented is based on my doctoral thesis, which was performed as part of the Engineering Doctorate program in Data Science at the University of Eindhoven, in collaboration with the Dutch National Police. Additional work on real-world datasets was performed during the thesis but is not presented here due to confidentiality.
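The core idea of "seeing sound" is to turn a 1-D audio signal into a 2-D time-frequency image that a CNN can classify. A minimal sketch of the discrete Wigner-Ville distribution follows; this is an illustrative implementation, not the paper's code, and the windowing and normalization conventions are assumptions.

```python
import numpy as np

def wigner_ville(x):
    """Discrete Wigner-Ville distribution of a (preferably analytic) signal x.
    Returns an (n_time x n_freq) real-valued matrix: for each time t,
    the FFT over lag tau of the kernel x(t+tau) * conj(x(t-tau))."""
    n = len(x)
    W = np.empty((n, n))
    tau_max = n // 2
    for t in range(n):
        # largest symmetric lag range that stays inside the signal
        m = min(t, n - 1 - t, tau_max - 1)
        taus = np.arange(-m, m + 1)
        kernel = np.zeros(n, dtype=complex)
        kernel[taus % n] = x[t + taus] * np.conj(x[t - taus])
        # the kernel is Hermitian in tau, so the FFT is real-valued
        W[t, :] = np.real(np.fft.fft(kernel))
    return W
```

The magnitude of `W` (or a log-scaled version) can then be fed to a CNN as a single-channel image, exactly as one would feed a spectrogram. For a complex exponential at normalized frequency f0, the distribution concentrates its energy at FFT bin 2*f0*n, reflecting the doubled-frequency convention of the Wigner-Ville kernel.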
Automatic acoustic detection of birds through deep learning: the first bird audio detection challenge
Assessing the presence and abundance of birds is important for monitoring specific species as well as overall ecosystem health. Many birds are most readily detected by their sounds, and thus passive acoustic monitoring is highly appropriate. Yet acoustic monitoring is often held back by practical limitations such as the need for manual configuration, reliance on example sound libraries, low accuracy, low robustness, and limited ability to generalise to novel acoustic conditions.
Here we report outcomes from a collaborative data challenge. We present new acoustic monitoring datasets, summarise the machine learning techniques proposed by challenge teams, conduct detailed performance evaluation, and discuss how such approaches to detection can be integrated into remote monitoring projects.
Multiple methods were able to attain performance of around 88% AUC (area under the ROC curve), much higher performance than previous general‐purpose methods.
With modern machine learning including deep learning, general-purpose acoustic bird detection can achieve very high retrieval rates in remote monitoring data, with no manual recalibration, and no pre-training of the detector for the target species or the acoustic conditions in the target environment.
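The challenge scored detectors by AUC (area under the ROC curve), the probability that a randomly chosen positive (bird-present) clip receives a higher score than a randomly chosen negative one. As a reminder of what the ~88% figure measures, here is a minimal sketch of AUC via that rank-sum identity; it is an illustration, not the challenge's evaluation code.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U identity:
    AUC = P(score of a random positive > score of a random negative),
    with ties counting as half a win."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```

An AUC of 0.88 therefore means that in 88% of positive/negative clip pairs the detector ranks the bird-present clip higher; 0.5 is chance and 1.0 is perfect ranking, independent of any particular decision threshold.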
Predicting and auralizing acoustics in classrooms
Although classrooms have fairly simple geometries, this type of room is known to cause problems when trying to predict their acoustics using room acoustics computer modeling. Some typical features from a room acoustics point of view are: parallel walls, low ceilings (the rooms are flat), uneven distribution of absorption, and most of the floor being covered with furniture, which at long distances acts as a set of scattering elements and at short distances provides strong specular components. The importance of diffraction and scattering is illustrated in numbers and by means of auralization, using ODEON 8 Beta.