38,580 research outputs found
A Detailed Investigation into Low-Level Feature Detection in Spectrogram Images
Being the first stage of analysis within an image, low-level feature detection is a crucial step in the image analysis process and, as such, deserves suitable attention. This paper presents a systematic investigation into low-level feature detection in spectrogram images. The result of which is the identification of frequency tracks. Analysis of the literature identifies different strategies for accomplishing low-level feature detection. Nevertheless, the advantages and disadvantages of each are not explicitly investigated. Three model-based detection strategies are outlined, each extracting an increasing amount of information from the spectrogram, and, through ROC analysis, it is shown that at increasing levels of extraction the detection rates increase. Nevertheless, further investigation suggests that model-based detection has a limitation—it is not computationally feasible to fully evaluate the model of even a simple sinusoidal track. Therefore, alternative approaches, such as dimensionality reduction, are investigated to reduce the complex search space. It is shown that, if carefully selected, these techniques can approach the detection rates of model-based strategies that perform the same level of information extraction. The implementations used to derive the results presented within this paper are available online from http://stdetect.googlecode.com
Recommended from our members
A tutorial on cue combination and Signal Detection Theory: Using changes in sensitivity to evaluate how observers integrate sensory information
Many sensory inputs contain multiple sources of information (‘cues’), such as two sounds of different frequencies, or a voice heard in unison with moving lips. Often, each cue provides a separate estimate of the same physical attribute, such as the size or location of an object. An ideal observer can exploit such redundant sensory information to improve the accuracy of their perceptual judgments. For example, if each cue is modeled as an independent, Gaussian, random variable, then combining Ncues should provide up to a √N improvement in detection/discrimination sensitivity. Alternatively, a less efficient observer may base their decision on only a subset of the available information, and so gain little or no benefit from having access to multiple sources of information. Here we use Signal Detection Theory to formulate and compare various models of cue-combination, many of which are commonly used to explain empirical data. We alert the reader to the key assumptions inherent in each model, and provide formulas for deriving quantitative predictions. Code is also provided for simulating each model, allowing expected levels of measurement error to be quantified. Based on these results, it is shown that predicted sensitivity often differs surprisingly little between qualitatively distinct models of combination. This means that sensitivity alone is not sufficient for understanding decision efficiency, and the implications of this are discussed
Robust sound event detection in bioacoustic sensor networks
Bioacoustic sensors, sometimes known as autonomous recording units (ARUs),
can record sounds of wildlife over long periods of time in scalable and
minimally invasive ways. Deriving per-species abundance estimates from these
sensors requires detection, classification, and quantification of animal
vocalizations as individual acoustic events. Yet, variability in ambient noise,
both over time and across sensors, hinders the reliability of current automated
systems for sound event detection (SED), such as convolutional neural networks
(CNN) in the time-frequency domain. In this article, we develop, benchmark, and
combine several machine listening techniques to improve the generalizability of
SED models across heterogeneous acoustic environments. As a case study, we
consider the problem of detecting avian flight calls from a ten-hour recording
of nocturnal bird migration, recorded by a network of six ARUs in the presence
of heterogeneous background noise. Starting from a CNN yielding
state-of-the-art accuracy on this task, we introduce two noise adaptation
techniques, respectively integrating short-term (60 milliseconds) and long-term
(30 minutes) context. First, we apply per-channel energy normalization (PCEN)
in the time-frequency domain, which applies short-term automatic gain control
to every subband in the mel-frequency spectrogram. Secondly, we replace the
last dense layer in the network by a context-adaptive neural network (CA-NN)
layer. Combining them yields state-of-the-art results that are unmatched by
artificial data augmentation alone. We release a pre-trained version of our
best performing system under the name of BirdVoxDetect, a ready-to-use detector
of avian flight calls in field recordings.Comment: 32 pages, in English. Submitted to PLOS ONE journal in February 2019;
revised August 2019; published October 201
Mathematical and computer modeling of electro-optic systems using a generic modeling approach
The conventional approach to modelling electro-optic sensor systems is to develop separate models for individual systems or classes of system, depending on the detector technology employed in the sensor and the application. However, this ignores commonality in design and in components of these systems. A generic approach is presented for modelling a variety of sensor systems operating in the infrared waveband that also allows systems to be modelled with different levels of detail and at different stages of the product lifecycle. The provision of different model types (parametric and image-flow descriptions) within the generic framework can allow valuable insights to be gained
- …