3 research outputs found
Anomalous Sound Detection using unsupervised and semi-supervised autoencoders and gammatone audio representation
Anomalous sound detection (ASD) is currently one of the most topical subjects in
the machine listening discipline. Unsupervised detection is attracting a lot of
interest due to its immediate applicability in many fields. In industrial
settings, for example, the early detection of malfunctions or damage in
machines can yield large savings and improve process efficiency. An
unsupervised ASD solution suits this problem because it is trained only on
audio of normal operation, so no machine has to be damaged merely to obtain
anomalous audio for the training stage. This paper proposes a novel framework based
on convolutional autoencoders (both unsupervised and semi-supervised) and a
Gammatone-based representation of the audio. The results obtained by these
architectures substantially exceed the results presented as a baseline.
Comment: Submitted to DCASE2020 Workshop, Workshop on Detection and
Classification of Acoustic Scenes and Events
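As a rough illustration of what a Gammatone-based representation can look like, the sketch below builds a few gammatone filters and computes the log energy of a signal in each band. It is not the paper's pipeline: the fourth-order filter, the Glasberg and Moore ERB formula, the 1.019 bandwidth factor, the centre frequencies, and the function names `erb_hz`, `gammatone_ir`, and `band_log_energies` are all assumptions chosen for the example.

```python
import math


def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth (Glasberg and Moore) at centre frequency fc_hz."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def gammatone_ir(fc_hz, sample_rate, duration_s=0.025, order=4):
    """Sampled impulse response t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), peak-normalised."""
    b = 1.019 * erb_hz(fc_hz)  # assumed bandwidth scaling
    n_samples = int(duration_s * sample_rate)
    ir = []
    for k in range(n_samples):
        t = k / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]


def band_log_energies(signal, centre_freqs_hz, sample_rate):
    """Log10 energy of the signal after filtering with each gammatone filter."""
    energies = []
    for fc in centre_freqs_hz:
        ir = gammatone_ir(fc, sample_rate)
        energy = 0.0
        # Direct causal convolution, truncated to the length of the input.
        for n in range(len(signal)):
            acc = 0.0
            for k in range(min(n + 1, len(ir))):
                acc += ir[k] * signal[n - k]
            energy += acc * acc
        energies.append(math.log10(energy + 1e-12))
    return energies


# Usage: a 1 kHz tone should excite the 1 kHz band most strongly.
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / sample_rate) for n in range(400)]
log_energies = band_log_energies(tone, [250.0, 1000.0, 3000.0], sample_rate)
```

Computing these band energies frame by frame would give a time-frequency image of the kind a convolutional autoencoder could take as input, although the exact front end used in the paper is not stated in the abstract.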
Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma
This paper proposes a novel optimization principle and its implementation for
unsupervised anomaly detection in sound (ADS) using an autoencoder (AE). The
goal of unsupervised-ADS is to detect unknown anomalous sound without training
data of anomalous sound. Use of an AE as a normal model is a state-of-the-art
technique for unsupervised-ADS. To decrease the false positive rate (FPR), the
AE is trained to minimize the reconstruction error of normal sounds and the
anomaly score is calculated as the reconstruction error of the observed sound.
Unfortunately, since this training procedure does not take into account the
anomaly score for anomalous sounds, the true positive rate (TPR) does not
necessarily increase. In this study, we define an objective function based on
the Neyman-Pearson lemma by considering ADS as a statistical hypothesis test.
The proposed objective function trains the AE to maximize the TPR under an
arbitrary low FPR condition. To calculate the TPR in the objective function, we
consider that the set of anomalous sounds is the complementary set of normal
sounds and simulate anomalous sounds by using a rejection sampling algorithm.
Through experiments using synthetic data, we found that the proposed method
improved the performance measures of ADS under low FPR conditions. In addition,
we confirmed that the proposed method could detect anomalous sounds in real
environments.
Comment: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 201
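The hypothesis-test view described above can be illustrated with a toy sketch: pick the score threshold so that the FPR on normal data stays at or below a target, simulate anomalies by rejection sampling from outside the normal region, and measure the TPR at that threshold. This is only an assumed 1-D stand-in, not the paper's objective function or sampler: the squared-value score replaces the AE reconstruction error, and the names `threshold_for_fpr`, `rate_above`, and `simulate_anomalies`, the uniform proposal, and the `abs(x) < 2.0` normality test are hypothetical.

```python
import random


def threshold_for_fpr(normal_scores, target_fpr):
    """Threshold such that at most target_fpr of normal scores exceed it (0 <= target_fpr < 1)."""
    ordered = sorted(normal_scores)
    allowed = int(target_fpr * len(ordered))  # normal samples allowed above the threshold
    return ordered[len(ordered) - allowed - 1]


def rate_above(scores, threshold):
    """Fraction of scores strictly above the threshold (FPR on normal data, TPR on anomalies)."""
    return sum(s > threshold for s in scores) / len(scores)


def simulate_anomalies(count, propose, looks_normal):
    """Rejection sampling: keep proposals only if they fall outside the normal set."""
    samples = []
    while len(samples) < count:
        candidate = propose()
        if not looks_normal(candidate):
            samples.append(candidate)
    return samples


def score(x):
    """Stand-in anomaly score; a real system would use the AE reconstruction error."""
    return x * x


rng = random.Random(0)
normal = [rng.gauss(0.0, 1.0) for _ in range(2000)]
threshold = threshold_for_fpr([score(x) for x in normal], target_fpr=0.05)

# Assumed proposal and normality test, chosen only for this toy example.
anomalies = simulate_anomalies(
    500,
    propose=lambda: rng.uniform(-6.0, 6.0),
    looks_normal=lambda x: abs(x) < 2.0,
)

fpr = rate_above([score(x) for x in normal], threshold)
tpr = rate_above([score(x) for x in anomalies], threshold)
```

In the paper the TPR term is part of the training objective for the AE; here it is only evaluated after the fact, which is enough to show how the FPR constraint and the simulated anomalies fit together.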
Inspection of Visible and Invisible Features of Objects with Image and Sound Signal Processing
Abstract: In this paper, we propose a new method that can inspect visual and non-visual features of objects simultaneously by using image and sound signal processing techniques. A hammering test is a method for discriminating a property of an object from the sound generated when it is struck with a hammer. This method can investigate non-visual features of objects such as their inner structure, e.g., the existence of defects and cracks inside them. However, it depends on human experience and skill. In addition, performing the test over a wide area of an object requires manually recording the hammering positions one by one. To solve these problems, this paper proposes a hammering test system consisting of two video cameras that can acquire image and sound signals of a hammering scene. The shape of the object (visual feature) is measured by image signal processing from the result of 3-D measurement of each hammering position, and the thickness or material (non-visual feature) is estimated by sound signal processing in the time and frequency domains. The validity of the proposed method is shown through experiments.
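To make the "time and frequency domains" step concrete, the sketch below extracts one feature of each kind from a hammering sound: a decay time and a dominant frequency. The abstract does not say which features the authors use, so these two features, the synthetic damped tone standing in for a recorded strike, and the names `synthetic_strike`, `dominant_frequency_hz`, and `decay_time_s` are all assumptions.

```python
import cmath
import math


def synthetic_strike(freq_hz, decay_per_s, sample_rate=8000, n_samples=512):
    """Damped sinusoid used here as a stand-in for a recorded hammering sound."""
    return [
        math.exp(-decay_per_s * i / sample_rate)
        * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def dominant_frequency_hz(samples, sample_rate):
    """Frequency-domain feature: the DFT bin (excluding DC) with the largest magnitude."""
    n = len(samples)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


def decay_time_s(samples, sample_rate, fraction=0.1):
    """Time-domain feature: time from the peak until |x| last exceeds fraction * peak."""
    magnitudes = [abs(x) for x in samples]
    peak = max(magnitudes)
    peak_index = magnitudes.index(peak)
    last_index = max(i for i, m in enumerate(magnitudes) if m >= fraction * peak)
    return (last_index - peak_index) / sample_rate


# Usage: two strikes with the same pitch but different damping.
ringing = synthetic_strike(500.0, 60.0)
damped = synthetic_strike(500.0, 200.0)
features = {
    "ringing": (dominant_frequency_hz(ringing, 8000), decay_time_s(ringing, 8000)),
    "damped": (dominant_frequency_hz(damped, 8000), decay_time_s(damped, 8000)),
}
```

Features like these could then be compared across hammering positions to separate thicknesses or materials, though any such mapping would have to be calibrated on real recordings.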