1,079 research outputs found

    Spectrogram classification using dissimilarity space

    In this work, we combine a Siamese neural network with different clustering techniques to generate a dissimilarity space that is then used to train an SVM for automated animal audio classification. The animal audio datasets used are (i) bird and (ii) cat sounds, both of which are freely available. We exploit different clustering methods to reduce the spectrograms in the dataset to a number of centroids, which are used to generate the dissimilarity space through the Siamese network. Once computed, the dissimilarity space provides a vector space representation of each pattern, which is then fed into a support vector machine (SVM) to classify a spectrogram by its dissimilarity vector. Our study shows that the proposed approach based on dissimilarity space performs well on both classification problems without ad-hoc optimization of the clustering methods. Moreover, results show that the fusion of CNN-based approaches applied to the animal audio classification problem works better than the stand-alone CNNs.
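
    The mapping into a dissimilarity space described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Euclidean distance stands in for the dissimilarity that the paper obtains from a trained Siamese network, the centroids are assumed to be given (the paper derives them by clustering spectrograms), and the names `euclidean`, `dissimilarity_vector` and `to_dissimilarity_space` are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Stand-in dissimilarity (assumption); the paper uses a Siamese network."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_vector(
    pattern: Vector,
    centroids: Sequence[Vector],
    dissim: Callable[[Vector, Vector], float] = euclidean,
) -> List[float]:
    """Represent one pattern by its dissimilarity to every centroid."""
    return [dissim(pattern, c) for c in centroids]


def to_dissimilarity_space(
    patterns: Sequence[Vector],
    centroids: Sequence[Vector],
    dissim: Callable[[Vector, Vector], float] = euclidean,
) -> List[List[float]]:
    """Map every pattern to a vector with one coordinate per centroid."""
    return [dissimilarity_vector(p, centroids, dissim) for p in patterns]


# Toy data: two hypothetical centroids and three patterns in 2-D.
centroids = [[0.0, 0.0], [3.0, 4.0]]
patterns = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
features = to_dissimilarity_space(patterns, centroids)
```

    Each row of `features` has one coordinate per centroid, so the dimensionality of the representation is set by the number of clusters. These rows are what would be passed to the SVM in place of the raw spectrograms.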

    Learning An Invariant Speech Representation

    Recognition of speech, and in particular the ability to generalize and learn from small sets of labelled examples like humans do, depends on an appropriate representation of the acoustic input. We formulate the problem of finding robust speech features for supervised learning with small sample complexity as a problem of learning representations of the signal that are maximally invariant to intraclass transformations and deformations. We propose an extension of a theory for unsupervised learning of invariant visual representations to the auditory domain and empirically evaluate its validity for voiced speech sound classification. Our version of the theory requires the memory-based, unsupervised storage of acoustic templates -- such as specific phones or words -- together with all the transformations of each that normally occur. A quasi-invariant representation for a speech segment can be obtained by projecting it onto each template orbit, i.e., the set of transformed signals, and computing the associated one-dimensional empirical probability distributions. The computations can be performed by modules of filtering and pooling, and extended to hierarchical architectures. In this paper, we apply a single-layer, multicomponent representation for phonemes and demonstrate improved accuracy and decreased sample complexity for vowel classification compared to standard spectral, cepstral and perceptual features. Comment: CBMM Memo No. 022, 5 pages, 2 figures
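
    The projection onto a template orbit followed by an empirical distribution can be illustrated with a toy sketch. It makes several assumptions that are not in the abstract: the transformation group is taken to be cyclic shifts of a short vector, the projection is a plain dot product, and the distribution is a fixed-range histogram. The function names are hypothetical.

```python
from typing import List, Sequence


def cyclic_orbit(template: Sequence[float]) -> List[List[float]]:
    """All cyclic shifts of a template (assumed transformation group)."""
    n = len(template)
    return [list(template[k:]) + list(template[:k]) for k in range(n)]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def orbit_histogram(
    signal: Sequence[float],
    orbit: Sequence[Sequence[float]],
    bins: int,
    lo: float,
    hi: float,
) -> List[float]:
    """Empirical distribution of the projections of `signal` onto an orbit."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for transformed in orbit:
        p = dot(signal, transformed)
        # Clamp out-of-range projections into the first or last bin.
        idx = min(bins - 1, max(0, int((p - lo) / width)))
        counts[idx] += 1
    return [c / len(orbit) for c in counts]


template = [1.0, 0.0, 0.5, 0.0]
orbit = cyclic_orbit(template)

signal = [1.0, 2.0, 0.0, -1.0]
shifted = signal[1:] + signal[:1]  # same signal under a cyclic shift

h_signal = orbit_histogram(signal, orbit, bins=6, lo=-3.0, hi=3.0)
h_shifted = orbit_histogram(shifted, orbit, bins=6, lo=-3.0, hi=3.0)
```

    Because the orbit contains every shift of the template, shifting the signal only permutes the set of projections, so `h_signal` and `h_shifted` coincide. With several templates, one such histogram per template would be concatenated into the feature vector.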

    Learning sound representations using trainable COPE feature extractors

    Sound analysis research has mainly been focused on speech and music processing. The deployed methodologies are not suitable for analysis of sounds with varying background noise, in many cases with very low signal-to-noise ratio (SNR). In this paper, we present a method for the detection of patterns of interest in audio signals. We propose novel trainable feature extractors, which we call COPE (Combination of Peaks of Energy). The structure of a COPE feature extractor is determined using a single prototype sound pattern in an automatic configuration process, which is a type of representation learning. We construct a set of COPE feature extractors, configured on a number of training patterns. Then we take their responses to build feature vectors that we use in combination with a classifier to detect and classify patterns of interest in audio signals. We carried out experiments on four public data sets: MIVIA audio events, MIVIA road events, ESC-10 and TU Dortmund. The results that we achieved (recognition rates of 91.71% on MIVIA audio events, 94% on MIVIA road events, 81.25% on ESC-10 and 94.27% on TU Dortmund) demonstrate the effectiveness of the proposed method and are higher than those obtained by other existing approaches. The COPE feature extractors are highly robust to variations of SNR. Real-time performance is achieved even when the values of a large number of features are computed. Comment: Accepted for publication in Pattern Recognition

    Study of radar signatures of drones equipped with threat payloads

    The authors acknowledge the funding received by the Army Research Laboratory under Cooperative Agreement Number: W911NF-19-2-0075. Commercial or customised drones with the ability to carry payloads have the potential to cause security threats, so the need to accurately detect and identify them with suitable sensors has increased in recent times. Radar sensors are well capable of detecting and classifying a drone by using the unique signatures produced by both the stationary and rotating parts of the target. In this study we have examined the radar signatures of drones carrying different types of payloads which simulate the following three hazardous scenarios: 1) liquid spray, 2) inertial forces simulating a gun recoil effect, and 3) heavy payloads. The main objective was to model the radar signatures of these scenarios and analyse the characteristic signatures. Two radars, operating at 24 GHz and 94 GHz, have been used to collect data to validate the modelling. The results of the study demonstrate that the payloads produce unique radar return signals, mainly in the Doppler domain, which can be used for robust classification.

    Knowledge-based fault detection using time-frequency analysis

    This work studies a fault detection method which analyzes sensor data for changes in their characteristics to detect the occurrence of faults in a dynamic system. The test system considered in this research is a Boeing-747 aircraft system and the faults considered are the actuator faults in the aircraft. The method is an alternative to conventional fault detection methods and does not rely on analytical mathematical models but acquires knowledge about the system through experiments. In this work, we test the concept that the energy distribution of resolution than the windowed Fourier transform. Verification of the proposed methodology is carried out in two parts. The first set of experiments considers the entire data as a single window. Results show that the method effectively classifies more than 85% of the indicators as correct detections. The second set of experiments verifies the method for online fault detection. It is observed that the mean detection delay was less than 8 seconds. We also developed a simple graphical user interface to run the online fault detection.

    High Voltage Insulation Surface Condition Analysis Using Time Frequency Distributions

    In high voltage engineering, insulation is the most important part for preventing the flow of current to undesired paths. Currently, polymeric insulation is widely used because it is light, easy to fabricate, and has good dielectric properties compared to traditional ceramic or non-polymeric insulation. In previous research, the leakage current frequency component has mainly been used to analyze the surface condition of polymeric insulation, and it is normally analyzed using the fast Fourier transform (FFT). However, this technique only presents spectral information and is not suitable for a leakage current signal that contains magnitude and frequency variations. Thus, a time-frequency analysis technique needs to be employed to provide both spectral and temporal information about the signal. This research presents the analysis of leakage current (LC) using time-frequency distributions (TFDs). TFDs such as the spectrogram and the S-transform are applied to represent the LC in a time-frequency representation (TFR). The parameters extracted from the TFR include root mean square current (RMS), total harmonic distortion (THD), total non-harmonic distortion (TnHD) and total current waveform distortion (TWD). A tracking and erosion test via the Incline Plane Test complying with BS EN60587-2007 is conducted to collect different leakage current patterns on polymeric and non-polymeric material. Furthermore, the performance of the TFDs is evaluated based on the accuracy of their TFRs, and the results show that the S-transform outperforms the spectrogram in terms of frequency and time resolution. Thus, the classification of leakage current using parameters from the S-transform can be implemented to determine material state and severity instantaneously.
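
    Two of the parameters named in this abstract, RMS and THD, can be illustrated on a single analysis window. The sketch below is not the paper's procedure, which derives the parameters from a spectrogram or S-transform over time. It assumes the textbook definition of THD (root sum of squared harmonic amplitudes divided by the fundamental amplitude), a 50 Hz fundamental, a 1 kHz sampling rate, and a window covering a whole number of cycles. All function names and the test signal are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def rms(x: Sequence[float]) -> float:
    """Root mean square value of one window of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def harmonic_amplitude(x: Sequence[float], fs: float, freq: float) -> float:
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT).

    Accurate when the window spans a whole number of periods of `freq`.
    """
    acc = sum(
        v * cmath.exp(-2j * math.pi * freq * i / fs) for i, v in enumerate(x)
    )
    return 2.0 * abs(acc) / len(x)


def thd(x: Sequence[float], fs: float, f0: float, n_harm: int = 5) -> float:
    """Total harmonic distortion using harmonics 2..n_harm of f0."""
    fundamental = harmonic_amplitude(x, fs, f0)
    harmonics = [harmonic_amplitude(x, fs, k * f0) for k in range(2, n_harm + 1)]
    return math.sqrt(sum(h * h for h in harmonics)) / fundamental


# Synthetic leakage-current-like window: fundamental plus a 10% third harmonic.
fs, f0, n = 1000.0, 50.0, 200  # 200 samples at 1 kHz = 10 full cycles
signal: List[float] = [
    math.sin(2 * math.pi * f0 * i / fs)
    + 0.1 * math.sin(2 * math.pi * 3 * f0 * i / fs)
    for i in range(n)
]

window_rms = rms(signal)
window_thd = thd(signal, fs, f0)
```

    Applying the same computation to successive short windows would give RMS and THD as functions of time, which is the kind of temporal information the abstract says a plain FFT of the whole record cannot provide.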