257 research outputs found

    Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design?

    When convolutional neural networks are used to tackle learning problems based on music or, more generally, time series data, raw one-dimensional data are commonly pre-processed to obtain spectrogram or mel-spectrogram coefficients, which are then used as input to the actual neural network. In this contribution, we investigate, both theoretically and experimentally, the influence of this pre-processing step on the network's performance and ask whether replacing it with adaptive or learned filters applied directly to the raw data can improve learning success. The theoretical results show that approximately reproducing mel-spectrogram coefficients by applying adaptive filters and subsequent time-averaging is in principle possible. We also conducted extensive experimental work on the task of singing voice detection in music. The results of these experiments show that, for classification based on convolutional neural networks, the features obtained from adaptive filter banks followed by time-averaging perform better than the canonical Fourier-transform-based mel-spectrogram coefficients. Alternative adaptive approaches with center frequencies or time-averaging lengths learned from training data perform equally well. Comment: Completely revised version; 21 pages, 4 figures

    Time-frequency shift-tolerance and counterpropagation network with applications to phoneme recognition

    Human speech signals are inherently multi-component non-stationary signals. Recognition schemes for classification of non-stationary signals generally require some kind of temporal alignment to be performed. Examples of techniques used for temporal alignment include hidden Markov models and dynamic time warping. Attempts to incorporate temporal alignment into artificial neural networks have resulted in the construction of time-delay neural networks. The nonstationary nature of speech requires a signal representation that is dependent on time. Time-frequency signal analysis is an extension of conventional time-domain and frequency-domain analysis methods. Researchers have reported on the effectiveness of time-frequency representations to reveal the time-varying nature of speech. In this thesis, a recognition scheme is developed for temporal-spectral alignment of nonstationary signals by performing preprocessing on the time-frequency distributions of the speech phonemes. The resulting representation is independent of any amount of time-frequency shift and is time-frequency shift-tolerant (TFST). The proposed scheme does not require time alignment of the signals and has the additional merit of providing spectral alignment, which may have importance in recognition of speech from different speakers. A modification to the counterpropagation network is proposed that is suitable for phoneme recognition. The modified network maintains the simplicity and competitive mechanism of the counterpropagation network and has additional benefits of fast learning and good modelling accuracy. The temporal-spectral alignment recognition scheme and modified counterpropagation network are applied to the recognition task of speech phonemes. Simulations show that the proposed scheme has potential in the classification of speech phonemes which have not been aligned in time. 
To facilitate the research, an environment to perform time-frequency signal analysis and recognition using artificial neural networks was developed. The environment provides tools for time-frequency signal analysis and simulations of the counterpropagation network.

    Sparse Nonstationary Gabor Expansions - with Applications to Music Signals


    Motor Vibration Analysis for the Fault Diagnosis in Nonstationary Operating Conditions

    The reliability and performance of a system with minimum life-cycle cost have become quite prominent in engineering systems. With increasing industrial applications, machines are operating in intricate conditions with higher uncertainty, making them more vulnerable to system failure. This paper reports fault-related information for a brushless DC (BLDC) motor in nonstationary operating conditions and presents several analyses to diagnose the faults. Fault diagnosis is the most crucial part of system prognostics, which helps to increase the remaining useful life (RUL) and prevent catastrophic failures. Because a BLDC motor has both electrical and mechanical characteristics, it shows several faults in different operating conditions. These faults cause a significant change in the vibration of the motor. This paper deals with anomaly detection for a BLDC motor in nonstationary speed conditions using vibration signal analysis as well as the extraction of several Condition Indicators (CI).