    A Recurrent Encoder-Decoder Approach with Skip-filtering Connections for Monaural Singing Voice Separation

    The objective of deep learning methods based on encoder-decoder architectures for music source separation is to approximate either ideal time-frequency masks or spectral representations of the target music source(s). The spectral representations are then used to derive time-frequency masks. In this work we introduce a method to directly learn time-frequency masks from an observed mixture magnitude spectrum. We employ recurrent neural networks and train them using prior knowledge only for the magnitude spectrum of the target source. To assess the performance of the proposed method, we focus on the task of singing voice separation. The results from an objective evaluation show that our proposed method provides comparable results to deep-learning-based methods which operate over complicated signal representations. Compared to previous methods that approximate time-frequency masks, our method improves the signal-to-distortion ratio by an average of 3.8 dB.
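    As a rough illustration of the conventional step the abstract mentions (deriving a time-frequency mask from a spectral estimate of the target), the sketch below computes a generic soft ratio mask and applies it to the mixture magnitude. It is not the skip-filtering network proposed in the paper; the function names `ratio_mask` and `apply_mask`, the clipping to [0, 1], and the `eps` constant are assumptions made for this example.

```python
def ratio_mask(target_mag, mixture_mag, eps=1e-8):
    """Soft mask per time-frequency bin: target / mixture, clipped to [0, 1].

    Both inputs are lists of frames, each frame a list of magnitudes.
    eps (an assumed constant) avoids division by zero in silent bins.
    """
    return [
        [min(1.0, t / (m + eps)) for t, m in zip(t_row, m_row)]
        for t_row, m_row in zip(target_mag, mixture_mag)
    ]


def apply_mask(mask, mixture_mag):
    """Element-wise product of the mask with the mixture magnitude."""
    return [
        [k * m for k, m in zip(k_row, m_row)]
        for k_row, m_row in zip(mask, mixture_mag)
    ]


# Toy 2-frame, 2-bin magnitude spectra (made-up numbers).
mixture = [[2.0, 4.0], [1.0, 0.0]]
target = [[1.0, 4.0], [0.5, 0.0]]

mask = ratio_mask(target, mixture)
estimate = apply_mask(mask, mixture)
```

    With these toy values the masked mixture reproduces the target magnitudes up to the small `eps` term. In practice the target magnitude is itself a network output rather than a known quantity.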

    Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras

    We introduce Kapre, Keras layers for audio and music signal preprocessing. Music research using deep neural networks requires a heavy and tedious preprocessing stage, for which audio processing parameters are often ignored in parameter optimisation. To solve this problem, Kapre implements time-frequency conversions, normalisation, and data augmentation as Keras layers. We report simple benchmark results, showing real-time on-GPU preprocessing adds a reasonable amount of computation. Comment: ICML 2017 machine learning for music discover

    The characteristics and effects of motivational music in exercise settings: The possible influence of gender, age, frequency of attendance, and time of attendance

    Background: The purpose of the present study was to investigate the characteristics and effects of motivational music in British gymnasia. The secondary purpose was to determine whether the characteristics and effects of motivational music were invariant in relation to gender, age, frequency of gymnasium attendance, and the time of day at which exercise participants attended gymnasia. Methods: Participants (n=532) from 29 David-Lloyd exercise facilities across Britain responded to a questionnaire that was designed to assess music preferences during exercise via two open-ended questions and one scaled-response item. Results: A content analysis of the questionnaire data yielded 45 analytic properties that were grouped into the following categories: Specific music factors, general music factors, music programme factors, delivery factors, televisual factors, personal factors, contextual factors, and psychophysical response factors. The relative incidence of these analytic properties across gender groups (male/female), age groups (16-26 yrs., 27-34 yrs., 35-45 yrs., 46+ yrs.), frequency of attendance groups (low, medium, high), and time of attendance groups (morning, afternoon, evening) was tested by use of χ² analyses. Of the personal variables tested, age exerted the greatest influence on musical preference during exercise; older participants expressed a preference for quieter, slower, and generally less overtly stimulative music. Conclusions: Music programmes that are prescribed to accompany exercise should be varied in terms of musical idiom and date of release. Such programmes will account for the preferences of different groups of exercise participants that attend gymnasia at different times of the day. Further, the music chosen should be characterised by a strong rhythmical component.

    A temporally-constrained convolutive probabilistic model for pitch detection

    A method for pitch detection which models the temporal evolution of musical sounds is presented in this paper. The proposed model is based on shift-invariant probabilistic latent component analysis, constrained by a hidden Markov model. The time-frequency representation of a produced musical note can be expressed by the model as a temporal sequence of spectral templates which can also be shifted over log-frequency. Thus, this approach can be effectively used for pitch detection in music signals that contain amplitude and frequency modulations. Experiments were performed using extracted sequences of spectral templates on monophonic music excerpts, where the proposed model outperforms a non-temporally constrained convolutive model for pitch detection. Finally, future directions are given for multipitch extensions of the proposed model.

    Time-frequency analysis on Gong Timor music using short-time Fourier transform and continuous wavelet transform

    Time-frequency analysis of Gong Timor music plays an important role in music signal-processing applications such as tone tracking and music transcription (music signal notation). The character of a Gong's sound varies with how the instrument is excited, for example with the player's feel, the set of Gongs used, and changes in tempo. Gong signals are more complex to analyse than those of Western musical instruments. This research uses a Gong instrument and two notations, and compares time-frequency analyses of Gong music obtained with the Short-time Fourier Transform (STFT), the Overlap Short-time Fourier Transform (OSTFT), and the Continuous Wavelet Transform (CWT). The STFT and OSTFT analyses use different windows and hop sizes, while the CWT uses the Morlet wavelet. The results show that the CWT performs better than the STFT methods.
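    To make the roles of window and hop size concrete, here is a minimal pure-Python sketch of an STFT magnitude computation with a Hann window and a naive DFT. It is illustrative only and is not the analysis code used in the study; the function names, the choice of a periodic Hann window, and the test signal (an 8 Hz sine sampled at 64 Hz) are assumptions of this example.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, win_size, hop_size):
    """Magnitude STFT: one list of win_size // 2 + 1 bins per frame.

    Frames start every hop_size samples; a hop smaller than the
    window gives overlapping frames.
    """
    window = hann(win_size)
    frames = []
    for start in range(0, len(signal) - win_size + 1, hop_size):
        frame = [signal[start + i] * window[i] for i in range(win_size)]
        bins = []
        for k in range(win_size // 2 + 1):
            # Naive DFT of the windowed frame at bin k.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / win_size)
                for n in range(win_size)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


# Assumed test signal: 8 Hz sine, 64 Hz sample rate, 2 seconds.
sample_rate = 64
signal = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(128)]

frames = stft_magnitude(signal, win_size=32, hop_size=16)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
```

    With a 32-sample window the bin spacing is 64 / 32 = 2 Hz, so the 8 Hz tone falls in bin 4 of every frame. A longer window narrows the bin spacing at the cost of time resolution, which is the trade-off that wavelet-based analyses such as the CWT handle differently.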