73 research outputs found

    An Xception Residual Recurrent Neural Network for Audio Event Detection and Tagging

    Get PDF
    (Abstract to follow

    A Lyrics-matching QBH System for Interactive Environments

    Get PDF
    (Abstract to follow

    Automatic Phrase Continuation from Guitar and Bass guitar Melodies

    Get PDF
    A framework is proposed for generating interesting, musically similar variations of a given monophonic melody. The focus is on pop/rock guitar and bass guitar melodies with the aim of eventual extensions to other instruments and musical styles. It is demonstrated here how learning musical style from segmented audio data can be formulated as an unsupervised learning problem to generate a symbolic representation. A melody is first segmented into a sequence of notes using onset detection and pitch estimation. A set of hierarchical, coarse-to-fine symbolic representations of the melody is generated by clustering pitch values at multiple similarity thresholds. The variance ratio criterion is then used to select the appropriate clustering levels in the hierarchy. Note onsets are aligned with beats, considering the estimated meter of the melody, to create a sequence of symbols that represent the rhythm in terms of onsets/rests and the metrical locations of their occurrence. A joint representation based on the cross-product of the pitch cluster indices and metrical locations is used to train the prediction model, a variable-length Markov chain. The melodies generated by the model were evaluated through a questionnaire by a group of experts, and received an overall positive response. </jats:p

    Utilizing Domain Knowledge in End-to-End Audio Processing

    Full text link
    End-to-end neural network based approaches to audio modelling are generally outperformed by models trained on high-level data representations. In this paper we present preliminary work that shows the feasibility of training the first layers of a deep convolutional neural network (CNN) model to learn the commonly-used log-scaled mel-spectrogram transformation. Secondly, we demonstrate that upon initializing the first layers of an end-to-end CNN classifier with the learned transformation, convergence and performance on the ESC-50 environmental sound classification dataset are similar to a CNN-based model trained on the highly pre-processed log-scaled mel-spectrogram features.Comment: Accepted at the ML4Audio workshop at the NIPS 201

    Deep Learning for Audio Signal Processing

    Full text link
    Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e. audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.Comment: 15 pages, 2 pdf figure

    Applying a learning design methodology in the flipped classroom approach – empowering teachers to reflect and design for learning

    Get PDF
    One of the recent developments in teaching that heavily relies on current technology is the “flipped classroom” approach. In a flipped classroom the traditional lecture and homework sessions are inverted. Students are provided with online material in order to gain necessary knowledge before class, while class time is devoted to clarifications and application of this knowledge. The hypothesis is that there could be deep and creative discussions when teacher and students physically meet. This paper discusses how the learning design methodology can be applied to represent, share and guide educators through flipped classroom designs. In order to discuss the opportunities arising by this approach, the different components of the Learning Design – Conceptual Map (LD-CM) are presented and examined in the context of the flipped classroom. It is shown that viewing the flipped classroom through the lens of learning design can promote the use of theories and methods to evaluate its effect on the achievement of learning objectives, and that it may draw attention to the employment of methods to gather learner responses. Moreover, a learning design approach can enforce the detailed description of activities, tools and resources used in specific flipped classroom models, and it can make educators more aware of the decisions that have to be taken and people who have to be involved when designing a flipped classroom. By using the LD-CM, this paper also draws attention to the importance of characteristics and values of different stakeholders (i.e. institutions, educators, learners, and external agents), which influence the design and success of flipped classrooms. Moreover, it looks at the teaching cycle from a flipped instruction model perspective and adjusts it to cater for the reflection loops educators are involved when designing, implementing and re-designing a flipped classroom. Finally, it highlights the effect of learning design on the guidance, representation and sharing of flipped designs and how such an effect can move forward research on the flipped classroom
    • …
    corecore