Sequence Classification Restricted Boltzmann Machines With Gated Units
For the classification of sequential data, dynamic Bayesian networks and recurrent neural networks (RNNs) are the preferred models: the former can explicitly model the temporal dependencies between variables, while the latter are capable of learning representations. The recurrent temporal restricted Boltzmann machine (RTRBM) is a model that combines these two features. However, learning and inference in RTRBMs can be difficult because of the exponential nature of its gradient computations when maximizing log likelihoods. In this article, first, we address this intractability by optimizing a conditional rather than a joint probability distribution when performing sequence classification. This results in the ``sequence classification restricted Boltzmann machine'' (SCRBM). Second, we introduce gated SCRBMs (gSCRBMs), which use an information processing gate, as an integration of SCRBMs with long short-term memory (LSTM) models. In the experiments reported in this article, we evaluate the proposed models on optical character recognition, chunking, and multiresident activity recognition in smart homes. The experimental results show that gSCRBMs achieve performance comparable to that of the state of the art in all three tasks. gSCRBMs require far fewer parameters than other recurrent networks with memory gates, in particular LSTMs and gated recurrent units (GRUs).
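The key idea of optimizing a conditional rather than a joint distribution can be illustrated with the discriminative RBM family this model builds on: each class label is scored by a free-energy term and the scores are normalised directly, so no intractable partition function over inputs is needed. The following is a minimal sketch under that assumption; all parameter names and shapes are hypothetical, not the paper's.

```python
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)  # numerically stable log(1 + e^z)

def drbm_class_posterior(x, W, U, b_class):
    """Class posterior p(y|x) of a discriminative RBM (sketch).

    x       : (n_visible,) input vector
    W       : (n_hidden, n_visible) visible-to-hidden weights
    U       : (n_hidden, n_classes) class-to-hidden weights
    b_class : (n_classes,) class biases

    Each class y is scored by b_class[y] + sum_j softplus(W_j . x + U[j, y]),
    and the scores are normalised with a softmax over the classes.
    """
    pre = W @ x                                            # (n_hidden,)
    scores = b_class + softplus(pre[:, None] + U).sum(axis=0)
    scores -= scores.max()                                 # softmax stabilisation
    p = np.exp(scores)
    return p / p.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=16)
W = rng.normal(scale=0.1, size=(8, 16))
U = rng.normal(scale=0.1, size=(8, 3))
b = np.zeros(3)
posterior = drbm_class_posterior(x, W, U, b)
```

Because the normalisation runs only over the (small) set of class labels, evaluating this posterior is cheap, which is exactly what makes the conditional objective tractable where the joint one is not.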
Emotion-Guided Music Accompaniment Generation Based on Variational Autoencoder
Music accompaniment generation is a crucial aspect of the composition process. Deep neural networks have made significant strides in this field, but it remains a challenge for AI to effectively incorporate human emotions to create beautiful accompaniments. Existing models struggle to characterize human emotions within neural network models while composing music. To address this issue, we propose the use of an easy-to-represent emotion flow model, the Valence/Arousal Curve, which makes emotional information compatible with the model through data transformation and enhances the interpretability of emotional factors by using a Variational Autoencoder as the model structure. Further, we use relative self-attention to maintain the structure of the music at the music-phrase level and to generate a richer accompaniment when combined with the rules of music theory.
Comment: Accepted by the International Joint Conference on Neural Networks 2023 (IJCNN 2023)
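A conditional VAE of this kind can be sketched as a standard encoder/reparameterise/decode pass in which the valence/arousal pair is concatenated onto the latent code before decoding. This is a toy forward pass with random placeholder weights, not the paper's trained architecture; the feature encoding and layer sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def forward_cvae(x, emotion, We, Wmu, Wlv, Wd):
    """One forward pass of a toy emotion-conditioned VAE (sketch).

    x       : (d_in,) a bar of music as a feature vector (hypothetical encoding)
    emotion : (2,) valence/arousal pair conditioning the decoder
    """
    h = np.tanh(We @ x)                       # encoder hidden layer
    mu, logvar = Wmu @ h, Wlv @ h             # Gaussian posterior parameters
    eps = rng.normal(size=mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps       # reparameterisation trick
    z_cond = np.concatenate([z, emotion])     # condition on the emotion curve
    x_hat = 1 / (1 + np.exp(-(Wd @ z_cond)))  # decoder with sigmoid output
    return x_hat, mu, logvar

d_in, d_h, d_z = 32, 16, 4
We  = rng.normal(scale=0.1, size=(d_h, d_in))
Wmu = rng.normal(scale=0.1, size=(d_z, d_h))
Wlv = rng.normal(scale=0.1, size=(d_z, d_h))
Wd  = rng.normal(scale=0.1, size=(d_in, d_z + 2))

x = rng.random(d_in)
x_hat, mu, logvar = forward_cvae(x, np.array([0.8, -0.2]), We, Wmu, Wlv, Wd)
```

Feeding the same latent code with a different valence/arousal pair changes only the conditioning input, which is what makes the emotional factor interpretable and controllable at generation time.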
An RNN-based Music Language Model for Improving Automatic Music Transcription
In this paper, we investigate the use of Music Language Models (MLMs) for improving Automatic Music Transcription performance. The MLMs are trained on sequences of symbolic polyphonic music from the Nottingham dataset. We train Recurrent Neural Network (RNN)-based models, as they are capable of capturing the complex temporal structure present in symbolic music data. Similar to the function of language models in automatic speech recognition, we use the MLMs to generate a prior probability for the occurrence of a sequence. The acoustic AMT model is based on probabilistic latent component analysis, and prior information from the MLM is incorporated into the transcription framework using Dirichlet priors. We test our hybrid models on a dataset of multiple-instrument polyphonic music and report a significant 3% improvement in F-measure compared to using an acoustic-only model.
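The general pattern of blending frame-wise acoustic probabilities with a symbolic prior can be illustrated with a simple weighted combination. Note this is only a generic illustration: the paper folds the MLM prior into the PLCA framework via Dirichlet hyperparameters rather than the geometric mixture shown here, and the weight `alpha` is an assumed knob.

```python
import numpy as np

def combine_with_prior(acoustic, prior, alpha=0.3):
    """Blend frame-wise acoustic note probabilities with an MLM prior.

    acoustic : (n_notes,) probabilities from the acoustic model
    prior    : (n_notes,) prior probabilities from the music language model
    alpha    : prior weight (hypothetical; set to 0 to ignore the prior)
    """
    combined = acoustic ** (1 - alpha) * prior ** alpha   # geometric mixture
    return combined / combined.sum()

# The prior strongly favours note 1; the blend pulls mass towards it.
acoustic = np.array([0.70, 0.20, 0.10])
prior    = np.array([0.10, 0.80, 0.10])
post = combine_with_prior(acoustic, prior)
```

The effect mirrors the role of language models in speech recognition: notes that the acoustic model finds ambiguous are disambiguated by how plausible they are as a symbolic continuation.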
Imposing Higher-Level Structure in Polyphonic Music Generation Using Convolutional Restricted Boltzmann Machines and Constraints
We introduce a method for imposing higher-level structure on generated polyphonic music. A Convolutional Restricted Boltzmann Machine (C-RBM) as a generative model is combined with gradient descent constraint optimisation to provide further control over the generation process. Among other things, this allows for the use of a "template" piece, from which some structural properties can be extracted and transferred as constraints to the newly generated material. The sampling process is guided with Simulated Annealing to avoid local optima, and to find solutions that both satisfy the constraints and are relatively stable with respect to the C-RBM. Results show that with this approach it is possible to control the higher-level self-similarity structure, the meter, and the tonal properties of the resulting musical piece, while preserving its local musical coherence.
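The annealed constrained sampling described above follows the standard simulated-annealing template: propose a local change, always accept improvements, and accept worsening moves with a probability that shrinks as the temperature decays. A minimal generic loop is sketched below; in the paper the cost would combine C-RBM stability with constraint penalties derived from the template piece, whereas here the cost function and the toy density constraint are placeholders.

```python
import math
import random

def simulated_annealing(cost, state, steps=2000, t0=1.0, seed=0):
    """Generic simulated-annealing loop over a binary state vector (sketch).

    Flips one cell at a time; worse moves are accepted with probability
    exp(-delta / T), where T decays geometrically over the run.
    """
    rng = random.Random(seed)
    cur, c_cur = list(state), cost(state)
    best, c_best = list(cur), c_cur
    for step in range(steps):
        temp = t0 * (0.995 ** step)
        i = rng.randrange(len(cur))
        cand = list(cur)
        cand[i] ^= 1                      # flip one piano-roll cell
        delta = cost(cand) - c_cur
        if delta <= 0 or rng.random() < math.exp(-delta / max(temp, 1e-12)):
            cur, c_cur = cand, c_cur + delta
        if c_cur < c_best:                # remember the best state seen
            best, c_best = list(cur), c_cur
    return best, c_best

# Toy constraint: match the note density of a hypothetical "template" piece.
TEMPLATE_DENSITY = 6
cost = lambda s: abs(sum(s) - TEMPLATE_DENSITY)
solution, final_cost = simulated_annealing(cost, [0] * 16)
```

Early high-temperature steps let the sampler escape local optima; the late low-temperature steps lock in a state that satisfies the constraint.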
Neural Probabilistic Models for Melody Prediction, Sequence Labelling and Classification
Data-driven sequence models have long played a role in the analysis and generation of musical information. Such models are of interest in computational musicology, computer-aided music composition, and tools for music education, among other applications. This dissertation begins with an experiment to model sequences of musical pitch in melodies with a class of purely data-driven predictive models collectively known as Connectionist models. It was demonstrated that a set of six such models could perform on par with, or better than, state-of-the-art n-gram models previously evaluated in an identical setting. A new model, known as the Recurrent Temporal Discriminative Restricted Boltzmann Machine (RTDRBM), was introduced in the process and found to outperform the rest of the models. A generalisation of this modelling task was also explored, which involved extending the set of musical features used as input by the models while still predicting pitch as before. The improvement in predictive performance which resulted from adding these new input features is encouraging for future work in this direction.
Based on the above success of the RTDRBM, its application was extended to a non-musical sequence labelling task, namely Optical Character Recognition. This extension involved a modification to the model's original prediction algorithm as a result of relaxing an assumption specific to the melody modelling task. The generalised model was evaluated on a benchmark dataset and compared against a set of 8 baseline models, where it fared better than all of them. Furthermore, a theoretical extension to an existing model which was also employed in the above pitch prediction task - the Discriminative Restricted Boltzmann Machine (DRBM) - was proposed. This led to three new variants of the DRBM (which originally contained Logistic Sigmoid hidden layer activations), with Hyperbolic Tangent, Binomial and Rectified Linear hidden layer activations respectively. The first two of these have been evaluated here on the benchmark MNIST dataset and shown to perform on par with the original DRBM.
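The activation functions behind three of the variants mentioned above differ only in the nonlinearity applied to the hidden pre-activations; in the actual DRBM variants the change propagates into the model's free-energy terms, which is not shown here. The Binomial variant is omitted from this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# The three nonlinearities underlying the DRBM variants discussed above.
activations = {
    "sigmoid": sigmoid,                        # original DRBM hidden units
    "tanh":    np.tanh,                        # Hyperbolic Tangent variant
    "relu":    lambda z: np.maximum(0.0, z),   # Rectified Linear variant
}

z = np.linspace(-3, 3, 7)                      # [-3, -2, -1, 0, 1, 2, 3]
outputs = {name: f(z) for name, f in activations.items()}
```

The choice of hidden nonlinearity changes the range and saturation behaviour of the hidden units (sigmoid in (0, 1), tanh in (-1, 1), ReLU unbounded above), which is what motivates evaluating the variants side by side.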
NBP 2.0: Updated Next Bar Predictor, an Improved Algorithmic Music Generator
Deep neural network advancements have enabled machines to produce melodies emulating human-composed music. However, the implementation of such machines is costly in terms of resources. In this paper, we present NBP 2.0, a refinement of the previous model, the next bar predictor (NBP), with two notable improvements: first, transforming each training instance to anchor all the notes to its musical scale, and second, changing the model architecture itself. NBP 2.0 maintains its straightforward and lightweight implementation, which is an advantage over the baseline models. Improvements were assessed using quantitative and qualitative metrics and, based on the results, the improvements from these changes are notable.
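One plausible reading of "anchoring notes to the musical scale" is key normalisation: every training instance is transposed to a common tonic so the model never has to learn the same pattern in twelve keys. The sketch below illustrates that reading with MIDI pitches; NBP 2.0's exact transform may differ, and the tonic-detection step is not shown.

```python
def anchor_to_c(midi_pitches, tonic_pitch_class):
    """Transpose a melody so its tonic lands on C (pitch class 0).

    midi_pitches      : list of MIDI note numbers for one training instance
    tonic_pitch_class : detected tonic, 0=C .. 11=B (detection not shown)

    Shifts by the smaller of the up/down transpositions to keep the
    melody near its original register.
    """
    if tonic_pitch_class <= 6:
        shift = -tonic_pitch_class          # transpose down
    else:
        shift = 12 - tonic_pitch_class      # transpose up
    return [p + shift for p in midi_pitches]

# A melody in D major (tonic pitch class 2) moved down to C major.
melody_d = [62, 66, 69, 74]          # D4, F#4, A4, D5
melody_c = anchor_to_c(melody_d, 2)
```

After this normalisation, intervals and scale degrees line up across the whole training set, which is a cheap way to reduce what a lightweight model has to learn.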