ATEPP: A DATASET OF AUTOMATICALLY TRANSCRIBED EXPRESSIVE PIANO PERFORMANCE
Computational models of expressive piano performance rely on attributes like tempo, timing, dynamics and pedalling. Despite some promising models for performance assessment and performance rendering, results are limited by the scale, breadth and uniformity of existing datasets. In this paper, we present ATEPP, a dataset that contains 1000 hours of performances of standard piano repertoire by 49 world-renowned pianists, organized and aligned by compositions and movements for comparative studies. Scores in MusicXML format are also available for around half of the tracks. We first evaluate and verify the use of transcribed MIDI for representing expressive performance with a listening evaluation that involves recent transcription models. Then, the process of sourcing and curating the dataset is outlined, including composition entity resolution and a pipeline for audio matching and solo filtering. Finally, we conduct baseline experiments for performer identification and performance rendering on our dataset, demonstrating its potential for generalizing the expressive features of individual performing styles.
Online Symbolic Music Alignment with Offline Reinforcement Learning
Symbolic Music Alignment is the process of matching performed MIDI notes to
corresponding score notes. In this paper, we introduce a reinforcement learning
(RL)-based online symbolic music alignment technique. The RL agent - an
attention-based neural network - iteratively estimates the current score
position from local score and performance contexts. For this symbolic alignment
task, environment states can be sampled exhaustively and the reward is dense,
rendering a formulation as a simplified offline RL problem straightforward. We
evaluate the trained agent in three ways. First, in its capacity to identify
correct score positions for sampled test contexts; second, as the core
technique of a complete algorithm for symbolic online note-wise alignment; and
finally, as a real-time symbolic score follower. We further investigate the
pitch-based score and performance representations used as the agent's inputs.
To this end, we develop a second model, a two-step Dynamic Time Warping
(DTW)-based offline alignment algorithm leveraging the same input
representation. The proposed model outperforms a state-of-the-art reference
model of offline symbolic music alignment.
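The DTW-based offline alignment idea above can be sketched with a minimal pitch-based dynamic time warping pass. The cost function (absolute pitch distance) and the toy note sequences below are illustrative assumptions, not the paper's exact representation or its two-step algorithm.

```python
# A minimal sketch of pitch-based DTW alignment between a score note
# sequence and a performed note sequence (MIDI pitch numbers).
# Cost function and example data are illustrative assumptions.

def dtw_align(score, perf, cost=lambda a, b: abs(a - b)):
    """Return the DTW path matching score indices to performance indices."""
    n, m = len(score), len(perf)
    INF = float("inf")
    # dp[i][j]: minimal cumulative cost aligning score[:i] with perf[:j]
    dp = [[INF] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(score[i - 1], perf[j - 1])
            dp[i][j] = c + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    # Backtrack from the end to recover the alignment path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        i, j = min((dp[i - 1][j - 1], (i - 1, j - 1)),
                   (dp[i - 1][j], (i - 1, j)),
                   (dp[i][j - 1], (i, j - 1)))[1]
    return path[::-1]

# The performance repeats one note (an extra 62) mid-sequence;
# DTW absorbs it by matching two performed notes to one score note.
score = [60, 62, 64, 65, 67]
perf = [60, 62, 62, 64, 65, 67]
print(dtw_align(score, perf))
# → [(0, 0), (1, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
```

Note how the pair (1, 2) maps the duplicated performed note back onto the single score note, which is the kind of note-wise correspondence an offline symbolic aligner has to produce.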
Reconstructing Human Expressiveness in Piano Performances with a Transformer Network
Capturing intricate and subtle variations in human expressiveness in music
performance using computational approaches is challenging. In this paper, we
propose a novel approach for reconstructing human expressiveness in piano
performance with a multi-layer bi-directional Transformer encoder. To address
the needs for large amounts of accurately captured and score-aligned
performance data in training neural networks, we use transcribed scores
obtained from an existing transcription model to train our model. We integrate
pianist identities to control the sampling process and explore the ability of
our system to model variations in expressiveness for different pianists. The
system is evaluated through statistical analysis of generated expressive
performances and a listening test. Overall, the results suggest that our method
achieves state-of-the-art performance in generating human-like piano performances from
transcribed scores, while fully and consistently reconstructing human
expressiveness poses further challenges.
Comment: 12 pages, 5 figures, submitted to CMMR 202
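The idea of integrating pianist identities to control the sampling process can be illustrated with a toy renderer. Everything here is an assumption for illustration: the pianist ids, the per-pianist timing and velocity parameters, and the Gaussian perturbation stand in for what the paper's Transformer learns from data.

```python
# An illustrative sketch (NOT the paper's model) of conditioning an
# expressive-rendering sampler on a pianist identity. Each hypothetical
# pianist id maps to assumed deviation statistics; rendering perturbs
# score onsets and shifts velocities accordingly.
import random

# Hypothetical per-pianist (timing_std_sec, velocity_bias) parameters.
PIANIST_STYLE = {
    "pianist_a": (0.010, +6),   # tight timing, louder
    "pianist_b": (0.030, -4),   # freer timing, softer
}

def render(score_notes, pianist_id, seed=0):
    """score_notes: list of (onset_sec, midi_velocity) pairs."""
    rng = random.Random(seed)
    timing_std, vel_bias = PIANIST_STYLE[pianist_id]
    out = []
    for onset, vel in score_notes:
        onset += rng.gauss(0.0, timing_std)      # timing deviation
        vel = max(1, min(127, vel + vel_bias))   # velocity shaping, clamped
        out.append((onset, vel))
    return out

score = [(0.0, 64), (0.5, 64), (1.0, 64)]
print(render(score, "pianist_a"))
print(render(score, "pianist_b"))
```

The same mechanically flat score yields different expressive renderings depending on the identity passed in, which is the conditioning behaviour the abstract describes at a much smaller scale.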
Interactive real-time musical systems
This thesis focuses on the development of automatic accompaniment systems.
We investigate previous systems and look at a range of approaches
that have been attempted for the problem of beat tracking. Most beat
trackers are intended for the purposes of music information retrieval where
a `black box' approach is tested on a wide variety of music genres. We
highlight some of the difficulties facing offline beat trackers and design a
new approach for the problem of real-time drum tracking, developing a
system, B-Keeper, which makes reasonable assumptions on the nature of
the signal and is provided with useful prior knowledge.
Having developed the system with offline studio recordings, we look to
test the system with human players. Existing offline evaluation methods
seem less suitable for a performance system, since we also wish to evaluate
the interaction between musician and machine. Although statistical data
may reveal quantifiable measurements of the system's predictions and behaviour,
we also want to test how well it functions within the context of a
live performance. To do so, we devise an evaluation strategy to contrast
a machine-controlled accompaniment with one controlled by a human.
We also present recent work on real-time multiple pitch tracking,
which is then extended to provide automatic accompaniment for harmonic
instruments such as guitar. By aligning salient notes in the output from
a dual pitch tracking process, we make changes to the tempo of the
accompaniment in order to align it with a live stream. By demonstrating
the system's ability to align offline tracks, we can show that under
restricted initial conditions, the algorithm works well as an alignment tool.
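The tempo-adjustment step described above — changing the accompaniment's tempo so that its salient notes line up with the live stream — can be sketched as a simple proportional correction. The update rule and the gain value are assumptions for illustration, not the thesis's actual algorithm.

```python
# A minimal sketch of nudging an accompaniment tempo toward a live
# performer: when a salient live note arrives later than the
# accompaniment expected, the beat period is stretched (tempo drops);
# when it arrives earlier, the beat period shrinks (tempo rises).
# The proportional gain of 0.5 is an illustrative assumption.

def update_tempo(tempo_bpm, live_onset, acc_onset, gain=0.5):
    """Return an adjusted tempo in BPM given one onset-time discrepancy."""
    error = live_onset - acc_onset        # seconds; positive → live is behind
    beat = 60.0 / tempo_bpm               # current beat period in seconds
    new_beat = beat + gain * error        # stretch/shrink by part of the error
    return 60.0 / new_beat

# Live note arrives 100 ms after the accompaniment played its note:
print(update_tempo(120.0, live_onset=2.1, acc_onset=2.0))  # slows to ~109.09 BPM
```

Applying only a fraction of the observed error each time keeps the accompaniment from over-reacting to a single delayed note, which matters when the input is a noisy stream of detected onsets.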
Deep Learning Techniques for Music Generation -- A Survey
This paper is a survey and an analysis of different ways of using deep
learning (deep artificial neural networks) to generate musical content. We
propose a methodology based on five dimensions for our analysis:
Objective - What musical content is to be generated? Examples are: melody,
polyphony, accompaniment or counterpoint. - For what destination and for what
use? To be performed by a human(s) (in the case of a musical score), or by a
machine (in the case of an audio file).
Representation - What are the concepts to be manipulated? Examples are:
waveform, spectrogram, note, chord, meter and beat. - What format is to be
used? Examples are: MIDI, piano roll or text. - How will the representation be
encoded? Examples are: scalar, one-hot or many-hot.
Architecture - What type(s) of deep neural network is (are) to be used?
Examples are: feedforward network, recurrent network, autoencoder or generative
adversarial networks.
Challenge - What are the limitations and open challenges? Examples are:
variability, interactivity and creativity.
Strategy - How do we model and control the process of generation? Examples
are: single-step feedforward, iterative feedforward, sampling or input
manipulation.
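The encoding choices listed under "Representation" can be made concrete with a small sketch: a one-hot vector for one step of a monophonic melody versus a many-hot vector for one step of a polyphonic chord. The 12-element pitch-class vector is an illustrative assumption; real systems often use the full 128-pitch MIDI range.

```python
# One-hot vs many-hot encoding of a single time step, over a
# 12-semitone pitch-class vector (size chosen for illustration).

def one_hot(pitch_class, size=12):
    """Encode a single pitch class: exactly one active position."""
    vec = [0] * size
    vec[pitch_class] = 1
    return vec

def many_hot(pitch_classes, size=12):
    """Encode a chord: one active position per sounding pitch class."""
    vec = [0] * size
    for pc in pitch_classes:
        vec[pc] = 1
    return vec

print(one_hot(0))           # melody step: the single note C
print(many_hot([0, 4, 7]))  # polyphonic step: a C major triad
```

Stacking such vectors over time gives the piano-roll format the survey lists, while flattening them into event tokens instead gives the MIDI- or text-like formats.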
For each dimension, we conduct a comparative analysis of various models and
techniques and we propose some tentative multidimensional typology. This
typology is bottom-up, based on the analysis of many existing deep-learning
based systems for music generation selected from the relevant literature. These
systems are described and are used to exemplify the various choices of
objective, representation, architecture, challenge and strategy. The last
section includes some discussion and some prospects.
Comment: 209 pages. This paper is a simplified version of the book: J.-P. Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music Generation, Computational Synthesis and Creative Systems, Springer, 201