47 research outputs found

    Real-Time Audio-to-Score Alignment of Music Performances Containing Errors and Arbitrary Repeats and Skips

    Full text link
    This paper discusses real-time alignment of audio signals of music performance to the corresponding score (a.k.a. score following) which can handle tempo changes, errors and arbitrary repeats and/or skips (repeats/skips) in performances. This type of score following is particularly useful in automatic accompaniment for practices and rehearsals, where errors and repeats/skips are often made. Simple extensions of the algorithms previously proposed in the literature are not applicable in these situations for scores of practical length due to the problem of large computational complexity. To cope with this problem, we present two hidden Markov models of monophonic performance with errors and arbitrary repeats/skips, and derive efficient score-following algorithms with an assumption that the prior probability distributions of score positions before and after repeats/skips are independent from each other. We confirmed real-time operation of the algorithms with music scores of practical length (around 10000 notes) on a modern laptop and their tracking ability to the input performance within 0.7 s on average after repeats/skips in clarinet performance data. Further improvements and extension for polyphonic signals are also discussed.Comment: 12 pages, 8 figures, version accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processin

    Statistical Piano Reduction Controlling Performance Difficulty

    Get PDF
    We present a statistical-modelling method for piano reduction, i.e. converting an ensemble score into piano scores, that can control performance difficulty. While previous studies have focused on describing the condition for playable piano scores, it depends on player's skill and can change continuously with the tempo. We thus computationally quantify performance difficulty as well as musical fidelity to the original score, and formulate the problem as optimization of musical fidelity under constraints on difficulty values. First, performance difficulty measures are developed by means of probabilistic generative models for piano scores and the relation to the rate of performance errors is studied. Second, to describe musical fidelity, we construct a probabilistic model integrating a prior piano-score model and a model representing how ensemble scores are likely to be edited. An iterative optimization algorithm for piano reduction is developed based on statistical inference of the model. We confirm the effect of the iterative procedure; we find that subjective difficulty and musical fidelity monotonically increase with controlled difficulty values; and we show that incorporating sequential dependence of pitches and fingering motion in the piano-score model improves the quality of reduction scores in high-difficulty cases.Comment: 12 pages, 7 figures, version accepted to APSIPA Transactions on Signal and Information Processin

    MULTI-STEP CHORD SEQUENCE PREDICTION BASED ON AGGREGATED MULTI-SCALE ENCODER-DECODER NETWORKS

    Get PDF
    International audienceThis paper studies the prediction of chord progressions for jazz music by relying on machine learning models. The motivation of our study comes from the recent success of neu-ral networks for performing automatic music composition. Although high accuracies are obtained in single-step prediction scenarios, most models fail to generate accurate multi-step chord predictions. In this paper, we postulate that this comes from the multi-scale structure of musical information and propose new architectures based on an iterative temporal aggregation of input labels. Specifically, the input and ground truth labels are merged into increasingly large temporal bags, on which we train a family of encoder-decoder networks for each temporal scale. In a second step, we use these pre-trained encoder bottleneck features at each scale in order to train a final encoder-decoder network. Furthermore, we rely on different reductions of the initial chord alphabet into three adapted chord alphabets. We perform evaluations against several state-of-the-art models and show that our multi-scale architecture outperforms existing methods in terms of accuracy and perplexity, while requiring relatively few parameters. We analyze musical properties of the results, showing the influence of downbeat position within the analysis window on accuracy , and evaluate errors using a musically-informed distance metric
    corecore