Statistical Piano Reduction Controlling Performance Difficulty
We present a statistical-modelling method for piano reduction, i.e.
converting an ensemble score into a piano score, that can control performance
difficulty. While previous studies have focused on describing the conditions
under which a piano score is playable, playability depends on the player's
skill and can change continuously with tempo. We thus computationally quantify performance difficulty as well
as musical fidelity to the original score, and formulate the problem as
optimization of musical fidelity under constraints on difficulty values. First,
performance difficulty measures are developed by means of probabilistic
generative models for piano scores, and their relation to the rate of
performance errors is studied. Second, to describe musical fidelity, we construct a
probabilistic model integrating a prior piano-score model and a model
representing how ensemble scores are likely to be edited. An iterative
optimization algorithm for piano reduction is developed based on statistical
inference of the model. We confirm the effect of the iterative procedure; we
find that subjective difficulty and musical fidelity monotonically increase
with controlled difficulty values; and we show that incorporating sequential
dependence of pitches and fingering motion in the piano-score model improves
the quality of reduction scores in high-difficulty cases.

Comment: 12 pages, 7 figures, version accepted to APSIPA Transactions on Signal and Information Processing
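The fidelity-under-difficulty formulation described above can be illustrated with a toy sketch. The paper's actual method uses probabilistic models and iterative statistical inference; the greedy note selection, scores, and budget below are purely hypothetical stand-ins for the idea of maximizing fidelity under a difficulty constraint:

```python
# Toy sketch (NOT the paper's algorithm): each candidate note carries a
# fidelity gain and a difficulty cost; keep notes in order of
# fidelity-per-cost until a difficulty budget is exhausted.

def reduce_score(notes, difficulty_budget):
    """notes: list of (name, fidelity, difficulty); returns kept note names."""
    ranked = sorted(notes, key=lambda n: n[1] / n[2], reverse=True)
    kept, used = [], 0.0
    for name, fidelity, cost in ranked:
        if used + cost <= difficulty_budget:
            kept.append(name)
            used += cost
    return sorted(kept)

# Hypothetical candidate notes from an ensemble score.
notes = [("melody", 10.0, 2.0), ("bass", 6.0, 2.0),
         ("inner1", 3.0, 3.0), ("inner2", 1.0, 4.0)]
print(reduce_score(notes, difficulty_budget=5.0))  # → ['bass', 'melody']
```

Raising the budget admits more inner voices, mirroring how the controlled difficulty value trades playability against fidelity.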
Real-Time Audio-to-Score Alignment of Music Performances Containing Errors and Arbitrary Repeats and Skips
This paper discusses real-time alignment of audio signals of music
performance to the corresponding score (a.k.a. score following) which can
handle tempo changes, errors and arbitrary repeats and/or skips (repeats/skips)
in performances. This type of score following is particularly useful in
automatic accompaniment for practices and rehearsals, where errors and
repeats/skips are often made. Simple extensions of the algorithms previously
proposed in the literature are not applicable in these situations for scores of
practical length due to the problem of large computational complexity. To cope
with this problem, we present two hidden Markov models of monophonic
performance with errors and arbitrary repeats/skips, and derive efficient
score-following algorithms under the assumption that the prior probability
distributions of score positions before and after repeats/skips are independent
of each other. We confirmed that the algorithms run in real time for music
scores of practical length (around 10,000 notes) on a modern laptop and that
they recover tracking of the input performance within 0.7 s on average after
repeats/skips in clarinet performance data. Further improvements and an
extension to polyphonic signals are also discussed.

Comment: 12 pages, 8 figures, version accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing
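The core idea of HMM-based score following with error and repeat/skip handling can be sketched with a minimal forward filter. This is illustrative only (the paper derives two more efficient models); the transition and emission probabilities below are made-up values. The state is a score position; transitions favour advancing by one note but give every position a small "jump" mass, so arbitrary repeats and skips remain possible:

```python
# Minimal HMM score-following sketch (illustrative, not the paper's models).
# belief[i] = probability that the performer is at score position i.

def follow(score, performance, p_adv=0.8, p_stay=0.1, p_err=0.05):
    n = len(score)
    p_jump = (1.0 - p_adv - p_stay) / n        # uniform repeat/skip mass
    belief = [1.0 / n] * n
    path = []
    for note in performance:
        # Transition: stay, advance by one, or jump anywhere.
        new = [p_jump * sum(belief)] * n
        for i in range(n):
            new[i] += p_stay * belief[i]
            if i + 1 < n:
                new[i + 1] += p_adv * belief[i]
            else:
                new[i] += p_adv * belief[i]    # hold at the final note
        # Emission: the performed note may be a wrong-note error;
        # the error mass is split over the 11 other pitch classes (toy choice).
        for i in range(n):
            new[i] *= (1.0 - p_err) if score[i] == note else p_err / 11
        z = sum(new)
        belief = [b / z for b in new]
        path.append(max(range(n), key=lambda i: belief[i]))
    return path

score = [60, 62, 64, 65, 67, 69, 71, 72]      # MIDI pitches of the score
perf = [60, 62, 64, 60, 62, 64, 65]           # performer repeats from the start
print(follow(score, perf))                    # → [0, 1, 2, 0, 1, 2, 3]
```

The tracker re-locks onto position 0 immediately after the repeat because the jump mass keeps every score position reachable, which is the behaviour the abstract targets.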
Predicting performance difficulty from piano sheet music images
Estimating the performance difficulty of a musical score is crucial in music
education for adequately designing the learning curriculum of the students.
Although the Music Information Retrieval community has recently shown interest
in this task, existing approaches mainly use machine-readable scores, leaving
the broader case of sheet music images unaddressed. Following previous work
on sheet music images, we use a mid-level representation, the bootleg score,
which encodes notehead positions relative to the staff lines, coupled with a
transformer model. This architecture is adapted to our task by introducing an
encoding scheme that reduces the encoded sequence length to one-eighth of the
original size. For evaluation, we consider five datasets (more than
7,500 scores with up to 9 difficulty levels), two of them
compiled for this work. The results obtained when pretraining the scheme on the
IMSLP corpus and fine-tuning it on the considered datasets prove the proposal's
validity, with the best-performing model achieving a balanced accuracy of
40.34% and a mean squared error of 1.33. Finally, we provide access to our
code, data, and models for transparency and reproducibility.
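The two metrics reported above are worth unpacking: balanced accuracy averages per-class recall, which matters when difficulty levels are imbalanced, and the mean squared error penalizes predictions by how far they miss on the ordinal difficulty scale. A minimal illustration (the label values are invented for the example):

```python
# Balanced accuracy = mean of per-class recalls; MSE treats difficulty
# levels as ordinal values, so near-misses cost less than distant ones.

def balanced_accuracy(y_true, y_pred):
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(classes)

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

y_true = [1, 1, 1, 2, 3]   # difficulty levels, imbalanced toward level 1
y_pred = [1, 1, 2, 2, 2]
print(balanced_accuracy(y_true, y_pred))  # (2/3 + 1/1 + 0/1) / 3 ≈ 0.556
print(mse(y_true, y_pred))                # (0 + 0 + 1 + 0 + 1) / 5 = 0.4
```

Plain accuracy here would be 0.6, flattered by the majority class; balanced accuracy exposes the failure on the rare level 3.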
Multimodal music information processing and retrieval: survey and future challenges
To improve performance on various music information processing tasks, recent
studies exploit different modalities that capture diverse aspects of music.
Such modalities include audio recordings, symbolic music scores, mid-level
representations, motion and gestural data, video recordings, editorial or
cultural tags, lyrics, and album cover art. This paper critically reviews the
various approaches adopted in Music Information Processing and Retrieval and
highlights how multimodal algorithms can help Music Computing applications.
First, we categorize the related literature based on the applications
addressed. Subsequently, we analyze existing information fusion approaches,
and we conclude with the set of challenges that the Music Information
Retrieval and Sound and Music Computing research communities should focus on
in the coming years.
Automatic recognition of Persian musical modes in audio musical signals
This research proposes new approaches for the computational identification of Persian musical modes. It involves constructing a database of audio musical files and developing computer algorithms to perform a musical analysis of the samples. Essential features (the spectral average, chroma, and pitch histograms) and the use of symbolic data are discussed and compared. A tonic detection algorithm is developed to align the feature vectors and to make the mode recognition methods independent of changes in tonality. Subsequently, the similarity between a signal and a set of templates constructed in the training phase, in which data-driven patterns are made for each dastgàh (Persian mode), is gauged using a geometric distance measure such as the Manhattan distance (which is preferred), cross-correlation, or a machine learning method (Gaussian mixture models). The effects of the following parameters are considered and assessed: the amount of training data; the parts of the frequency range used for training; downsampling; tone resolution (12-TET, 24-TET, 48-TET and 53-TET); the use of overlapping or non-overlapping frames; and silence and high-energy suppression in pre-processing. The santur (a hammered string instrument), which is used extensively in the musical database samples, is described and its physical properties are characterised; its characteristic pitch and harmonic deviations are measured; and the inharmonicity factor of the instrument is calculated for the first time.
The results are applicable to Persian music and to other closely related musical traditions of the Mediterranean and the Near East. This approach enables content-based analyses and searches of musical archives. Potential applications of this research include: music information retrieval, audio thumbnailing, music archiving and access to archival content, audio compression and coding, association of images with audio content, music transcription, music synthesis, music editors, music instruction, automatic music accompaniment, and the setting of new standards and symbols for musical notation.
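The tonic-aligned template matching described in this abstract can be sketched as follows. This is an illustration of the general technique, not the thesis's actual system: the 12-bin histograms, template values, and mode names below are invented for the example (real dastgàh profiles would use finer tone resolutions such as 24-TET or 53-TET):

```python
# Illustrative mode classification: circularly shift a chroma histogram so
# the detected tonic sits in bin 0, then pick the mode template with the
# smallest Manhattan (L1) distance. Templates here are made-up toy profiles.

def rotate(hist, k):
    return hist[k:] + hist[:k]

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def classify(hist, tonic, templates):
    aligned = rotate(hist, tonic)            # tonic alignment step
    return min(templates, key=lambda name: manhattan(aligned, templates[name]))

templates = {
    "mode_a": [4, 0, 2, 1, 0, 3, 0, 2, 1, 0, 2, 0],
    "mode_b": [4, 1, 0, 2, 2, 0, 1, 3, 0, 1, 0, 1],
}
# A histogram whose tonic was detected at bin 3 by the tonic detector.
hist = [1, 0, 2, 4, 0, 2, 1, 0, 3, 0, 2, 1]
print(classify(hist, 3, templates))  # → mode_a
```

The alignment step is what makes the matching independent of tonality changes: without it, a transposed performance of the same mode would land far from its template.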
Automatic annotation of musical audio for interactive applications
PhD thesis.

As machines become increasingly portable and part of our everyday life, it becomes
apparent that developing interactive and ubiquitous systems is an important
aspect of new music applications created by the research community. We are interested
in developing a robust layer for the automatic annotation of audio signals, to
be used in various applications, from music search engines to interactive installations,
and in various contexts, from embedded devices to audio content servers. We
propose adaptations of existing signal processing techniques to a real time context.
Amongst these annotation techniques, we concentrate on low and mid-level tasks
such as onset detection, pitch tracking, tempo extraction and note modelling. We
present a framework to extract these annotations and evaluate the performance
of different algorithms.
The first task is to detect onsets and offsets in audio streams with short
latency. The segmentation of audio streams into temporal objects enables
various manipulations and analyses of metrical structure. Evaluations of
different algorithms and their adaptation to real time are described. We then
tackle the problem of fundamental frequency estimation, again trying to reduce
both the delay and the computational cost. Different algorithms are
implemented for real-time use and tested on monophonic recordings and complex
signals. Spectral analysis can be used to label the temporal segments; the
estimation of higher-level descriptions is also approached. Techniques for
modelling note objects and localising beats are implemented and discussed.
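The simplest member of the onset-detection family the thesis evaluates can be sketched as an energy-rise detector: frame the signal, compute per-frame energy, and flag an onset wherever the energy jump between consecutive frames exceeds a threshold. Real detectors use spectral features and adaptive thresholds; the frame size and threshold below are arbitrary toy values:

```python
# Minimal energy-based onset detector (toy version of the technique).
# Returns the indices of frames where energy rises sharply.

def detect_onsets(signal, frame_size, threshold):
    energies = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        energies.append(sum(x * x for x in frame))
    return [i for i in range(1, len(energies))
            if energies[i] - energies[i - 1] > threshold]

# Synthetic example: silence, then a loud burst, then a quiet decay.
signal = [0.0] * 8 + [1.0] * 8 + [0.2] * 8
print(detect_onsets(signal, frame_size=4, threshold=1.0))  # → [2]
```

Because only the rise is thresholded, the decay back to quiet frames is correctly ignored; the latency of such a detector is bounded by one frame, which is why frame size choice matters for the short-latency goal stated above.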
Applications of our framework include live and interactive music installations,
and more generally tools for composers and sound engineers. Speed optimisations
may bring a significant improvement to various automated tasks, such as
automatic classification and recommendation systems. We describe the design of
our software solution, for our research purposes and in view of its integration
within other systems.

EU-FP6-IST-507142 project SIMAC (Semantic Interaction with Music Audio
Contents); EPSRC grants GR/R54620 and GR/S75802/01.
Proceedings of the 7th Sound and Music Computing Conference
Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010