A Corpus-based Study Of Rhythm Patterns
We present a corpus-based study of musical rhythm, based on a collection of 4.8 million bar-length drum patterns extracted from 48,176 pieces of symbolic music. Approaches to the analysis of rhythm in music information retrieval to date have focussed on low-level features for retrieval or on the detection of tempo, beats and drums in audio recordings. Musicological approaches are usually concerned with the description or implementation of man-made music theories. In this paper, we present a quantitative bottom-up approach to the study of rhythm that relies upon well-understood statistical methods from natural language processing. We adapt these methods to our corpus of music, based on the realisation that, unlike words, bar-length drum patterns can be systematically decomposed into sub-patterns both in time and by instrument. We show that, in some respects, our rhythm corpus behaves like natural language corpora, particularly in the sparsity of vocabulary. The same methods that detect word collocations allow us to quantify and rank idiomatic combinations of drum patterns. In other respects, our corpus has properties absent from language corpora, in particular a high amount of repetition and strong mutual information rates between drum instruments. Our findings may be of direct interest to musicians and musicologists, and can inform the design of ground truth corpora and computational models of musical rhythm.
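As a rough illustration of how a collocation measure from natural language processing could rank combinations of drum patterns, the sketch below scores adjacent bar patterns by pointwise mutual information (PMI). PMI is only one common collocation measure and is an assumption here; the abstract does not say which statistic the authors use. The function name `pmi_scores` and the toy pattern labels are hypothetical.

```python
import math
from collections import Counter


def pmi_scores(sequence, min_count=1):
    """Return {(a, b): PMI in bits} for adjacent pairs in a sequence of labels.

    Assumption: each label stands for one bar-length drum pattern, and a
    "collocation" is a pair of patterns in consecutive bars.
    """
    if len(sequence) < 2:
        return {}
    unigrams = Counter(sequence)
    bigrams = Counter(zip(sequence, sequence[1:]))
    n_uni = len(sequence)
    n_bi = len(sequence) - 1
    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI estimates are unreliable
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy sequence of pattern labels (hypothetical data, not from the corpus).
bars = ["A", "A", "A", "B", "C", "A", "A", "B", "C", "A"]
scores = pmi_scores(bars)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In this toy sequence the pair `("B", "C")` ranks first because the two patterns almost always occur together, while the frequent repetition `("A", "A")` scores below zero: it occurs slightly less often than chance would predict given how common `A` is.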
Pop Music Highlighter: Marking the Emotion Keypoints
The goal of music highlight extraction is to get a short consecutive segment
of a piece of music that provides an effective representation of the whole
piece. In previous work, we introduced an attention-based convolutional
recurrent neural network that uses music emotion classification as a surrogate
task for music highlight extraction in pop songs. The rationale behind that
approach is that the highlight of a song is usually the most emotional part.
This paper extends our previous work in the following two aspects. First,
methodology-wise we experiment with a new architecture that does not need any
recurrent layers, making the training process faster. Moreover, we compare a
late-fusion variant and an early-fusion variant to study which one better
exploits the attention mechanism. Second, we conduct and report an extensive
set of experiments comparing the proposed attention-based methods against a
heuristic energy-based method, a structural repetition-based method, and a few
other simple feature-based methods for this task. Due to the lack of
public-domain labeled data for highlight extraction, following our previous
work we use the RWC POP 100-song data set to evaluate how the detected
highlights overlap with any chorus sections of the songs. The experiments
demonstrate the effectiveness of our methods over competing methods. For
reproducibility, we open source the code and pre-trained model at
https://github.com/remyhuang/pop-music-highlighter/.
Comment: Transactions of the ISMIR vol. 1, no.
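The evaluation idea described above, checking how a detected highlight overlaps with chorus sections, can be sketched as a simple interval computation. This is a minimal sketch under stated assumptions: the abstract does not define the exact overlap metric, so the ratio below (the fraction of the highlight covered by chorus) is an illustrative choice, and the function name `overlap_ratio` and the timestamps are hypothetical.

```python
def overlap_ratio(highlight, choruses):
    """Fraction of the highlight interval covered by chorus sections.

    highlight: (start, end) in seconds.
    choruses:  list of (start, end) intervals in seconds.
    Assumption: chorus intervals do not overlap one another.
    """
    start, end = highlight
    if end <= start:
        raise ValueError("highlight must have positive duration")
    covered = 0.0
    for c_start, c_end in choruses:
        # Length of the intersection of the two intervals, or 0 if disjoint.
        covered += max(0.0, min(end, c_end) - max(start, c_start))
    return covered / (end - start)


# Hypothetical example: a 30-second highlight and two annotated choruses.
ratio = overlap_ratio((60.0, 90.0), [(45.0, 75.0), (120.0, 150.0)])
```

Here the highlight shares 15 of its 30 seconds with the first chorus and none with the second, so the ratio is 0.5.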
The Skipping Behavior of Users of Music Streaming Services and its Relation to Musical Structure
The behavior of users of music streaming services is investigated from the
point of view of the temporal dimension of individual songs; specifically, the
main object of the analysis is the point in time within a song at which users
stop listening and start streaming another song ("skip"). The main contribution
of this study is the ascertainment of a correlation between the distribution in
time of skipping events and the musical structure of songs. It is also shown
that such distribution is not only specific to the individual songs, but also
independent of the cohort of users and, under stationary conditions, of the date of
observation. Finally, user behavioral data is used to train a predictor of the
musical structure of a song solely from its acoustic content; it is shown that
the use of such data, available in large quantities to music streaming
services, yields significant improvements in accuracy over the customary
fashion of training this class of algorithms, in which only smaller amounts of
hand-labeled data are available.
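To make the notion of a distribution in time of skipping events concrete, the sketch below bins skip timestamps for one song and measures what share of skips fall shortly after a section boundary. It is an illustrative sketch only: the bin size, the window length, the function names, and the data are assumptions, and the abstract does not describe the authors' actual analysis in this detail.

```python
from collections import Counter


def skip_distribution(skip_times, bin_seconds=5):
    """Normalised histogram of skip times, keyed by bin start in seconds."""
    if not skip_times:
        raise ValueError("skip_times must not be empty")
    counts = Counter(int(t // bin_seconds) for t in skip_times)
    total = len(skip_times)
    return {b * bin_seconds: c / total for b, c in sorted(counts.items())}


def fraction_near_boundaries(skip_times, boundaries, window=5.0):
    """Share of skips occurring within `window` seconds after a boundary."""
    if not skip_times:
        raise ValueError("skip_times must not be empty")
    near = sum(
        1 for t in skip_times
        if any(0 <= t - b < window for b in boundaries)
    )
    return near / len(skip_times)


# Hypothetical skip timestamps (seconds into the song) and section starts.
skips = [2.0, 3.5, 31.0, 32.0, 33.5, 47.0, 61.0, 62.5]
boundaries = [0.0, 30.0, 60.0]
dist = skip_distribution(skips)
near = fraction_near_boundaries(skips, boundaries)
```

With this toy data, seven of the eight skips fall within five seconds after a section start, which is the kind of concentration a correlation with musical structure would produce.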