Measuring the evolution of contemporary western popular music
Popular music is a key cultural expression that has captured listeners'
attention for ages. Many of the structural regularities underlying musical
discourse are yet to be discovered and, accordingly, their historical evolution
remains formally unknown. Here we unveil a number of patterns and metrics
characterizing the generic usage of primary musical facets such as pitch,
timbre, and loudness in contemporary western popular music. Many of these
patterns and metrics have been consistently stable for a period of more than
fifty years, thus pointing towards a great degree of conventionalism.
Nonetheless, we prove important changes or trends related to the restriction of
pitch transitions, the homogenization of the timbral palette, and the growing
loudness levels. This suggests that our perception of the new would be rooted
in these changing characteristics. Hence, an old tune could perfectly sound
novel and fashionable, provided that it consisted of common harmonic
progressions, changed the instrumentation, and increased the average loudness.
Comment: Supplementary materials not included. Please see the journal
reference or contact the author.
Revisiting the problem of audio-based hit song prediction using convolutional neural networks
Being able to predict whether a song can be a hit has important
applications in the music industry. Although it is true that the popularity of
a song can be greatly affected by external factors such as social and
commercial influences, the degree to which audio features computed from musical
signals (which we regard as internal factors) can predict song popularity is an
interesting research question in its own right. Motivated by the recent success of
deep learning techniques, we attempt to extend previous work on hit song
prediction by jointly learning the audio features and prediction models using
deep learning. Specifically, we experiment with a convolutional neural network
model that takes the primitive mel-spectrogram as the input for feature
learning, a more advanced JYnet model that uses an external song dataset for
supervised pre-training and auto-tagging, and the combination of these two
models. We also consider the inception model to characterize audio information
at different scales. Our experiments suggest that deep structures are
indeed more accurate than shallow structures in predicting the popularity of
either Chinese or Western Pop songs in Taiwan. We also use the tags predicted
by JYnet to gain insights into the result of different models.
Comment: To appear in the proceedings of 2017 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP).
Log-log Convexity of Type-Token Growth in Zipf's Systems
It is traditionally assumed that Zipf's law implies the power-law growth of
the number of different elements with the total number of elements in a system
- the so-called Heaps' law. We show that a careful definition of Zipf's law
leads to the violation of Heaps' law in random systems, and obtain alternative
growth curves. These curves fulfill universal data collapses that only depend
on the value of Zipf's exponent. We observe that real books behave very
much in the same way as random systems, despite the presence of burstiness in
word occurrence. We advance an explanation for this unexpected correspondence.
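As a rough illustration of the type-token growth discussed in this abstract, the following minimal sketch draws independent tokens from a Zipf-like rank distribution and records how the number of distinct types grows with the number of tokens. It is not the authors' procedure: the function names, the finite vocabulary size, the exponent value, and the seed are all assumptions chosen for illustration.

```python
import random


def sample_zipf_tokens(n_tokens, n_types, exponent, seed=0):
    """Draw n_tokens independent tokens; rank r has weight r ** (-exponent).

    Hypothetical helper: a finite vocabulary of n_types ranks is an assumption.
    """
    rng = random.Random(seed)
    ranks = range(1, n_types + 1)
    weights = [r ** -exponent for r in ranks]
    return rng.choices(ranks, weights=weights, k=n_tokens)


def type_token_growth(tokens):
    """Return the number of distinct types seen after each token."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


# Illustrative parameters only; none of these values come from the paper.
tokens = sample_zipf_tokens(n_tokens=10_000, n_types=5_000, exponent=1.0)
growth = type_token_growth(tokens)
```

Plotting the logarithm of `growth` against the logarithm of the token position is one way to inspect whether the curve bends rather than following a straight power-law line.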
The Ubuweb Electronic Music Corpus: An MIR investigation of a historical database
A corpus of historical electronic art music is available online from the UbuWeb art resource site. Though the corpus has some flaws in its historical and cultural coverage (not least of which is an over-abundance of male composers), it provides an interesting test ground for automated electronic music analysis, and one which is available to other researchers for reproducible work. We deploy open source tools for music information retrieval; the code from this project is made freely available under the GNU GPL 3 for others to explore. Key findings include the contrasting performance of single summary statistics for works versus time series models, visualisations of trends over chronological time in audio features, the difficulty of predicting which year a given piece is from, and further illumination of the possibilities and challenges of automated music analysis.
OK Computer Analysis: An Audio Corpus Study of Radiohead
The application of music information retrieval techniques in popular music studies has great promise. In the present work, a corpus of Radiohead songs across their career from 1992 to 2017 is subjected to automated audio analysis. We examine findings from a number of granularities and perspectives, including within-song and between-song examination of both timbral-rhythmic and harmonic features. Chronological changes include possible career-spanning effects for a band's releases such as slowing tempi and reduced brightness, and the timbral markers of Radiohead's expanding approach to instrumental resources most identified with the Kid A and Amnesiac era. We conclude with a discussion highlighting some challenges for this approach, and the potential for a field of audio-file-based career analysis.
Large-scale analysis of Zipf's law in English texts
Despite being a paradigm of quantitative linguistics, Zipf's law for words
suffers from three main problems: its formulation is ambiguous, its validity
has not been tested rigorously from a statistical point of view, and it has not
been confronted with a representatively large number of texts. So, we can
summarize the current support of Zipf's law in texts as anecdotal.
We try to solve these issues by studying three different versions of Zipf's
law and fitting them to all available English texts in the Project Gutenberg
database (consisting of more than 30000 texts). To do so we use state-of-the-art
tools in fitting and goodness-of-fit tests, carefully tailored to the
peculiarities of text statistics. Remarkably, one of the three versions of
Zipf's law, consisting of a pure power-law form in the complementary cumulative
distribution function of word frequencies, is able to fit more than 40% of the
texts in the database (at the 0.05 significance level), for the whole domain of
frequencies (from 1 to the maximum value) and with only one free parameter (the
exponent).
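To make the quantity in this abstract concrete, the sketch below computes the empirical complementary cumulative distribution of word frequencies for a text and a crude log-log slope. The least-squares slope is only a stand-in chosen for brevity; it is not the fitting or goodness-of-fit procedure used in the paper. The function names, the whitespace tokenization, and the sample sentence are all assumptions.

```python
import bisect
import math
from collections import Counter


def word_frequencies(text):
    """Count occurrences of each whitespace-separated, lowercased word."""
    return Counter(text.lower().split())


def empirical_ccdf(frequencies):
    """Return (n, S(n)) pairs, where S(n) is the fraction of word types
    whose frequency is at least n, for each distinct frequency n."""
    values = sorted(frequencies)
    total = len(values)
    points = []
    for n in sorted(set(values)):
        at_least_n = total - bisect.bisect_left(values, n)
        points.append((n, at_least_n / total))
    return points


def loglog_slope(points):
    """Ordinary least-squares slope of log S(n) against log n.

    A rough descriptive estimate only; needs at least two distinct n values.
    """
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(s) for _, s in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy input for illustration; a real study would use full book texts.
counts = word_frequencies("the cat and the dog and the bird")
ccdf = empirical_ccdf(counts.values())
slope = loglog_slope(ccdf)
```

A pure power law in the complementary cumulative distribution would appear as an approximately straight line in these log-log coordinates, with the slope related to the single free exponent.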