13,407 research outputs found

    The Temperament Police: The Truth, the Ground Truth, and Nothing but the Truth

    Get PDF
    The tuning system of a keyboard instrument is chosen so that frequently used musical intervals sound as consonant as possible. Temperament refers to the compromise arising from the fact that not all intervals can be maximally consonant simultaneously. Recent work showed that it is possible to estimate temperament from audio recordings with no prior knowledge of the musical score, using a conservative (high precision, low recall) automatic transcription algorithm followed by frequency estimation using quadratic interpolation and bias correction from the log magnitude spectrum. In this paper we develop a harpsichord-specific transcription system to analyse over 500 recordings of solo harpsichord music for which the temperament is specified on the CD sleeve notes. We compare the measured temperaments with the annotations and discuss the differences between temperament as a theoretical construct and as a practical issue for professional performers and tuners. The implications are that ground truth is not always scientific truth, and that content-based analysis has an important role in the study of historical performance practice. 1

    Automatic comparison of global children’s and adult songs supports a sensorimotor hypothesis for the origin of musical scales

    Get PDF
    Music throughout the world varies greatly, yet some musical features like scale structure display striking crosscultural similarities. Are there musical laws or biological constraints that underlie this diversity? The “vocal mistuning” hypothesis proposes that cross-cultural regularities in musical scales arise from imprecision in vocal tuning, while the integer-ratio hypothesis proposes that they arise from perceptual principles based on psychoacoustic consonance. In order to test these hypotheses, we conducted automatic comparative analysis of 100 children’s and adult songs from throughout the world. We found that children’s songs tend to have narrower melodic range, fewer scale degrees, and less precise intonation than adult songs, consistent with motor limitations due to their earlier developmental stage. On the other hand, adult and children’s songs share some common tuning intervals at small-integer ratios, particularly the perfect 5th (~3:2 ratio). These results suggest that some widespread aspects of musical scales may be caused by motor constraints, but also suggest that perceptual preferences for simple integer ratios might contribute to cross-cultural regularities in scale structure. We propose a “sensorimotor hypothesis” to unify these competing theories

    On the Mathematics of Music: From Chords to Fourier Analysis

    Full text link
    Mathematics is a far reaching discipline and its tools appear in many applications. In this paper we discuss its role in music and signal processing by revisiting the use of mathematics in algorithms that can extract chord information from recorded music. We begin with a light introduction to the theory of music and motivate the use of Fourier analysis in audio processing. We introduce the discrete and continuous Fourier transforms and investigate their use in extracting important information from audio data

    Timbre-invariant Audio Features for Style Analysis of Classical Music

    Get PDF
    Copyright: (c) 2014 Christof Weiß et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited

    Automated annotation of multimedia audio data with affective labels for information management

    Get PDF
    The emergence of digital multimedia systems is creating many new opportunities for rapid access to huge content archives. In order to fully exploit these information sources, the content must be annotated with significant features. An important aspect of human interpretation of multimedia data, which is often overlooked, is the affective dimension. Such information is a potentially useful component for content-based classification and retrieval. Much of the affective information of multimedia content is contained within the audio data stream. Emotional features can be defined in terms of arousal and valence levels. In this study low-level audio features are extracted to calculate arousal and valence levels of multimedia audio streams. These are then mapped onto a set of keywords with predetermined emotional interpretations. Experimental results illustrate the use of this system to assign affective annotation to multimedia data
    corecore