1,328 research outputs found

    Pitchclass2vec: Symbolic Music Structure Segmentation with Chord Embeddings

    Get PDF
    Structure perception is a fundamental aspect of music cognition in humans. Historically, the hierarchical organization of music into structures served as a narrative device for conveying meaning, creating expectancy, and evoking emotions in the listener. Thereby, musical structures play an essential role in music composition, as they shape the musical discourse through which the composer organises his ideas. In this paper, we present a novel music segmentation method, pitchclass2vec, based on symbolic chord annotations, which are embedded into continuous vector representations using both natural language processing techniques and custom-made encodings. Our algorithm is based on long-short term memory (LSTM) neural network and outperforms the state-of-the-art techniques based on symbolic chord annotations in the field

    Visualizing music structure using Spotify data

    Get PDF

    Automatic chord transcription from audio using computational models of musical context

    Get PDF
    PhDThis thesis is concerned with the automatic transcription of chords from audio, with an emphasis on modern popular music. Musical context such as the key and the structural segmentation aid the interpretation of chords in human beings. In this thesis we propose computational models that integrate such musical context into the automatic chord estimation process. We present a novel dynamic Bayesian network (DBN) which integrates models of metric position, key, chord, bass note and two beat-synchronous audio features (bass and treble chroma) into a single high-level musical context model. We simultaneously infer the most probable sequence of metric positions, keys, chords and bass notes via Viterbi inference. Several experiments with real world data show that adding context parameters results in a significant increase in chord recognition accuracy and faithfulness of chord segmentation. The proposed, most complex method transcribes chords with a state-of-the-art accuracy of 73% on the song collection used for the 2009 MIREX Chord Detection tasks. This method is used as a baseline method for two further enhancements. Firstly, we aim to improve chord confusion behaviour by modifying the audio front end processing. We compare the effect of learning chord profiles as Gaussian mixtures to the effect of using chromagrams generated from an approximate pitch transcription method. We show that using chromagrams from approximate transcription results in the most substantial increase in accuracy. The best method achieves 79% accuracy and significantly outperforms the state of the art. Secondly, we propose a method by which chromagram information is shared between repeated structural segments (such as verses) in a song. This can be done fully automatically using a novel structural segmentation algorithm tailored to this task. We show that the technique leads to a significant increase in accuracy and readability. The segmentation algorithm itself also obtains state-of-the-art results. A method that combines both of the above enhancements reaches an accuracy of 81%, a statistically significant improvement over the best result (74%) in the 2009 MIREX Chord Detection tasks.Engineering and Physical Research Council U

    An Issue between Contemporary Theory and Modern Compositional Practice A Study of Joseph Straus's Laws of Atonal Voice Leading and Harmony using Webern's Opus 12/2 and Crawford's String Quartet Mvt. 3

    Get PDF
    In his recent research project, music theorist Joseph Straus extends the traditional notions of smooth voice leading and the quality of harmony in tonal music to describe atonal voice leading and harmony. To achieve this goal, Straus proposes a theory called fuzzy transformations to analyze atonal music. Based on his findings, he further concludes the law of atonal voice leading and that of atonal harmony, which state that compositions, especially those in "more conservative styles," do obey these two laws. To test the validity of Straus's laws, I use Crawford's String Quartet Mvt. 3 and Webern's song op. 12/2 as case studies, examining the potential strengths and inherent weaknesses in Straus's fuzzy transformations, and further pointing out a conflict between music theory and compositional practice

    SCHUBOT: Machine Learning Tools for the Automated Analysis of Schubert’s Lieder

    Get PDF
    This paper compares various methods for automated musical analysis, applying machine learning techniques to gain insight about the Lieder (art songs) of com- poser Franz Schubert (1797-1828). Known as a rule-breaking, individualistic, and adventurous composer, Schubert produced hundreds of emotionally-charged songs that have challenged music theorists to this day. The algorithms presented in this paper analyze the harmonies, melodies, and texts of these songs. This paper begins with an exploration of the relevant music theory and ma- chine learning algorithms (Chapter 1), alongside a general discussion of the place Schubert holds within the world of music theory. The focus is then turned to automated harmonic analysis and hierarchical decomposition of MusicXML data, presenting new algorithms for phrase-based analysis in the context of past research (Chapter 2). Melodic analysis is then discussed (Chapter 3), using unsupervised clustering methods as a complement to harmonic analyses. This paper then seeks to analyze the texts Schubert chose for his songs in the context of the songs’ relevant musical features (Chapter 4), combining natural language processing with feature extraction to pinpoint trends in Schubert’s career

    Computational Tonality Estimation: Signal Processing and Hidden Markov Models

    Get PDF
    PhDThis thesis investigates computational musical tonality estimation from an audio signal. We present a hidden Markov model (HMM) in which relationships between chords and keys are expressed as probabilities of emitting observable chords from a hidden key sequence. The model is tested first using symbolic chord annotations as observations, and gives excellent global key recognition rates on a set of Beatles songs. The initial model is extended for audio input by using an existing chord recognition algorithm, which allows it to be tested on a much larger database. We show that a simple model of the upper partials in the signal improves percentage scores. We also present a variant of the HMM which has a continuous observation probability density, but show that the discrete version gives better performance. Then follows a detailed analysis of the effects on key estimation and computation time of changing the low level signal processing parameters. We find that much of the high frequency information can be omitted without loss of accuracy, and significant computational savings can be made by applying a threshold to the transform kernels. Results show that there is no single ideal set of parameters for all music, but that tuning the parameters can make a difference to accuracy. We discuss methods of evaluating more complex tonal changes than a single global key, and compare a metric that measures similarity to a ground truth to metrics that are rooted in music retrieval. We show that the two measures give different results, and so recommend that the choice of evaluation metric is determined by the intended application. Finally we draw together our conclusions and use them to suggest areas for continuation of this research, in the areas of tonality model development, feature extraction, evaluation methodology, and applications of computational tonality estimation.Engineering and Physical Sciences Research Council (EPSRC)

    AAM: a dataset of Artificial Audio Multitracks for diverse music information retrieval tasks

    Get PDF
    We present a new dataset of 3000 artificial music tracks with rich annotations based on real instrument samples and generated by algorithmic composition with respect to music theory. Our collection provides ground truth onset information and has several advantages compared to many available datasets. It can be used to compare and optimize algorithms for various music information retrieval tasks like music segmentation, instrument recognition, source separation, onset detection, key and chord recognition, or tempo estimation. As the audio is perfectly aligned to original MIDIs, all annotations (onsets, pitches, instruments, keys, tempos, chords, beats, and segment boundaries) are absolutely precise. Because of that, specific scenarios can be addressed, for instance, detection of segment boundaries with instrument and key change only, or onset detection only in tracks with drums and slow tempo. This allows for the exhaustive evaluation and identification of individual weak points of algorithms. In contrast to datasets with commercial music, all audio tracks are freely available, allowing for extraction of own audio features. All music pieces are stored as single instrument audio tracks and a mix track, so that different augmentations and DSP effects can be applied to extend training sets and create individual mixes, e.g., for deep neural networks. In three case studies, we show how different algorithms and neural network models can be analyzed and compared for music segmentation, instrument recognition, and onset detection. In future, the dataset can be easily extended under consideration of specific demands to the composition process
    corecore