
    Extended pipeline for content-based feature engineering in music genre recognition

    We present a feature engineering pipeline for constructing musical signal characteristics to be used in a supervised model for musical genre identification. The key idea is to extend the traditional two-step process of feature extraction and classification with additional stand-alone phases that are no longer organized in a strict waterfall scheme: the system allows backtracking and cycles between its stages. To obtain a compact and effective representation of the features, standard early temporal integration is combined with further selection and extraction phases: on the one hand, the selection of the most meaningful characteristics based on information gain, and on the other, the inclusion of the nonlinear correlations within this subset of features, determined by an autoencoder. The results of experiments conducted on the GTZAN dataset reveal a noticeable contribution of this methodology to the model's classification performance. Comment: ICASSP 201
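    A minimal sketch of the two feature-engineering stages the abstract names: information-gain-based selection followed by an autoencoder that contributes a nonlinear code of the selected subset. The dataset, dimensions, and hyperparameters below are illustrative assumptions, not the paper's actual configuration (mutual information stands in as the information-gain criterion).

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64)).astype(np.float32)  # e.g. temporally integrated frame features
y = rng.integers(0, 10, size=200)                  # e.g. 10 genre labels, as in GTZAN

# Stage 1: keep the most informative features (mutual information as the criterion).
selector = SelectKBest(mutual_info_classif, k=16).fit(X, y)
X_sel = selector.transform(X).astype(np.float32)

# Stage 2: learn a nonlinear code of the selected subset with a small autoencoder.
ae = nn.Sequential(nn.Linear(16, 8), nn.Tanh(), nn.Linear(8, 16))
opt = torch.optim.Adam(ae.parameters(), lr=1e-2)
xs = torch.from_numpy(X_sel)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(ae(xs), xs)
    loss.backward()
    opt.step()

# Concatenate the bottleneck code with the selected features for the classifier.
code = ae[:2](xs).detach().numpy()
X_final = np.hstack([X_sel, code])
```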

    Automatic Transcription of Bass Guitar Tracks applied for Music Genre Classification and Sound Synthesis

    Music recordings most often consist of multiple instrument signals that overlap in time and frequency. In the field of Music Information Retrieval (MIR), existing algorithms for the automatic transcription and analysis of music recordings aim to extract semantic information directly from these mixed audio signals. In recent years, it has frequently been observed that algorithm performance is limited by signal interference and the resulting loss of information. One common approach to this problem is to first apply source separation algorithms to isolate the individual instrument signals before analyzing them; however, the performance of source separation algorithms strongly depends on the number of instruments and on the amount of spectral overlap. This thesis therefore analyzes isolated instrumental tracks in order to circumvent the challenges of source separation, focusing on instrument-centered signal processing algorithms for music transcription, musical analysis, and sound synthesis. The electric bass guitar is chosen as the example instrument; its sound production principles are closely investigated and reflected in the algorithmic design.

    In the first part of the thesis, an automatic transcription algorithm for electric bass guitar recordings is presented. The audio signal is interpreted as a sequence of sound events described by various parameters. In addition to the conventional score-level parameters of note onset, duration, loudness, and pitch, instrument-specific parameters such as the applied playing techniques and the geometric position (string and fret) on the instrument fretboard are extracted. Evaluation experiments on two newly created audio datasets confirm that the proposed transcription algorithm outperforms three state-of-the-art bass transcription algorithms on realistic bass guitar recordings. The estimation of the instrument-level parameters is highly accurate, in particular for isolated note samples.

    In the second part, it is investigated whether analyzing only the bassline of a music piece allows its music genre to be classified automatically. Score-based audio features are proposed that quantify tonal, rhythmic, and structural properties of basslines. On a novel dataset of 520 bassline transcriptions from 13 different music genres, three approaches to music genre classification were compared; a rule-based classification system achieved a mean class accuracy of 64.8 % using only features extracted from the bassline of a piece.

    The re-synthesis of bass guitar recordings from the previously extracted note parameters is studied in the third part. Based on the physical modeling of string instruments, a novel sound synthesis algorithm tailored to the electric bass guitar is presented. The algorithm mimics different aspects of the instrument's sound production mechanism, such as string excitation, string damping, string-fret collision, and the influence of the electromagnetic pickup. Furthermore, a parametric audio coding approach is discussed that allows bass guitar tracks to be encoded and transmitted at a significantly smaller bit rate than conventional audio coding algorithms require. The results of several listening tests confirm that a higher perceptual quality is achieved when the original bass guitar recordings are encoded and re-synthesized with the proposed parametric audio codec than when they are encoded with conventional audio codecs at very low bit rate settings.
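    A minimal Karplus-Strong plucked-string sketch, illustrating the physical-modeling principle (string excitation plus frequency-dependent damping in a delay-line loop) that the synthesis part builds on. This is the textbook model, not the thesis' algorithm: string-fret collision and the pickup model are omitted, and all parameter values are assumptions.

```python
import numpy as np

def pluck(f0=41.2, fs=44100, dur=2.0, damping=0.996):
    """Synthesize a plucked-string tone (41.2 Hz = open E string of a bass)."""
    period = int(fs / f0)
    rng = np.random.default_rng(0)
    delay = rng.uniform(-1, 1, period)   # noise burst models the string excitation
    out = np.empty(int(fs * dur))
    for n in range(out.size):
        out[n] = delay[n % period]
        # two-point average acts as a low-pass loop filter (string damping)
        delay[n % period] = damping * 0.5 * (delay[n % period] + delay[(n + 1) % period])
    return out / np.max(np.abs(out))

tone = pluck()
```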

    The Role of a Polyrhythm’s Pitch Interval in Music-Dependent Memory

    When listening to music, humans can easily and often automatically assess the perceptual similarity of different moments in music. However, it is difficult to rigorously define exactly how similar we find two moments to be. This problem has driven inquiry in music cognition, musicology, and music theory alike, but previous results have depended on behaviorally mediated responses and/or recursive analytic strategies by music scholars. The present work employs the context-dependent memory paradigm as a novel way to investigate the extent to which listeners consider two musical examples similar. After incidentally learning words while listening to a 5:4 polyrhythm forming a perfect fifth, participants heard either no sound or the polyrhythm at a different pitch interval during a surprise test of recall. Between-subjects comparisons found no effect of the actual sound context at test on recall; however, participants who reported being in the same sound context recalled significantly more words than others. Interactions between actual and reported sound context were not accounted for by musical experience or other participant factors, and reported sound context was more often incompatible than compatible with actual sound context. Contributions to mental context theory and the boundaries of conclusions about musical features are discussed.
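    A hedged sketch of a stimulus like the one described: a 5:4 polyrhythm whose two pulse streams are carried by tones a perfect fifth apart (frequency ratio 3:2). The tone frequencies, pip duration, and envelope below are assumptions for illustration, not the study's actual stimulus parameters.

```python
import numpy as np

fs = 44100
cycle = 2.0                            # one polyrhythm cycle in seconds

def stream(n_pulses, freq):
    """One pulse stream: n_pulses short tone pips evenly spaced over the cycle."""
    sig = np.zeros(int(fs * cycle))
    t = np.arange(int(fs * 0.1)) / fs  # 100 ms tone pips
    pip = np.sin(2 * np.pi * freq * t) * np.hanning(t.size)
    for k in range(n_pulses):
        start = int(k * fs * cycle / n_pulses)
        sig[start:start + pip.size] += pip
    return sig

low = stream(4, 220.0)                 # 4 pulses per cycle on A3
high = stream(5, 330.0)                # 5 pulses a perfect fifth above (3:2 ratio)
polyrhythm = (low + high) / 2
```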

    Semantic annotation of digital music

    In recent times, digital music items on the internet have grown into a vast information space in which consumers try to locate the piece of music of their choice by means of search engines. The current approach of searching for music via consumers' keywords/tags does not provide satisfactory search results. It is argued that search and retrieval of music can be significantly improved if end-users' tags are associated with semantic information in the form of acoustic metadata, the latter being easy to extract automatically from digital music items. This paper presents a lightweight ontology that enables music producers to annotate music against an MPEG-7 description (with its acoustic metadata); the generated annotations may in turn be used to deliver meaningful search results. Several potential multimedia ontologies have been explored, and a music annotation ontology, named mpeg-7Music, has been designed so that it can serve as a backbone for annotating music items.
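    A minimal sketch of the kind of semantic annotation the paper describes: linking a consumer tag to MPEG-7-style acoustic metadata through an annotation ontology, here expressed with rdflib. The namespace URI and the class/property names are illustrative assumptions, not the actual mpeg-7Music vocabulary.

```python
from rdflib import Graph, Literal, Namespace, RDF, URIRef

M7M = Namespace("http://example.org/mpeg-7Music#")  # hypothetical namespace
g = Graph()
g.bind("m7m", M7M)

track = URIRef("http://example.org/tracks/42")
g.add((track, RDF.type, M7M.MusicItem))
g.add((track, M7M.hasTag, Literal("mellow")))       # end-user tag
g.add((track, M7M.audioTempo, Literal(92)))         # acoustic metadata
g.add((track, M7M.dominantKey, Literal("D minor")))

print(g.serialize(format="turtle"))
```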

    Automatic music genre classification

    A dissertation submitted to the Faculty of Science, University of the Witwatersrand, in fulfillment of the requirements for the degree of Master of Science, 2014. No abstract provided.

    Beat histogram features for rhythm-based musical genre classification using multiple novelty functions

    In this paper we present beat histogram features for multiple-level rhythm description and evaluate them in a musical genre classification task. Audio features pertaining to various musical content categories and their related novelty functions are extracted as a basis for the creation of beat histograms. The proposed features capture not only amplitude changes but also tonal and general spectral changes in the signal, aiming to represent as much rhythmic information as possible. The most and least informative features are identified through feature selection methods and are then tested with Support Vector Machines on five genre datasets, measuring classification accuracy against a baseline feature set. Results show that the presented features provide classification accuracy comparable to other genre classification approaches using periodicity histograms, and perform close to much more elaborate state-of-the-art approaches for rhythm description. The use of bar boundary annotations for the texture frames provided an improvement on the dance-oriented Ballroom dataset. The comparatively small number of descriptors and the possibility of evaluating the influence of specific signal components on the overall rhythmic content encourage further use of the method in rhythm description tasks.
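    A minimal sketch of building a beat histogram from a novelty function, the core idea behind the features above: periodicities in the novelty curve are detected via autocorrelation and its peaks are accumulated into tempo bins. The synthetic novelty function and bin layout are assumptions for illustration, not the paper's feature set.

```python
import numpy as np

fs_nov = 100.0                                   # novelty-function sample rate (Hz)
t = np.arange(0, 30, 1 / fs_nov)
novelty = np.clip(np.sin(2 * np.pi * 2.0 * t), 0, None)  # synthetic 120 BPM onset curve

# Autocorrelate the mean-removed novelty function to expose periodicities.
nov = novelty - novelty.mean()
ac = np.correlate(nov, nov, mode="full")[nov.size - 1:]

# Pick local autocorrelation maxima and accumulate them into tempo (BPM) bins.
lags = np.arange(2, int(2.0 * fs_nov))           # lags up to 2 s, i.e. down to 30 BPM
peaks = lags[(ac[lags] > ac[lags - 1]) & (ac[lags] > ac[lags + 1]) & (ac[lags] > 0)]
hist_bins = np.arange(30, 241, 10)               # 30-240 BPM in 10-BPM bins
beat_hist, _ = np.histogram(60.0 * fs_nov / peaks, bins=hist_bins, weights=ac[peaks])

print(f"dominant tempo bin starts at {hist_bins[np.argmax(beat_hist)]} BPM")
```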
