    Features for the classification and clustering of music in symbolic format

    Master's thesis, Informatics Engineering, Universidade de Lisboa, Faculdade de Ciências, 2008. This document describes the work carried out for the Computing Engineering Project course of the Master's in Computer Engineering at the Faculty of Sciences of the University of Lisbon. Music Information Retrieval is nowadays a highly active branch of research and development in computer science, and covers several topics, including music genre classification. The work presented here focuses on track and genre classification of music stored in the MIDI format. To address the problem of MIDI track classification, we extract a set of descriptors that are used to train a classifier implemented as a neural network, based on the pitches and durations that describe each track. Tracks are classified into four classes: Melody, Harmony, Bass and Drums. To characterize the musical content of each track, a vector of numeric descriptors, commonly known as a shallow structure description, is extracted. These vectors are then used as inputs to the classifier, which was implemented in the Matlab environment. In the genre classification task, two approaches are used: language modeling, in which a transition probability matrix is created for each type of track (Melody, Harmony, Bass and Drums) and for each genre; and neural networks, in which a vector of numeric descriptors is extracted from each track and fed to a neural network classifier. Six MIDI music corpora, from six different genres (Blues, Country, Jazz, Metal, Punk and Rock), were assembled for the experiments. These genres were selected because they mostly share the same base instruments, such as bass, drums, piano or guitar, and because they share some characteristics with one another, so that the classification is not trivial and the robustness of the classifiers is tested. Track classification experiments were run first with all descriptors and then with only the best descriptors, showing that using all descriptors is the wrong approach, since some descriptors confuse the classifier. Carefully selected descriptors proved to be the best way to classify these MIDI tracks. Genre classification experiments showed that the single-instrument classifiers achieved the best results, with four genres (Jazz, Country, Metal and Punk) reaching success rates above 80%. Future work includes: genetic algorithms for selecting the best descriptors; structuring tracks and songs; and merging all the classifiers developed into a single automatic genre classification system.
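    As a rough illustration of the pipeline this abstract describes, the sketch below computes a shallow-structure descriptor vector from a track's pitches and durations and trains a small neural network on it. The thesis implemented its classifier in Matlab; scikit-learn's MLPClassifier, the (pitch, duration) note representation and the toy data are stand-in assumptions, not the thesis's actual descriptors.

```python
# Minimal sketch of a shallow-structure descriptor vector for a MIDI track,
# assuming each track is a list of (pitch, duration_in_beats) note events.
# scikit-learn's MLPClassifier is a stand-in for the thesis's Matlab network.
import numpy as np
from sklearn.neural_network import MLPClassifier

def track_descriptors(notes):
    """Numeric descriptors computed from a track's pitches and durations."""
    pitches = np.array([p for p, _ in notes], dtype=float)
    durs = np.array([d for _, d in notes], dtype=float)
    return np.array([
        pitches.mean(), pitches.std(), pitches.max() - pitches.min(),
        durs.mean(), durs.std(), len(notes),
    ])

# Hypothetical toy data: two tracks per class (0 = Melody, 1 = Bass).
tracks = [[(72, 0.5), (74, 0.5), (76, 1.0)], [(71, 0.25), (79, 0.5), (74, 0.25)],
          [(40, 1.0), (43, 1.0), (45, 2.0)], [(38, 2.0), (41, 1.0), (36, 1.0)]]
labels = [0, 0, 1, 1]

X = np.vstack([track_descriptors(t) for t in tracks])
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X, labels)
print(clf.predict(track_descriptors([(70, 0.5), (72, 0.5), (69, 1.0)]).reshape(1, -1)))
```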

    Automatic musical instrument recognition for multimedia indexing

    Work presented under the Master's in Computer Engineering, as a partial requirement for obtaining the degree of Master in Computer Engineering. The subject of automatic indexing of multimedia has been the target of much discussion and study. This interest is due to the exponential growth of multimedia content and the consequent need for methods that automatically catalogue this data. To that end, several projects and areas of study have emerged. The most relevant of these are the MPEG-7 standard, which defines a standardized system for the representation and automatic extraction of information present in the content, and Music Information Retrieval (MIR), which gathers several paradigms and areas of study relating to music. The main approach to this indexing problem relies on analysing the data to obtain and identify descriptors that help define what we intend to recognize (for instance, musical instruments, voice, facial expressions, and so on); this then provides the information used to index the data. This dissertation focuses on audio indexing in music, specifically the recognition of musical instruments from recorded musical notes. The developed system and techniques are also tested on the recognition of ambient sounds (such as running water, cars driving by, and so on). Our approach uses non-negative matrix factorization to extract features from various types of sounds; these are then used to train a classification algorithm capable of identifying new sounds.
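    The following is a minimal sketch of the general technique the abstract names: non-negative matrix factorization of a magnitude spectrogram into spectral templates and activations. librosa, scikit-learn, the file name and the feature choice are assumptions, not the dissertation's actual pipeline.

```python
# Hedged sketch: NMF of a magnitude spectrogram, the general technique named
# in the abstract. The file name is hypothetical.
import numpy as np
import librosa
from sklearn.decomposition import NMF

y, sr = librosa.load("note.wav")          # hypothetical recording of one note
S = np.abs(librosa.stft(y))               # magnitude spectrogram (freq x time)

# Factor S ~= W @ H: W holds spectral templates, H their time activations.
model = NMF(n_components=4, init="nndsvd", max_iter=500, random_state=0)
W = model.fit_transform(S)                # (freq bins x components)
H = model.components_                     # (components x frames)

# One simple fixed-length feature vector: the spectral templates, flattened.
# Vectors like this would then train a classifier over instrument labels.
features = W.flatten()
print(features.shape)
```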

    Augmentation Methods on Monophonic Audio for Instrument Classification in Polyphonic Music

    Instrument classification is one of the fields in Music Information Retrieval (MIR) that has attracted a lot of research interest. However, most of that work deals with monophonic music, while efforts on polyphonic material mainly focus on predominant instrument recognition. In this paper, we propose an approach for instrument classification in polyphonic music trained purely on monophonic data, which performs data augmentation by mixing different audio segments. A variety of augmentation techniques focusing on different sonic aspects are explored, such as overlaying audio segments of the same genre as well as pitch- and tempo-based synchronization. We use Convolutional Neural Networks for the classification task, comparing shallow and deep network architectures. We further investigate combining these classifiers, each trained on a single augmented dataset. An ensemble of VGG-like classifiers, trained on non-augmented, pitch-synchronized, tempo-synchronized and genre-similar excerpts respectively, yields the best results, achieving slightly above 80% label ranking average precision (LRAP) on the IRMAS test set.
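    A hedged sketch of the core augmentation idea, mixing two monophonic excerpts with an optional pitch shift, is shown below. librosa, the file names and the peak normalization are assumptions; the paper's exact synchronization procedure may differ.

```python
# Sketch of mixing-based augmentation: overlay two monophonic excerpts,
# optionally pitch-shifting one first. File names are hypothetical.
import numpy as np
import librosa

def mix_segments(y_a, y_b, sr, semitone_shift=0):
    """Overlay two same-rate signals; shift y_b by semitone_shift first."""
    if semitone_shift:
        y_b = librosa.effects.pitch_shift(y_b, sr=sr, n_steps=semitone_shift)
    n = min(len(y_a), len(y_b))            # trim to common length
    mix = y_a[:n] + y_b[:n]
    return mix / np.max(np.abs(mix))       # peak-normalize the mixture

y1, sr = librosa.load("guitar.wav")        # hypothetical monophonic excerpts
y2, _ = librosa.load("flute.wav", sr=sr)
augmented = mix_segments(y1, y2, sr, semitone_shift=2)
```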

    Playing with Cases: Rendering Expressive Music with Case-Based Reasoning

    This article surveys long-term research on the problem of rendering expressive music by means of AI techniques, with an emphasis on case-based reasoning (CBR). Following a brief overview discussing why people prefer listening to expressive music rather than non-expressive synthesized music, we examine a representative selection of well-known approaches to expressive computer music performance, with an emphasis on AI-related approaches. In the main part of the article we focus on existing CBR approaches to the problem of synthesizing expressive music, and particularly on Tempo-Express, a case-based reasoning system developed at our institute for applying musically acceptable tempo transformations to monophonic audio recordings of musical performances. Finally, we briefly describe an ongoing extension of our previous work that complements audio information with information about the gestures of the musician. Music is played through our bodies, so capturing the gesture of the performer is a fundamental aspect that has to be taken into account in future expressive music rendering. This article is based on the "2011 Robert S. Engelmore Memorial Lecture" given by the first author at AAAI/IAAI 2011. This research is partially supported by the Ministry of Science and Innovation of Spain under the project NEXT-CBR (TIN2009-13692-C03-01) and the Generalitat de Catalunya AGAUR Grant 2009-SGR-1434. Peer Reviewed
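    For readers unfamiliar with CBR, the toy sketch below shows the retrieve-and-reuse step in the spirit of a tempo-transformation task. The case representation and distance measure are invented for illustration; this is not Tempo-Express.

```python
# Schematic retrieve-and-reuse step of case-based reasoning. Each case pairs
# a made-up melodic-context feature vector with the tempo curve (per-note
# tempo multipliers) a performer applied in that context.
import numpy as np

case_base = [
    (np.array([0.2, 0.8]), np.array([1.00, 1.05, 0.95])),
    (np.array([0.9, 0.1]), np.array([1.10, 0.90, 1.00])),
]

def retrieve_and_reuse(problem_features):
    """Return the tempo curve of the nearest stored case."""
    dists = [np.linalg.norm(problem_features - f) for f, _ in case_base]
    return case_base[int(np.argmin(dists))][1]

print(retrieve_and_reuse(np.array([0.3, 0.7])))   # -> curve of the first case
```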

    An Exploration of Monophonic Instrument Classification Using Multi-Threaded Artificial Neural Networks

    The use of computers for automated music analysis could benefit several areas of academia and industry, from psychological and music research to intelligent music selection and music copyright investigation. In the following thesis, one of the first steps of automated musical analysis, i.e., monophonic instrument recognition, was explored. A multi-threaded artificial neural network was implemented and used as the classifier in order to exploit multi-core technology and allow for faster training. The parallelized batch-mode backpropagation algorithm provided linear speedup, an improvement on the current literature. For the classification experiments, eleven different sets of instruments were used, starting with perceptively dissimilar instruments (e.g., bass vs. trumpet) and moving towards more similar-sounding instruments (e.g., violin vs. viola; oboe vs. bassoon; xylophone vs. vibraphone). From the 70 musical features originally extracted from each audio sample, a sequential forward selection algorithm was employed to select only the most salient features that best differentiate the instruments in question. Using twenty runs for each set of instruments (i.e., 10 sets of a 50/50 cross-validation training paradigm), the test results were promising, with mean classification rates ranging from 76% to 96% and many individual runs reaching a perfect 100% score. This thesis confirms multi-threaded artificial neural networks as a viable classifier for single-instrument recognition of perceptively similar-sounding instruments.
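    The sketch below shows a minimal sequential forward selection loop of the kind described: greedily add the feature that most improves cross-validated accuracy. scikit-learn's MLPClassifier stands in for the thesis's multi-threaded network, and the toy data is invented.

```python
# Minimal sequential forward selection, assuming features in X and labels y.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

def forward_select(X, y, k):
    """Greedily pick k features by cross-validated accuracy."""
    chosen = []
    while len(chosen) < k:
        best_feat, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            score = cross_val_score(
                MLPClassifier(hidden_layer_sizes=(16,), max_iter=500),
                X[:, chosen + [j]], y, cv=2).mean()
            if score > best_score:
                best_feat, best_score = j, score
        chosen.append(best_feat)
    return chosen

rng = np.random.default_rng(0)             # toy stand-in for the 70 features
X = rng.normal(size=(40, 6)); y = (X[:, 2] > 0).astype(int)
print(forward_select(X, y, k=2))           # should tend to pick feature 2
```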

    Acoustic features of piano sounds

    To date, efforts in music transcription indicate the need to model the data signal in a more comprehensive manner in order to improve the transcription of music performances. This research work investigates two features associated with the reproduced sound of a piano: the inharmonicity factor of the piano strings and the double decay rate of the resulting sound. Firstly, a simple model of the inharmonicity is proposed and the factors that affect the modelled signal are identified, such as the magnitude of the inharmonicity, the number of harmonics, the time parameter, the phase characteristics and the harmonic amplitudes. A so-called "one-sided" effect appears in simulated signals, although this effect is obscured in real recordings, potentially due to the non-uniformly varying amplitudes of the harmonic terms. This effect is also discussed through the use of the cepstrum, by analysing real piano note recordings and synthesized signals. The cepstrum is further used to describe the effect of the coupled behaviour of two strings through digital waveguides. Secondly, the double decay rate effect is modelled through coupled oscillators and digital waveguides. A physical model of multiple strings is also presented as an extension of the simple coupled-oscillator model, and various measurements on a real grand piano are carried out to investigate the coupling mechanism between the strings, the soundboard and the bridge. Finally, a reduced-dimensionality model is proposed to represent the signal model for single and multiple notes, formulated within a Bayesian framework. The potential of such a model is illustrated by transcribing simple real monophonic and polyphonic piano recordings, using the Metropolis-Hastings algorithm and the Gibbs sampler for multivariate parameter estimation.
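    The stiff-string inharmonicity relation underlying the first part of the work is compact enough to sketch: partial n of a string with fundamental f0 and inharmonicity coefficient B lies near f_n = n * f0 * sqrt(1 + B * n^2). The synthesis below uses that standard relation; the amplitudes and decay rates are made-up assumptions, not the thesis's measured values.

```python
# Synthesize a piano-like tone with stretched partials:
# f_n = n * f0 * sqrt(1 + B * n**2), B being the inharmonicity coefficient.
import numpy as np

sr, f0, B = 44100, 220.0, 1e-4             # sample rate, fundamental, inharmonicity
t = np.arange(int(0.5 * sr)) / sr          # half a second of samples

signal = np.zeros_like(t)
for n in range(1, 13):                     # first 12 partials
    f_n = n * f0 * np.sqrt(1 + B * n**2)   # stretched partial frequency
    signal += (1.0 / n) * np.exp(-3.0 * n * t) * np.sin(2 * np.pi * f_n * t)

signal /= np.max(np.abs(signal))           # normalize before writing/analysis
```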