Search CORE

8 research outputs found

Automatic detection of outliers in world music collections

Author: Benetos E
Dixon S
Panteli M
Publication venue: Fourth International Conference on Analytical Approaches to World Music (AAWM 2016)
Publication date: 27/03/2016

AI and Tempo Estimation: A Review

Author: Luck Geoff
Publication date: 30/12/2023

The author's goal in this paper is to explore how artificial intelligence (AI) has been utilised to inform our understanding of and ability to estimate at scale a critical aspect of musical creativity - musical tempo. The central importance of tempo to musical creativity can be seen in how it is used to express specific emotions (Eerola and Vuoskoski 2013), suggest particular musical styles (Li and Chan 2011), influence perception of expression (Webster and Weir 2005) and mediate the urge to move one's body in time to the music (Burger et al. 2014). Traditional tempo estimation methods typically detect signal periodicities that reflect the underlying rhythmic structure of the music, often using some form of autocorrelation of the amplitude envelope (Lartillot and Toiviainen 2007). Recently, AI-based methods utilising convolutional or recurrent neural networks (CNNs, RNNs) on spectral representations of the audio signal have enjoyed significant improvements in accuracy (Aarabi and Peeters 2022). Common AI-based techniques include those based on probability (e.g., Bayesian approaches, hidden Markov models (HMM)), classification and statistical learning (e.g., support vector machines (SVM)), and artificial neural networks (ANNs) (e.g., self-organising maps (SOMs), CNNs, RNNs, deep learning (DL)). The aim here is to provide an overview of some of the more common AI-based tempo estimation algorithms and to shine a light on notable benefits and potential drawbacks of each. Limitations of AI in this field in general are also considered, as is the capacity for such methods to account for idiosyncrasies inherent in tempo perception, i.e., how well AI-based approaches are able to think and act like humans.
Comment: 9 page
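
The abstract above mentions that traditional tempo estimators often rely on autocorrelation of the amplitude envelope. The following is a minimal pure-Python sketch of that general idea, not the method of any paper cited in the abstract. The function names `estimate_tempo` and `synthetic_envelope`, the 40 to 200 BPM search range, and the idealised click-track envelope are all assumptions made for illustration.

```python
def estimate_tempo(envelope, frame_rate, bpm_min=40.0, bpm_max=200.0):
    """Return the tempo in BPM whose beat period has the strongest autocorrelation.

    envelope   : amplitude-envelope samples (hypothetical input, one per frame)
    frame_rate : envelope frames per second
    The BPM search range is an illustrative assumption.
    """
    n = len(envelope)
    mean = sum(envelope) / n
    x = [v - mean for v in envelope]  # remove the DC offset

    # Convert the BPM range into a range of lags measured in frames.
    lag_min = max(1, int(round(frame_rate * 60.0 / bpm_max)))
    lag_max = min(n - 1, int(round(frame_rate * 60.0 / bpm_min)))

    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Raw (unnormalised) autocorrelation sum; it mildly favours shorter
        # lags, which helps avoid picking a multiple of the beat period.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return 60.0 * frame_rate / best_lag


def synthetic_envelope(bpm, frame_rate, seconds):
    """Build an idealised envelope: a unit impulse on every beat."""
    n = int(frame_rate * seconds)
    period = frame_rate * 60.0 / bpm  # frames per beat
    env = [0.0] * n
    k = 0
    while int(round(k * period)) < n:
        env[int(round(k * period))] = 1.0
        k += 1
    return env


if __name__ == "__main__":
    env = synthetic_envelope(bpm=120, frame_rate=100, seconds=10)
    print(estimate_tempo(env, frame_rate=100))
```

On real audio the envelope would first have to be extracted from the signal, and the strongest lag can correspond to half or double the perceived tempo, which is one of the idiosyncrasies of tempo perception the abstract alludes to.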

arXiv.org e-Print Archive


A computational study on outliers in world music

Author: A Flexer
A Holzapfel
A Honingh
A Livshin
A Lomax
B Nettl
BL Sturm
C Guastavino
C Panagiotakis
Chun-Hsi Huang
CM Bishop
CT Lu
D Bountouridis
D Chen
D Clarke
D Schnitzer
DMW Powers
E Gómez
Emmanouil Benetos
F Pachet
G Tzanetakis
H Lee
I Ben-Gal
J Salamon
J Serrà
JJ Aucouturier
JP Bello
JS Downie
JT Titon
L Sun
M Mauch
M Müller
M Schedl
MA Bartsch
MA Schmuckler
Maria Panteli
N Kroher
P Casas
P Filzmoser
P Toiviainen
PE Savage
PJ Rousseeuw
PV Bohlman
R Typke
S Abdallah
S Bhattacharyya
S Brown
S Le Bomin
S McAdams
S Sadie
SC Johnson
SE Trehub
Simon Dixon
T Collins
T Rzeszutek
TH Grubesic
V Hodge
Y Lu
Z Fu
Ò Celma
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2017

The comparative analysis of world music cultures has been the focus of several ethnomusicological studies in the last century. With the advances of Music Information Retrieval and the increased accessibility of sound archives, large-scale analysis of world music with computational tools is today feasible. We investigate music similarity in a corpus of 8200 recordings of folk and traditional music from 137 countries around the world. In particular, we aim to identify music recordings that are most distinct compared to the rest of our corpus. We refer to these recordings as ‘outliers’. We use signal processing tools to extract music information from audio recordings, data mining to quantify similarity and detect outliers, and spatial statistics to account for geographical correlation. Our findings suggest that Botswana is the country with the most distinct recordings in the corpus and China is the country with the most distinct recordings when considering spatial correlation. Our analysis includes a comparison of musical attributes and styles that contribute to the ‘uniqueness’ of the music of each country.

City Research Online

Crossref

Directory of Open Access Journals

Queen Mary Research Online


Roadmap for Music Information ReSearch

Author: Benetos E.
Chudy M.
Dixon S.
Flexer A.
Gomez E.
Gouyon F.
Herrera P.
Jorda S.
Magas M.
Paytuvi O.
Peeters G.
Schlüter J.
Serra X.
Vinet H.
Widmer G.
Publication venue: MIRES Consortium
Publication date: 01/01/2013

City Research Online

UPF Digital Repository

Final Research Report on Auto-Tagging of Music

Author: Cohen-Hadria Alice
Cornu Frédéric
Fourer Dominique
Hofmann Robin
Laffitte Pierre
Marchand Ugo
Mignot Rémi
Peeters Geoffroy
Schindler Daniel
Schwarz Diemo
Spadaveccia Rino
Publication date: 12/12/2018

The deliverable D4.7 concerns the work achieved by IRCAM until M36 for the “auto-tagging of music”. The deliverable is a research report. The software libraries resulting from the research have been integrated into Fincons/HearDis! Music Library Manager or are used by TU Berlin. The final software libraries are described in D4.5. The research work on auto-tagging has concentrated on four aspects:
1) Further improving IRCAM’s machine-learning system ircamclass. This has been done by developing the new MASSS audio features and by integrating audio augmentation and audio segmentation into ircamclass. The system has then been applied to train HearDis! “soft” features (Vocals-1, Vocals-2, Pop-Appeal, Intensity, Instrumentation, Timbre, Genre, Style). This is described in Part 3.
2) Developing two sets of “hard” features (i.e. related to musical or musicological concepts) as specified by HearDis! (for integration into Fincons/HearDis! Music Library Manager) and TU Berlin (as input for the prediction model of the GMBI attributes). Such features are either derived from previously estimated higher-level concepts (such as structure, key or succession of chords) or obtained by developing new signal processing algorithms (such as HPSS) or main melody estimation. This is described in Part 4.
3) Developing audio features to characterize the audio quality of a music track. The goal is to describe the quality of the audio independently of its apparent encoding. This is then used to estimate audio degradation or music decade, and is intended to ensure that playlists contain tracks with similar audio quality. This is described in Part 5.
4) Developing innovative algorithms to extract specific audio features to improve music mixes. So far, innovative techniques (based on various Blind Audio Source Separation algorithms and Convolutional Neural Networks) have been developed for singing voice separation, singing voice segmentation, music structure boundaries estimation, and DJ cue-region estimation. This is described in Part 6.
EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC D

DepositOnce

Automatic Music Information Retrieval with Emphasis on Rhythm (Αυτόματη Ανάκτηση Μουσικής Πληροφορίας με Έμφαση στο Ρυθμό)

Author: Γκιόκας Άγγελος
Publication date: 07/03/2017

DSpace at NTUA

Scale transform in rhythmic similarity of music

Author: Holzapfel André
Stylianou Yannis
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

As a special case of the Mellin transform, the scale transform has been applied in various signal processing areas, in order to get a signal description that is invariant to scale changes. In this paper, the scale transform is applied to autocorrelation sequences derived from music signals. It is shown that two such sequences, when derived from similar rhythms with different tempo, differ mainly by a scaling factor. By using the scale transform, the proposed descriptors are robust to tempo changes, and are specially suited for the comparison of pieces with different tempi but similar rhythm. As music with such characteristics is widely encountered in traditional forms of music, the performance of the descriptors in a classification task of Greek traditional dances and Turkish traditional songs is evaluated. On these datasets accuracies compared to non-tempo robust approaches are improved by more than 20%, while on a dataset of Western music the achieved accuracy improves compared to previously presented results.

Publikationer från KTH

CiteSeerX

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line