5,886 research outputs found
Joint Multi-Pitch Detection Using Harmonic Envelope Estimation for Polyphonic Music Transcription
In this paper, a method for automatic transcription of music signals based on joint multiple-F0 estimation is proposed. As a time-frequency representation, the constant-Q resonator time-frequency image is employed, while a novel noise suppression technique based on pink noise assumption is applied in a preprocessing step. In the multiple-F0 estimation stage, the optimal tuning and inharmonicity parameters are computed and a salience function is proposed in order to select pitch candidates. For each pitch candidate combination, an overlapping partial treatment procedure is used, which is based on a novel spectral envelope estimation procedure for the log-frequency domain, in order to compute the harmonic envelope of candidate pitches. In order to select the optimal pitch combination for each time frame, a score function is proposed which combines spectral and temporal characteristics of the candidate pitches and also aims to suppress harmonic errors. For postprocessing, hidden Markov models (HMMs) and conditional random fields (CRFs) trained on MIDI data are employed, in order to boost transcription accuracy. The system was trained on isolated piano sounds from the MAPS database and was tested on classic and jazz recordings from the RWC database, as well as on recordings from a Disklavier piano. A comparison with several state-of-the-art systems is provided using a variety of error metrics, where encouraging results are indicated
Music Similarity Estimation
Music is a complicated form of communication, where creators and culture communicate and expose their individuality. After music digitalization took place, recommendation systems and other online services have become indispensable in the field of Music Information Retrieval (MIR). To build these systems and recommend the right choice of song to the user, classification of songs is required. In this paper, we propose an approach for finding similarity between music based on mid-level attributes like pitch, midi value corresponding to pitch, interval, contour and duration and applying text based classification techniques. Our system predicts jazz, metal and ragtime for western music. The experiment to predict the genre of music is conducted based on 450 music files and maximum accuracy achieved is 95.8% across different n-grams. We have also analyzed the Indian classical Carnatic music and are classifying them based on its raga. Our system predicts Sankarabharam, Mohanam and Sindhubhairavi ragas. The experiment to predict the raga of the song is conducted based on 95 music files and the maximum accuracy achieved is 90.3% across different n-grams. Performance evaluation is done by using the accuracy score of scikit-learn
AI and Tempo Estimation: A Review
The author's goal in this paper is to explore how artificial intelligence
(AI) has been utilised to inform our understanding of and ability to estimate
at scale a critical aspect of musical creativity - musical tempo. The central
importance of tempo to musical creativity can be seen in how it is used to
express specific emotions (Eerola and Vuoskoski 2013), suggest particular
musical styles (Li and Chan 2011), influence perception of expression (Webster
and Weir 2005) and mediate the urge to move one's body in time to the music
(Burger et al. 2014). Traditional tempo estimation methods typically detect
signal periodicities that reflect the underlying rhythmic structure of the
music, often using some form of autocorrelation of the amplitude envelope
(Lartillot and Toiviainen 2007). Recently, AI-based methods utilising
convolutional or recurrent neural networks (CNNs, RNNs) on spectral
representations of the audio signal have enjoyed significant improvements in
accuracy (Aarabi and Peeters 2022). Common AI-based techniques include those
based on probability (e.g., Bayesian approaches, hidden Markov models (HMM)),
classification and statistical learning (e.g., support vector machines (SVM)),
and artificial neural networks (ANNs) (e.g., self-organising maps (SOMs), CNNs,
RNNs, deep learning (DL)). The aim here is to provide an overview of some of
the more common AI-based tempo estimation algorithms and to shine a light on
notable benefits and potential drawbacks of each. Limitations of AI in this
field in general are also considered, as is the capacity for such methods to
account for idiosyncrasies inherent in tempo perception, i.e., how well
AI-based approaches are able to think and act like humans.Comment: 9 page
- …