
    The influence of sampling frequency on tone recognition of musical instruments

    The sampling frequency used for musical instrument tone recognition generally follows the Shannon sampling theorem. This paper explores the influence of sampling frequencies that violate the Shannon sampling theorem in a tone recognition system that uses segment averaging for feature extraction and template matching for classification. The instruments used were the bellyra, flute, and pianica, representing instruments with one, a few, and many significant local peaks in the Discrete Fourier Transform (DFT) domain, respectively. In our experiments, lowering the sampling frequency to as little as 312 Hz reduced the recognition rates for bellyra and flute tones only slightly, by around 5%, while the recognition rate for pianica tones was unaffected. Therefore, if such a small reduction in recognition rate is acceptable, a sampling frequency as low as 312 Hz can be used for musical instrument tone recognition.
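
    The pipeline described above is straightforward to sketch. Below is a minimal, illustrative Python version of segment averaging over the DFT magnitude spectrum followed by nearest-template classification; the segment count, distance metric, and template format are assumptions, not details taken from the paper.

    ```python
    import numpy as np

    def segment_average_features(signal, n_segments=8):
        # Segment averaging: split the DFT magnitude spectrum into equal
        # segments and use each segment's mean as one feature coefficient.
        spectrum = np.abs(np.fft.rfft(signal))
        return np.array([seg.mean() for seg in np.array_split(spectrum, n_segments)])

    def template_match(features, templates):
        # Classify as the tone whose stored template (a feature vector)
        # lies closest in Euclidean distance (assumed metric).
        return min(templates, key=lambda tone: np.linalg.norm(features - templates[tone]))
    ```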

    Multipitch tracking in music signals using Echo State Networks

    Currently, convolutional neural networks (CNNs) define the state of the art for multipitch tracking in music signals. Echo State Networks (ESNs), a recently introduced recurrent neural network architecture, have achieved results similar to CNNs on various tasks, such as phoneme or digit recognition, but they have not yet received much attention in the Music Information Retrieval community. The core of an ESN is a group of unordered, randomly connected neurons, the reservoir, by which the low-dimensional input space is non-linearly transformed into a high-dimensional feature space. Because only the weights of the connections between the reservoir and the output are trained, using linear regression, ESNs are easier to train than deep neural networks. This paper presents a first exploration of ESNs for the challenging task of multipitch tracking in music signals. The best results presented in this paper were achieved with a bidirectional two-layer ESN with 20,000 neurons in each layer. Although the final F-score of 0.7198 still falls below the state of the art (0.7370), the proposed ESN-based approach serves as a baseline for further investigations of ESNs in audio signal processing.
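
    A minimal single-layer ESN, sketched below in Python, illustrates the two properties the abstract highlights: the reservoir is random and fixed, and only the linear readout is trained (here via ridge regression). The reservoir size, spectral radius, and regularization strength are illustrative choices; the paper's best model is a bidirectional two-layer ESN with 20,000 neurons per layer.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    class ESN:
        # Minimal single-layer echo state network (sketch).
        def __init__(self, n_in, n_res=500, spectral_radius=0.9):
            self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
            W = rng.uniform(-0.5, 0.5, (n_res, n_res))
            # Rescale so the echo state property (fading memory) holds.
            self.W = W * spectral_radius / max(abs(np.linalg.eigvals(W)))

        def states(self, X):
            # X: (time, n_in) -> reservoir states (time, n_res).
            x, out = np.zeros(self.W.shape[0]), []
            for u in X:
                x = np.tanh(self.W_in @ u + self.W @ x)
                out.append(x.copy())
            return np.array(out)

        def fit(self, X, Y, ridge=1e-6):
            # Only the readout weights are trained, by ridge regression.
            S = self.states(X)
            self.W_out = np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]), S.T @ Y).T

        def predict(self, X):
            return self.states(X) @ self.W_out.T
    ```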

    Dct based feature extraction and support vector machine classification for musical instruments tone recognition

    This research proposes a feature extraction and classification combination for use in a musical instrument tone recognition system. By implementing this combination, the system is expected to require fewer feature extraction coefficients than previously investigated approaches. The proposed combination consists of feature extraction using the discrete cosine transform (DCT) and classification using a support vector machine (SVM). Bellyra, clarinet, and pianica tones were used in the experiments, each representing a tone with one, several, or many major local peaks in the transform domain. The test results indicate that the proposed combination is efficient enough for use in a musical instrument tone recognition system: it needs as few as eight feature extraction coefficients to recognize a tone.
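
    As a sketch of the proposed combination, the Python snippet below keeps the first eight DCT coefficients as features and feeds them to an SVM. The synthetic sine-wave "tones", the pitches, and the kernel choice are hypothetical stand-ins for the paper's actual recordings and configuration.

    ```python
    import numpy as np
    from scipy.fft import dct
    from sklearn.svm import SVC

    def dct_features(signal, n_coeffs=8):
        # Keep only the first few DCT coefficients; the paper reports
        # that as few as eight suffice for recognition.
        return dct(signal, norm="ortho")[:n_coeffs]

    # Hypothetical training tones: pure sines standing in for recordings.
    t = np.arange(0, 0.5, 1 / 8000)
    tones = {f0: np.sin(2 * np.pi * f0 * t) for f0 in (262, 294, 330)}
    X = np.stack([dct_features(s) for s in tones.values()])
    y = list(tones.keys())

    clf = SVC(kernel="rbf").fit(X, y)  # kernel choice is an assumption
    print(clf.predict([dct_features(np.sin(2 * np.pi * 294 * t))]))
    ```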

    Real-time detection of overlapping sound events with non-negative matrix factorization

    In this paper, we investigate the problem of real-time detection of overlapping sound events by employing non-negative matrix factorization techniques. We consider a setup where audio streams arrive at the system in real time and are decomposed onto a dictionary of event templates learned off-line, prior to the decomposition. An important drawback of existing approaches in this context is the lack of control over the decomposition. We propose and compare two provably convergent algorithms that address this issue by controlling, respectively, the sparsity of the decomposition and the decomposition's trade-off between the different frequency components. Sparsity regularization is handled in the framework of convex quadratic programming, while the frequency compromise is introduced by employing the beta-divergence as a cost function. The two algorithms are evaluated on the multi-source detection tasks of polyphonic music transcription, drum transcription, and environmental sound recognition. The results show how the proposed approaches can improve detection in such applications while keeping computational costs low enough for real-time use.
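
    For intuition, here is a per-frame activation solver against a fixed, pre-learned dictionary, using standard multiplicative updates for the beta-divergence with an added L1 sparsity penalty. This is a generic sketch only: the paper handles sparsity via convex quadratic programming, and its two provably convergent algorithms are not reproduced here.

    ```python
    import numpy as np

    def nmf_activations(v, W, beta=1.0, lam=0.1, n_iter=50):
        # v: magnitude-spectrum frame (n_bins,); W: dictionary of event
        # templates (n_bins, n_events) learned off-line. Multiplicative
        # updates for the beta-divergence, with an L1 penalty `lam`
        # encouraging sparse activations.
        h = np.full(W.shape[1], 1e-2)
        for _ in range(n_iter):
            wh = W @ h + 1e-12  # avoid division by zero
            h *= (W.T @ (wh ** (beta - 2) * v)) / (W.T @ wh ** (beta - 1) + lam)
        return h  # event i is reported active when h[i] exceeds a threshold
    ```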

    Multiple Fundamental Frequency Estimation and Polyphony Inference of Polyphonic Music Signals

    This article presents a frame-based system for estimating the multiple fundamental frequencies (F0s) of polyphonic music signals based on the short-time Fourier transform (STFT) representation. To estimate the number of sources along with their F0s, it is proposed to estimate the noise level beforehand and then jointly evaluate all possible combinations among pre-selected F0 candidates. Given a set of F0 hypotheses, their hypothetical partial sequences are derived, taking into account where partials may overlap. A score function selects the plausible sets of F0 hypotheses. To infer the best combination, hypothetical sources are progressively combined and iteratively verified. A hypothetical source is considered valid if it either explains more energy than the noise or significantly improves the envelope smoothness once the overlapping partials are treated. The proposed system was submitted to the MIREX (Music Information Retrieval Evaluation eXchange) 2007 and 2008 contests, where its accuracy was evaluated with respect to the number of sources inferred and the precision of the estimated F0s. The encouraging results demonstrate its competitive performance among state-of-the-art methods.
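
    A heavily simplified Python sketch of the candidate-scoring idea: each hypothetical F0 is scored by summing spectral magnitude at its predicted partial positions. The paper's actual system jointly evaluates combinations of candidates, models overlapping partials, and checks envelope smoothness, none of which is reproduced here.

    ```python
    import numpy as np

    def score_f0_candidates(mag, freqs, candidates, n_partials=10):
        # mag, freqs: magnitude spectrum of one STFT frame and its bin
        # frequencies; candidates: pre-selected F0 hypotheses in Hz.
        scores = {}
        for f0 in candidates:
            partials = f0 * np.arange(1, n_partials + 1)
            bins = np.searchsorted(freqs, partials)  # nearest bins (approx.)
            scores[f0] = mag[bins[bins < len(mag)]].sum()
        return scores  # higher scores mark more plausible F0 hypotheses
    ```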

    Automatic transcription of music using deep learning techniques

    Music transcription is the problem of detecting the notes being played in a musical piece. This is a difficult task that only trained people are capable of performing, and its difficulty has generated considerable interest in automating it. However, automatic music transcription encompasses several fields of research, such as digital signal processing, machine learning, music theory and cognition, pitch perception, and psychoacoustics, all of which makes it a hard problem to solve. In this work we present a novel approach to automatically transcribing piano pieces using deep learning techniques. We take advantage of deep learning to build several classifiers, each responsible for detecting a single musical note; in theory, this division of work enhances the ability of each classifier to transcribe. In addition, we apply two further stages, pre-processing and post-processing, to improve the efficiency of our system. The pre-processing stage improves the quality of the input data before the classification/transcription stage, while the post-processing stage fixes errors originating during classification. In the initial steps, preliminary experiments were performed to fine-tune our model in all three stages: pre-processing, classification, and post-processing. The experimental setup using those optimized techniques and parameters is presented, along with a comparison against two other state-of-the-art works that use the same dataset and the same deep learning technique but a different approach, namely a single neural network that detects all musical notes rather than one network per note. Our approach surpassed these works on frame-based metrics while reaching close results on onset-based metrics, demonstrating its feasibility.
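
    The one-classifier-per-note idea can be sketched as follows; the layer sizes, input dimensionality, and MIDI range are illustrative assumptions, not the paper's actual configuration.

    ```python
    from tensorflow import keras

    def make_note_detector(n_features=252):
        # One small binary classifier deciding on/off for a single note
        # in each input frame (layer sizes are illustrative).
        model = keras.Sequential([
            keras.layers.Input(shape=(n_features,)),
            keras.layers.Dense(128, activation="relu"),
            keras.layers.Dense(1, activation="sigmoid"),
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy")
        return model

    # 88 independent detectors, one per piano key (MIDI notes 21-108);
    # each would be trained only on its own note's per-frame labels.
    detectors = {midi: make_note_detector() for midi in range(21, 109)}
    ```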