234 research outputs found
APPLICATION OF ARTIFICIAL NEURAL NETWORKS WITH RADIAL BASIS FUNCTIONS FOR MUSIC GENRE RECOGNITION
Artificial intelligence can be applied
in many areas of life. One way to implement
it is through the artificial neural network (ANN)
approach. One well-known example of an
ANN method is the radial basis function (RBF)
method. The radial basis function artificial neural
network (RBF ANN) is known as a three-layer
feedforward network that can solve
classification or pattern recognition problems.
In this study, the RBF ANN is used to classify
music into genres based on its closeness
to the target. For this purpose, the genres
used in this study are campursari,
keroncong, pop, and rock, with three clip durations
for each piece of music: 2 seconds, 5 seconds,
and 10 seconds. The hidden layer contains
56 neurons.
The input to the RBF ANN consists of *.mp3
files downloaded from the internet, which are
then converted to *.wav format and processed
with mel-frequency cepstrum coefficients (MFCC)
extraction. This technique extracts the audio
features contained in the music data.
Seven coefficients are used for each piece
of music data.
The program simulation results show that the
RBF ANN can classify music, with the highest
accuracy, 75%, obtained on test data with a
duration of 10 seconds.
Keywords: Genre, artificial neural network,
artificial intelligence, mel-frequency cepstrum
coefficients, music, radial basis function
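The three-layer feedforward structure described in the abstract above (input layer, RBF hidden layer, output layer) can be sketched roughly as follows. Only the sizes come from the abstract: 7 MFCC inputs, 56 hidden neurons, and four genres. The Gaussian basis function, the shared `width`, the randomly generated `centers` and `weights`, and all function names are illustrative assumptions, not the authors' implementation; no training step is shown.

```python
import math
import random

GENRES = ["campursari", "keroncong", "pop", "rock"]
N_INPUTS = 7    # MFCC coefficients per clip (from the abstract)
N_HIDDEN = 56   # hidden-layer neurons (from the abstract)


def gaussian_rbf(x, center, width):
    """Gaussian basis (an assumed choice): exp(-||x - c||^2 / (2 * width^2))."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def rbf_forward(x, centers, width, weights):
    """Return one score per genre: a linear combination of hidden activations."""
    hidden = [gaussian_rbf(x, c, width) for c in centers]
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]


def classify(x, centers, width, weights):
    """Pick the genre whose output score is largest."""
    scores = rbf_forward(x, centers, width, weights)
    return GENRES[scores.index(max(scores))]


# Placeholder parameters: random values, NOT trained on any music data.
rng = random.Random(0)
centers = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
weights = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(len(GENRES))]
width = 1.0

example_features = [0.1] * N_INPUTS  # stand-in for 7 MFCC values of one clip
predicted = classify(example_features, centers, width, weights)
```

In a real system the centers and output weights would be fitted to labelled MFCC vectors; the random values here only demonstrate the data flow through the three layers.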
Masked Conditional Neural Networks for sound classification
The remarkable success of deep convolutional neural networks in image-related applications has led to their adoption for sound processing as well. Typically the input is a time–frequency representation such as a spectrogram, and in some cases this is treated as a two-dimensional image. However, the properties of spectrograms are very different from those of natural images. Instead of an object occupying a contiguous region, as in a natural image, the frequencies of a sound are scattered along the frequency axis of a spectrogram in a pattern unique to that particular sound. Applying conventional convolutional neural networks has therefore required extensive hand-tuning and has created the need for an architecture better suited to the time–frequency properties of audio. We introduce the ConditionaL Neural Network (CLNN) and its extension, the Masked ConditionaL Neural Network (MCLNN), designed to exploit the nature of sound in a time–frequency representation. The CLNN is, broadly speaking, linear across frequencies but non-linear across time: it conditions its inference at a particular time on preceding and succeeding time slices. The MCLNN uses a controlled, systematic sparseness that embeds a filterbank-like behavior within the network. Additionally, the MCLNN automates the concurrent exploration of several feature combinations, analogous to hand-crafting the optimum combination of features for a recognition task. We have applied the MCLNN to the problems of music genre classification and environmental sound recognition on several music (Ballroom, GTZAN, ISMIR2004, and Homburg) and environmental sound (Urbansound8K, ESC-10, and ESC-50) datasets. The classification accuracy of the MCLNN surpasses that of neural-network-based architectures, including state-of-the-art Convolutional Neural Networks, and several hand-crafted attempts.
Sparse machine learning methods with applications in multivariate signal processing
This thesis details theoretical and empirical work that draws from two main subject areas: Machine
Learning (ML) and Digital Signal Processing (DSP). A unified general framework is given for the application
of sparse machine learning methods to multivariate signal processing. In particular, methods that
enforce sparsity will be employed for reasons of computational efficiency, regularisation, and compressibility.
The methods presented can be seen as modular building blocks that can be applied to a variety
of applications. Application-specific prior knowledge can be used in various ways, resulting in a flexible
and powerful set of tools. The motivation for the methods is to be able to learn and generalise from a set
of multivariate signals.
In addition to testing on benchmark datasets, a series of empirical evaluations on real-world
datasets were carried out. These included: the classification of musical genre from polyphonic audio
files; a study of how the sampling rate in a digital radar can be reduced through the use of Compressed
Sensing (CS); analysis of human perception of different modulations of musical key from
Electroencephalography (EEG) recordings; classification of genre of musical pieces to which a listener
is attending from Magnetoencephalography (MEG) brain recordings. These applications demonstrate
the efficacy of the framework and highlight interesting directions of future research.
A Model for Predicting Music Popularity on Streaming Platforms
The global music market moves billions of dollars every year, most of which comes from streaming platforms. In this paper, we present a model for predicting whether or not a song will appear in Spotify’s Top 50, a ranking of the 50 most popular songs on Spotify, which is one of today’s biggest streaming services. To make this prediction, we trained different classifiers on audio features of songs that appeared in this ranking between November 2018 and January 2019. When tested with data from June and July 2019, an SVM classifier with an RBF kernel obtained accuracy, precision, and AUC above 80%.
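As a rough illustration of two ingredients named in the abstract above, the sketch below computes an RBF kernel value between two feature vectors and the accuracy and precision of a set of binary predictions. The `gamma` value, the function names, and the label lists are assumptions made up for this example; they are not the paper's data, features, or results, and no SVM training is shown.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2); gamma=0.5 is an assumed value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0
    return sum(1 for t in predicted_pos if t == positive) / len(predicted_pos)


# Made-up labels for illustration only: 1 = "in Top 50", 0 = "not in Top 50".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
```

The kernel equals 1 for identical vectors and decays toward 0 as they move apart, which is what lets an SVM draw non-linear decision boundaries in the original feature space.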
- …