    Estimating Classification Accuracy for Unlabeled Datasets Based on Block Scaling

    This paper proposes an approach called block scaling quality (BSQ) for estimating the prediction accuracy of a deep network model. The basic operation perturbs the input spectrogram by multiplying all values within a block by a constant scaling factor, which is set to 0 in the experiments. The ratio of perturbed spectrograms whose prediction labels differ from that of the original spectrogram to the total number of perturbed spectrograms indicates how much of the spectrogram is crucial for the prediction; this ratio is therefore inversely correlated with the accuracy on the dataset. In experiments, the BSQ approach achieves satisfactory estimation accuracy compared with various other approaches. Using only the Jamendo and FMA datasets, the estimated accuracy has an average error of 4.9% and 1.8%, respectively. Moreover, the BSQ approach holds advantages over some of its comparison counterparts. Overall, it is a promising approach for estimating the accuracy of a deep network model.
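    The abstract does not include an implementation, but the perturbation it describes can be sketched directly. The following is a minimal, hypothetical illustration assuming a 2-D NumPy spectrogram and a generic `predict` callable; the block shape is an assumption, not the paper's configuration.

```python
import numpy as np

def bsq_ratio(spectrogram, predict, block_shape=(8, 8), scale=0.0):
    """Sketch of the block-scaling perturbation described in the abstract.

    For each block tiling the spectrogram, multiply all values inside it
    by `scale` (0 in the paper's experiments) and check whether the
    predicted label changes. Returns the fraction of perturbed
    spectrograms whose label differs from the original; per the paper,
    this ratio is inversely correlated with accuracy on the dataset.
    `predict` is a hypothetical function mapping a spectrogram to a label.
    """
    original_label = predict(spectrogram)
    n_rows, n_cols = spectrogram.shape
    bh, bw = block_shape
    changed, total = 0, 0
    for r in range(0, n_rows, bh):
        for c in range(0, n_cols, bw):
            perturbed = spectrogram.copy()
            perturbed[r:r + bh, c:c + bw] *= scale  # zero out one block
            total += 1
            if predict(perturbed) != original_label:
                changed += 1
    return changed / total
```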

    16th Sound and Music Computing Conference SMC 2019 (28–31 May 2019, Malaga, Spain)

    The 16th Sound and Music Computing Conference (SMC 2019) took place in Malaga, Spain, 28–31 May 2019, and was organized by the Application of Information and Communication Technologies (ATIC) research group of the University of Malaga (UMA). The associated SMC 2019 Summer School took place 25–28 May 2019, and the First International Day of Women in Inclusive Engineering, Sound and Music Computing Research (WiSMC 2019) took place on 28 May 2019. The topics of interest of SMC 2019 covered a wide selection of subjects related to acoustics, psychoacoustics, music, technology for music, audio analysis, musicology, sonification, music games, machine learning, serious games, immersive audio, sound synthesis, etc.

    Classification of Speaking and Singing Voices Using Bioimpedance Measurements and Deep Learning

    The acts of speaking and singing are distinct phenomena displaying different characteristics. Classifying and distinguishing these voice acts has mostly been approached using voice audio recordings and microphones. The use of audio recordings, however, can become challenging and computationally expensive due to the complexity of the voice signal. The research presented in this paper addresses this issue by implementing a deep learning classifier of speaking and singing voices based on bioimpedance measurements instead of audio recordings. In addition, it aims to develop real-time voice act classification for integration with voice-to-MIDI conversion. For these purposes, a system was designed, implemented, and tested using electroglottographic signals, Mel-frequency cepstral coefficients (MFCCs), and a deep neural network. The lack of datasets for training the model was tackled by creating a dedicated dataset of 7,200 bioimpedance measurements of both singing and speaking. Bioimpedance measurements deliver high classification accuracy while keeping computational needs low for both preprocessing and classification, which in turn allows fast deployment of the system for near-real-time applications. After training, the system was broadly tested, achieving a testing accuracy of 92% to 94%.
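    As a rough illustration of the pipeline described (MFCC features extracted from the electroglottographic signal, fed to a deep neural network for binary classification), here is a minimal sketch using librosa and Keras. The sampling rate, number of coefficients, and layer sizes are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np
import librosa
import tensorflow as tf

def egg_to_mfcc(signal, sr=16000, n_mfcc=13):
    # Treat the electroglottographic (bioimpedance) signal as a 1-D
    # waveform, extract MFCCs, and average over time to obtain a
    # fixed-size feature vector.
    mfcc = librosa.feature.mfcc(y=signal.astype(np.float32), sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

def build_classifier(n_features=13):
    # Small dense network for binary speaking-vs-singing classification.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Hypothetical usage: X is an (n_samples, 13) array of MFCC vectors,
# y holds 0/1 labels for speaking vs. singing.
# model = build_classifier()
# model.fit(X, y, epochs=20, validation_split=0.2)
```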