
    Audio Inpainting

    (c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published version: IEEE Transactions on Audio, Speech, and Language Processing 20(3): 922-932, Mar 2012. DOI: 10.1109/TASL.2011.2168211

    Virtual Audio - Three-Dimensional Audio in Virtual Environments

    Three-dimensional interactive audio has a variety of potential uses in human-machine interfaces. After lagging seriously behind the visual components, the importance of sound is now becoming increasingly accepted. This paper mainly discusses the background and techniques needed to implement three-dimensional audio in computer interfaces. A case study of a system for three-dimensional audio, implemented by the author, is described in detail. The audio system was also integrated with a virtual reality system, and conclusions from user tests and from use of the audio system are presented, along with proposals for future work, at the end of the paper. The thesis begins with a definition of three-dimensional audio and a survey of the human auditory system, giving the reader the knowledge needed to understand what three-dimensional audio is and how human auditory perception works.
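    The abstract above does not spell out any single technique, so the following Python sketch is purely illustrative and not taken from the thesis: it pans a mono signal to stereo using interaural time and level differences (ITD/ILD), one elementary building block of three-dimensional audio. The head radius, speed of sound, and the simple level-difference rule are assumed values.

    import numpy as np

    def spatialise(mono, azimuth_deg, fs=44100, head_radius=0.0875, c=343.0):
        """Pan a mono signal to stereo using simple ITD/ILD cues (illustrative only)."""
        az = abs(np.deg2rad(azimuth_deg))
        itd = head_radius / c * (az + np.sin(az))        # Woodworth ITD model, seconds
        delay = int(round(itd * fs))                     # far-ear delay in samples
        ild_gain = 10 ** (-6.0 * np.sin(az) / 20.0)      # crude far-ear attenuation (up to 6 dB)
        far = ild_gain * np.concatenate([np.zeros(delay), mono])[:len(mono)]
        near = mono
        if azimuth_deg >= 0:     # source on the right: right ear is the near ear
            left, right = far, near
        else:                    # source on the left
            left, right = near, far
        return np.stack([left, right], axis=1)

    fs = 44100
    tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s, 440 Hz test tone
    stereo = spatialise(tone, azimuth_deg=45, fs=fs)            # shape (fs, 2)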

    Deep Learning of Human Perception in Audio Event Classification

    In this paper, we introduce our recent studies on human perception in audio event classification using different deep learning models. In particular, the pre-trained VGGish model is used as a feature extractor to process audio data, and a DenseNet is trained on and used as a feature extractor for our electroencephalography (EEG) data. The correlation between audio stimuli and EEG is learned in a shared space. In the experiments, we record the brain activities (EEG signals) of several subjects while they listen to music events from 8 audio categories selected from Google AudioSet, using a 16-channel EEG headset with active electrodes. Our experimental results demonstrate that i) audio event classification can be improved by exploiting the power of human perception, and ii) the correlation between audio stimuli and EEG can be learned to complement audio event understanding.
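    As a rough illustration of the shared-space idea described above, the sketch below projects pre-extracted VGGish audio embeddings (128-dimensional) and DenseNet EEG features (dimension assumed here to be 1024) into a common space and pulls matched pairs together with a cosine-similarity loss. The projection size and the loss are illustrative assumptions, not the authors' exact training setup.

    import torch
    import torch.nn as nn

    class SharedSpace(nn.Module):
        """Project audio and EEG features into one shared embedding space."""
        def __init__(self, audio_dim=128, eeg_dim=1024, shared_dim=64):
            super().__init__()
            self.audio_proj = nn.Linear(audio_dim, shared_dim)
            self.eeg_proj = nn.Linear(eeg_dim, shared_dim)

        def forward(self, audio_feat, eeg_feat):
            a = nn.functional.normalize(self.audio_proj(audio_feat), dim=-1)
            e = nn.functional.normalize(self.eeg_proj(eeg_feat), dim=-1)
            return a, e

    model = SharedSpace()
    audio_feat = torch.randn(16, 128)    # stand-in for VGGish embeddings
    eeg_feat = torch.randn(16, 1024)     # stand-in for DenseNet EEG features
    a, e = model(audio_feat, eeg_feat)
    loss = 1.0 - nn.functional.cosine_similarity(a, e).mean()   # pull matched pairs together
    loss.backward()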

    Weakly Labelled AudioSet Tagging with Attention Neural Networks

    Audio tagging is the task of predicting the presence or absence of sound classes within an audio clip. Previous work on audio tagging focused on relatively small datasets limited to recognising a small number of sound classes. We investigate audio tagging on AudioSet, a dataset consisting of over 2 million audio clips and 527 classes. AudioSet is weakly labelled, in that only the presence or absence of sound classes is known for each clip, while the onset and offset times are unknown. To address the weakly labelled audio tagging problem, we propose attention neural networks as a way to attend to the most salient parts of an audio clip. We bridge the connection between attention neural networks and multiple instance learning (MIL) methods, and propose decision-level and feature-level attention neural networks for audio tagging. We investigate attention neural networks modelled by different functions, depths and widths. Experiments on AudioSet show that the feature-level attention neural network achieves a state-of-the-art mean average precision (mAP) of 0.369, outperforming the best multiple instance learning (MIL) method at 0.317 and Google's deep neural network baseline at 0.314. In addition, we find that audio tagging performance on AudioSet embedding features has only a weak correlation with the number of training samples and the quality of the labels of each sound class.
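    The sketch below illustrates the decision-level attention idea in simplified form: each time segment of a clip receives per-class probabilities and per-class attention weights, and the clip-level prediction is their attention-weighted average. The embedding dimension, single-layer heads, and softmax normalisation are assumptions for illustration, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class DecisionLevelAttention(nn.Module):
        """Attention pooling over segment-level predictions for weakly labelled tagging."""
        def __init__(self, feat_dim=128, n_classes=527):
            super().__init__()
            self.cla = nn.Linear(feat_dim, n_classes)   # segment-wise classifier head
            self.att = nn.Linear(feat_dim, n_classes)   # segment-wise attention head

        def forward(self, x):                           # x: (batch, segments, feat_dim)
            prob = torch.sigmoid(self.cla(x))           # per-segment class probabilities
            att = torch.softmax(self.att(x), dim=1)     # attention normalised over segments
            return (att * prob).sum(dim=1)              # clip-level probabilities

    model = DecisionLevelAttention()
    embeddings = torch.randn(4, 10, 128)   # e.g. ten 128-d segment embeddings per clip
    clip_probs = model(embeddings)         # shape (4, 527), one score per sound class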

    AUDIO SIGNAL FILTER LEARNING MEDIA FOR THE AUDIO ENGINEERING SUBJECT

    This study aims to determine the design, performance, and feasibility of the Audio Signal Filter Learning Media as instructional media for the audio engineering subject in the Audio Video Engineering department of SMK Negeri 3 Yogyakarta. The study follows a Research and Development approach. The research object is the Audio Signal Filter Learning Media together with its accompanying learning module. Product development comprised 1) analysis, 2) design, 3) implementation, 4) testing, 5) validation, and 6) a field trial. Data were collected through 1) performance testing and observation and 2) research questionnaires. Validation of the learning media involved two subject-matter experts and two learning-media experts, and the field trial was carried out by 33 students. The results show that the performance of the Audio Signal Filter Learning Media fulfils its purpose as instructional media for audio filters. In testing, the audio function generator (AFG) circuit produced output signals with three waveforms (sine, sawtooth, and square) at frequencies between 10 Hz and 30 kHz. The frequency counter circuit measured frequencies between 10 Hz and 25 kHz and read amplitudes in the range 0.3 Vp-p to 10 Vp-p. Each filter circuit board worked properly over the 20 Hz to 20 kHz frequency range. Content validation by the subject-matter experts gave a validity score of 81.77% (highly feasible), construct validation by the learning-media experts gave 87.5% (highly feasible), and the field trial with students at SMK Negeri 3 Yogyakarta gave 78.5% (highly feasible). Keywords: media, learning, filter, audio signal
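    As a software analogue of what the hardware media demonstrates (an illustration, not the device itself), the sketch below generates the three AFG test waveforms and passes them through a 20 Hz to 20 kHz Butterworth band-pass filter standing in for one of the filter boards; the sample rate, filter order, and test frequency are assumed values.

    import numpy as np
    from scipy import signal

    fs = 96000                               # sample rate high enough for 30 kHz content
    t = np.arange(fs) / fs                   # 1 second of samples
    f0 = 1000                                # 1 kHz test frequency
    waves = {
        "sine": np.sin(2 * np.pi * f0 * t),
        "sawtooth": signal.sawtooth(2 * np.pi * f0 * t),
        "square": signal.square(2 * np.pi * f0 * t),
    }
    # 4th-order Butterworth band-pass over the audio band (20 Hz - 20 kHz)
    sos = signal.butter(4, [20, 20000], btype="bandpass", fs=fs, output="sos")
    filtered = {name: signal.sosfilt(sos, w) for name, w in waves.items()}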