Audio Inpainting

Author: Adler A
Elad M
Emiya V
Gribonval R
Jafari MG
Plumbley MD
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2012

(c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published version: IEEE Transactions on Audio, Speech and Language Processing 20(3): 922-932, Mar 2012. DOI: 10.1109/TASL.2011.2168211

HAL-CentraleSupelec

CiteSeerX

Crossref

Portail HAL Nantes Université

INRIA a CCSD electronic archive server

HAL Descartes

Queen Mary Research Online

Surrey Research Insight

HAL: Hyper Article en Ligne

Hal-Diderot

Portail HAL UNIV-RENNES

Virtual Audio - Three-Dimensional Audio in Virtual Environments

Author: Adler Daniel
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/1996

Three-dimensional interactive audio has a variety of potential uses in human-machine interfaces. After lagging seriously behind the visual components, the importance of sound is now becoming increasingly accepted. This paper mainly discusses background and techniques to implement three-dimensional audio in computer interfaces. A case study of a system for three-dimensional audio, implemented by the author, is described in great detail. The audio system was moreover integrated with a virtual reality system, and conclusions on user tests and use of the audio system are presented along with proposals for future work at the end of the paper. The thesis begins with a definition of three-dimensional audio and a survey of the human auditory system to give the reader the needed knowledge of what three-dimensional audio is and how human auditory perception works.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

An open dataset for research on audio field recording archives: freefield1010

Author: Plumbley Mark D.
Stowell Dan
Publication date: 01/10/2013

We introduce a free and open dataset of 7690 audio clips sampled from the field-recording tag in the Freesound audio archive. The dataset is designed for use in research related to data mining in audio archives of field recordings / soundscapes. Audio is standardised, and audio and metadata are Creative Commons licensed. We describe the data preparation process, characterise the dataset descriptively, and illustrate its use through an auto-tagging experiment.

arXiv.org e-Print Archive

CiteSeerX

University of Surrey

Surrey Research Insight

Deep Learning of Human Perception in Audio Event Classification

Author: Beuret Samuel
Oyama Keizo
Yu Yi
Zeng Donghuo
Publication date: 03/09/2018

In this paper, we introduce our recent studies on human perception in audio event classification by different deep learning models. In particular, the pre-trained model VGGish is used as a feature extractor to process audio data, and DenseNet is trained on, and used as a feature extractor for, our electroencephalography (EEG) data. The correlation between audio stimuli and EEG is learned in a shared space. In the experiments, we record brain activities (EEG signals) of several subjects while they are listening to music events of 8 audio categories selected from Google AudioSet, using a 16-channel EEG headset with active electrodes. Our experimental results demonstrate that i) audio event classification can be improved by exploiting the power of human perception, and ii) the correlation between audio stimuli and EEG can be learned to complement audio event understanding.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

General audio tagging with ensembling convolutional neural network and statistical features

Author: Ding Bo
Kong Qiuqiang
Mi Haibo
Wang Dezhi
Wang Huaimin
Xu Kele
Zhu Boqing
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 30/10/2018

Audio tagging aims to infer descriptive labels from audio clips. Audio tagging is challenging due to the limited size of data and noisy labels. In this paper, we describe our solution for the DCASE 2018 Task 2 general audio tagging challenge. The contributions of our solution include: (1) We investigated a variety of convolutional neural network architectures to solve the audio tagging task. (2) Statistical features are applied to capture statistical patterns of audio features to improve the classification performance. (3) Ensemble learning is applied to combine the outputs from the deep classifiers to utilize complementary information. (4) A sample re-weighting strategy is employed for ensemble training to address the noisy label problem. Our system achieves a mean average precision (mAP@3) of 0.958, outperforming the baseline system of 0.704. Our system ranked 1st and 4th out of 558 submissions on the public and private leaderboards of the DCASE 2018 Task 2 challenge. Our code is available at https://github.com/Cocoxili/DCASE2018Task2/. Comment: Submitted to ICASS
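
For readers unfamiliar with the mAP@3 figure quoted in the abstract above, the following is a minimal sketch of how such a score is commonly computed. It assumes each clip has exactly one ground-truth label and that a clip scores 1/rank when that label appears among the top three ranked predictions, and 0 otherwise. The function name `map_at_3` and the toy labels are hypothetical and are not taken from the authors' repository.

```python
from typing import Sequence


def map_at_3(true_labels: Sequence[str],
             ranked_predictions: Sequence[Sequence[str]]) -> float:
    """Mean average precision at 3, assuming one true label per clip."""
    if len(true_labels) != len(ranked_predictions):
        raise ValueError("inputs must have the same length")
    if not true_labels:
        raise ValueError("at least one clip is required")

    total = 0.0
    for truth, preds in zip(true_labels, ranked_predictions):
        # Only the three highest-ranked predictions count.
        for rank, label in enumerate(preds[:3], start=1):
            if label == truth:
                total += 1.0 / rank  # 1, 1/2 or 1/3 depending on rank
                break
    return total / len(true_labels)


# Toy example with made-up labels: hits at rank 1 and rank 2, then a miss.
score = map_at_3(
    ["bark", "flute", "cough"],
    [["bark", "meow", "cello"],
     ["oboe", "flute", "cello"],
     ["laugh", "bark", "meow"]],
)
```

Under this definition a system that always ranks the correct label first scores 1.0, so the reported 0.958 would indicate that the correct label is usually ranked first.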

arXiv.org e-Print Archive

Crossref