Acoustic Scene Classification
This work was supported by the Centre for Digital Music Platform (grant EP/K009559/1) and a Leadership Fellowship
(EP/G007144/1), both from the United Kingdom Engineering and Physical Sciences Research Council.
auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks
auDeep is a Python toolkit for deep unsupervised representation learning from
acoustic data. It is based on a recurrent sequence to sequence autoencoder
approach which can learn representations of time series data by taking into
account their temporal dynamics. We provide an extensive command line interface
in addition to a Python API for users and developers, both of which are
comprehensively documented and publicly available at
https://github.com/auDeep/auDeep. Experimental results indicate that auDeep
features are competitive with state-of-the-art audio classification
A database and challenge for acoustic scene classification and event detection
Brain-mediated Transfer Learning of Convolutional Neural Networks
The human brain can effectively learn a new task from a small number of
samples, which indicates that the brain can transfer its prior knowledge to
solve tasks in different domains. This function is analogous to transfer
learning (TL) in the field of machine learning. TL uses a well-trained feature
space in a specific task domain to improve performance in new tasks with
insufficient training data. TL with rich feature representations, such as
features of convolutional neural networks (CNNs), shows high generalization
ability across different task domains. However, such TL is still insufficient
in making machine learning attain generalization ability comparable to that of
the human brain. To examine if the internal representation of the brain could
be used to achieve more efficient TL, we introduce a method for TL mediated by
human brains. Our method transforms CNN feature representations of audiovisual
inputs into activation patterns of individual brains via a mapping learned in
advance from measured brain responses. Then, to estimate
labels reflecting human cognition and behavior induced by the audiovisual
inputs, the transformed representations are used for TL. We demonstrate that
our brain-mediated TL (BTL) shows higher performance in the label estimation
than the standard TL. In addition, we illustrate that the estimations mediated
by different brains vary from brain to brain, and the variability reflects the
individual variability in perception. Thus, our BTL provides a framework to
improve the generalization ability of machine-learning feature representations
and enable machine learning to estimate human-like cognition and behavior,
including individual variability.
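
The two-stage idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which regression model is used, so closed-form ridge regression is assumed here, and all data, array sizes, and names (`ridge_fit`, `cnn_brain`, `brain_resp`, `cnn_task`, `labels`) are hypothetical synthetic stand-ins.

```python
import numpy as np


def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression: W minimizing ||XW - Y||^2 + alpha * ||W||^2."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


rng = np.random.default_rng(0)
n_brain, n_task, n_feat, n_vox = 200, 100, 32, 50  # hypothetical sizes

# Synthetic stand-ins for CNN features and measured brain responses
# recorded for the same stimuli (assumption: a noisy linear relationship).
cnn_brain = rng.standard_normal((n_brain, n_feat))
true_map = rng.standard_normal((n_feat, n_vox))
brain_resp = cnn_brain @ true_map + 0.1 * rng.standard_normal((n_brain, n_vox))

# Stage 1: learn, ahead of time, a mapping from CNN features to brain responses.
W_brain = ridge_fit(cnn_brain, brain_resp, alpha=1.0)

# Stage 2: transform CNN features of new, labeled stimuli into predicted
# brain activation patterns, then fit the label estimator on those patterns.
cnn_task = rng.standard_normal((n_task, n_feat))
labels = rng.standard_normal((n_task, 1))  # hypothetical behavioral ratings
brain_task = cnn_task @ W_brain
W_label = ridge_fit(brain_task, labels, alpha=1.0)
pred = brain_task @ W_label
```

Fitting `W_brain` separately for each participant would give one predicted brain space per individual, which is where the brain-to-brain variability mentioned in the abstract would enter in such a setup.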