
    Acoustic Scene Classification

    This work was supported by the Centre for Digital Music Platform (grant EP/K009559/1) and a Leadership Fellowship (EP/G007144/1), both from the United Kingdom Engineering and Physical Sciences Research Council.

    auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks

    auDeep is a Python toolkit for deep unsupervised representation learning from acoustic data. It is based on a recurrent sequence-to-sequence autoencoder approach, which learns representations of time-series data by taking their temporal dynamics into account. We provide an extensive command-line interface in addition to a Python API for users and developers, both comprehensively documented and publicly available at https://github.com/auDeep/auDeep. Experimental results indicate that auDeep features are competitive with the state of the art in audio classification.
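    The core idea is a recurrent sequence-to-sequence autoencoder: an encoder RNN reads a spectrogram frame by frame, and its final hidden state, which must carry enough information for a decoder RNN to reconstruct the input, becomes the learned representation. Below is a minimal sketch of that idea in PyTorch; this is not auDeep's actual API, and the layer sizes, shapes, and names are illustrative assumptions.

```python
# Minimal sketch of a recurrent sequence-to-sequence autoencoder for audio,
# in the spirit of auDeep (illustrative only; not auDeep's API).
import torch
import torch.nn as nn

class Seq2SeqAutoencoder(nn.Module):
    def __init__(self, n_mels=128, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(n_mels, hidden, batch_first=True)
        self.decoder = nn.GRU(n_mels, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_mels)

    def forward(self, x):
        # x: (batch, time, n_mels) mel-spectrogram frames.
        _, h = self.encoder(x)                  # h: (1, batch, hidden)
        # Decoder sees the input shifted by one frame, conditioned on h.
        shifted = torch.cat([torch.zeros_like(x[:, :1]), x[:, :-1]], dim=1)
        y, _ = self.decoder(shifted, h)
        return self.out(y), h.squeeze(0)        # reconstruction, representation

model = Seq2SeqAutoencoder()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randn(8, 100, 128)                # stand-in for real spectrograms
optim.zero_grad()
recon, feats = model(batch)
loss = nn.functional.mse_loss(recon, batch)     # unsupervised: reconstruct input
loss.backward()
optim.step()
# After training on unlabelled audio, `feats` (the encoder's final hidden
# state) serves as the feature vector handed to a downstream classifier.
```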

    Brain-mediated Transfer Learning of Convolutional Neural Networks

    The human brain can effectively learn a new task from a small number of samples, which indicates that the brain can transfer its prior knowledge to solve tasks in different domains. This function is analogous to transfer learning (TL) in the field of machine learning. TL uses a feature space trained well on a specific task domain to improve performance on new tasks with insufficient training data. TL with rich feature representations, such as those of convolutional neural networks (CNNs), shows high generalization ability across task domains. However, such TL is still insufficient to give machine learning a generalization ability comparable to that of the human brain. To examine whether the internal representation of the brain could be used to achieve more efficient TL, we introduce a method for TL mediated by human brains. Our method transforms feature representations of audiovisual inputs in CNNs into activation patterns of individual brains, via an association learned in advance from measured brain responses. The transformed representations are then used for TL to estimate labels reflecting the human cognition and behavior induced by the audiovisual inputs. We demonstrate that our brain-mediated TL (BTL) shows higher performance in label estimation than standard TL. In addition, we show that estimations mediated by different brains vary from brain to brain, and that this variability reflects individual variability in perception. Thus, our BTL provides a framework to improve the generalization ability of machine-learning feature representations and to enable machine learning to estimate human-like cognition and behavior, including individual variability.
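    As a rough illustration of the pipeline described above, the sketch below learns the CNN-feature-to-brain-response association with ridge regression on paired stimulus data, transforms the target task's CNN features into the predicted brain-activation space of one individual, and fits a simple readout on the transformed features. The estimator, array shapes, and variable names are assumptions made for illustration, not the paper's exact implementation.

```python
# Hedged sketch of brain-mediated transfer learning (BTL); shapes, names,
# and the ridge/logistic estimators are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)

# Paired data for learning the association: CNN features of audiovisual
# stimuli and the brain responses measured for the same stimuli.
cnn_feats_train = rng.normal(size=(200, 512))    # stimuli x CNN features
brain_resp_train = rng.normal(size=(200, 1000))  # stimuli x voxels

# Step 1: learn the CNN-feature -> brain-response association.
assoc = Ridge(alpha=1.0).fit(cnn_feats_train, brain_resp_train)

# Step 2: transform the target task's CNN features into the predicted
# brain-activation space of this individual.
cnn_feats_task = rng.normal(size=(100, 512))
labels_task = rng.integers(0, 2, size=100)       # labels reflecting behavior
brain_like = assoc.predict(cnn_feats_task)

# Step 3: ordinary TL on the transformed representations: fit a simple
# readout that estimates the cognition/behavior labels.
readout = LogisticRegression(max_iter=1000).fit(brain_like, labels_task)
print(readout.score(brain_like, labels_task))    # training accuracy only
```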