
    Overview of the ImageCLEFmed 2019 concept detection task

    This paper describes the ImageCLEF 2019 Concept Detection Task. This is the 3rd edition of the medical caption task, which was first proposed in ImageCLEF 2017. Concept detection from medical images remains a challenging task. In 2019, the format changed to a single subtask, which is part of the medical tasks alongside the tuberculosis and visual question answering tasks. To reduce noisy labels and limit variety, the data set focuses solely on radiology images rather than biomedical figures, extracted from the biomedical open access literature (PubMed Central). The development data consists of 56,629 training and 14,157 validation images, with corresponding Unified Medical Language System (UMLS®) concepts extracted from the image captions. Participation in 2019 is higher, in both the number of participating teams and the number of submitted runs. The teams used several approaches, mostly deep learning techniques. The most frequently used were long short-term memory (LSTM) recurrent neural networks (RNN), adversarial auto-encoders, convolutional neural network (CNN) image encoders, and transfer learning-based multi-label classification models. Evaluation uses F1-scores computed per image and averaged across all 10,000 test images.
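The evaluation described in the abstract (an F1-score per image, averaged over the test images) can be illustrated with a minimal Python sketch. This is not the official ImageCLEF evaluation script. The function names `image_f1` and `mean_f1`, the dictionary-based input format, the placeholder concept identifiers, and the handling of empty concept sets are all assumptions made for illustration.

```python
def image_f1(predicted, gold):
    """F1 between the predicted and gold concept sets of a single image."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        # Assumption: two empty sets count as a perfect match.
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions, references):
    """Average the per-image F1 over every image in `references`.

    Both arguments map an image id to an iterable of concept ids.
    Images missing from `predictions` are scored as an empty prediction.
    """
    if not references:
        raise ValueError("references must contain at least one image")
    scores = [
        image_f1(predictions.get(image_id, ()), gold)
        for image_id, gold in references.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical image ids and concept ids, for illustration only.
references = {"img1": {"C1", "C2"}, "img2": {"C3"}}
predictions = {"img1": {"C2", "C4"}, "img2": {"C3"}}
print(mean_f1(predictions, references))
```

Averaging per image, rather than pooling all concepts first, gives each test image equal weight regardless of how many concepts it carries.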

    Deep Learning for Audio Signal Processing

    Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e. audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.
    Comment: 15 pages, 2 pdf figure

    Latent Multi-task Architecture Learning

    Multi-task learning (MTL) allows deep neural networks to learn from related tasks by sharing parameters with other networks. In practice, however, MTL involves searching an enormous space of possible parameter sharing architectures to find (a) the layers or subspaces that benefit from sharing, (b) the appropriate amount of sharing, and (c) the appropriate relative weights of the different task losses. Recent work has addressed each of the above problems in isolation. In this work we present an approach that learns a latent multi-task architecture that jointly addresses (a)--(c). We present experiments on synthetic data and data from OntoNotes 5.0, including four different tasks and seven different domains. Our extension consistently outperforms previous approaches to learning latent architectures for multi-task problems and achieves up to 15% average error reductions over common approaches to MTL.
    Comment: To appear in Proceedings of AAAI 201