9 research outputs found

    Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity

    Full text link
    Mixture-of-experts (MoE) models that employ sparse activation have demonstrated effectiveness in significantly increasing the number of parameters while maintaining low computational requirements per token. However, recent studies have established that MoE models are inherently parameter-inefficient, as the improvement in performance diminishes with an increasing number of experts. We hypothesize that this parameter inefficiency is a result of all experts having equal capacity, which may not adequately meet the varying complexity requirements of different tokens or tasks. In light of this, we propose Stratified Mixture of Experts (SMoE) models, which feature a stratified structure and can assign dynamic capacity to different tokens. We demonstrate the effectiveness of SMoE on three multilingual machine translation benchmarks, containing 4, 15, and 94 language pairs, respectively. We show that SMoE outperforms multiple state-of-the-art MoE models with the same or fewer parameters. Comment: Accepted at Findings of EMNLP 2023
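    The stratified idea lends itself to a short illustration. Below is a minimal sketch, assuming a top-1 router and experts grouped into strata of increasing feed-forward width, of how different tokens can be assigned different capacity; the names (StratifiedMoE, strata_dims, experts_per_stratum) are illustrative and not taken from the paper's implementation.

```python
# Minimal sketch (not the authors' code) of a stratified sparsely activated
# layer: experts are grouped into strata of increasing hidden size, and a
# learned top-1 router sends each token to one expert, so different tokens
# receive different capacity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class StratifiedMoE(nn.Module):
    def __init__(self, d_model=512, strata_dims=(256, 512, 1024), experts_per_stratum=2):
        super().__init__()
        # Experts in later strata have larger feed-forward capacity.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, h), nn.ReLU(), nn.Linear(h, d_model))
            for h in strata_dims for _ in range(experts_per_stratum)
        ])
        self.router = nn.Linear(d_model, len(self.experts))

    def forward(self, x):  # x: (num_tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)
        top_p, top_idx = probs.max(dim=-1)          # top-1 sparse activation
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                # Scale by the gate value so the router receives gradient.
                out[mask] = top_p[mask].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = StratifiedMoE()
    tokens = torch.randn(8, 512)
    print(layer(tokens).shape)  # torch.Size([8, 512])
```

    Routing each token to a single expert keeps per-token compute sparse, while the unequal hidden sizes are what allows capacity to adapt to token complexity, in the spirit of the abstract above.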

    Unsupervised image-to-video clothing transfer

    Get PDF
    We present a system to photo-realistically transfer the clothing of a person in a reference image onto another person in an unconstrained image or video. Our architecture is based on a GAN equipped with a physical memory that holds an initially incomplete texture map of the clothes and progressively completes it with newly inferred occluded parts. The system is trained in an unsupervised manner. The results are visually appealing and open the possibility of using the system in the future for quick virtual clothing try-on.
    Peer Reviewed
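    The progressive-completion memory can be pictured with a short sketch. The following is a minimal illustration, assuming a UV-space texture map plus a visibility mask, of writing only newly visible texels into previously occluded regions; the function and variable names are hypothetical and not from the paper's code.

```python
# Minimal sketch of a texture-map memory that is completed progressively:
# each frame contributes the texels it reveals, and only regions that were
# still unfilled (previously occluded) are written.
import numpy as np


def update_texture_memory(texture, filled, frame_texels, visible):
    """texture:      (H, W, 3) accumulated UV texture map of the garment.
    filled:       (H, W) bool mask of texels already recovered.
    frame_texels: (H, W, 3) texels warped/inferred from the current frame.
    visible:      (H, W) bool mask of texels observed in the current frame."""
    new = visible & ~filled            # texels not seen in any earlier frame
    texture[new] = frame_texels[new]   # write the newly inferred parts
    filled |= visible                  # remember what is now covered
    return texture, filled


if __name__ == "__main__":
    H = W = 64
    tex = np.zeros((H, W, 3), dtype=np.float32)
    seen = np.zeros((H, W), dtype=bool)
    for _ in range(5):                    # a few frames of a video
        frame = np.random.rand(H, W, 3).astype(np.float32)
        vis = np.random.rand(H, W) > 0.7  # ~30% of the map visible per frame
        tex, seen = update_texture_memory(tex, seen, frame, vis)
    print(f"coverage after 5 frames: {seen.mean():.0%}")
```

    In the paper's setting the per-frame texels would come from the GAN's inference of the garment, rather than random data as in this toy loop; the point of the sketch is only the write-once-into-occluded-regions memory update.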

    The Hateful Memes Challenge: Competition Report

    No full text
    Kiela D, Firooz H, Mohan A, et al. The Hateful Memes Challenge: Competition Report. In: Escalante HJ, Hofmann K, eds. Proceedings of the NeurIPS 2020 Competition and Demonstration Track. Proceedings of Machine Learning Research. Vol 133. PMLR; 2021:344-360.
    Machine learning and artificial intelligence play an ever more crucial role in mitigating important societal problems, such as the prevalence of hate speech. We describe the Hateful Memes Challenge competition, held at NeurIPS 2020, focusing on multimodal hate speech. The aim of the challenge is to facilitate further research into multimodal reasoning and understanding.