Optimizing Transformer for Low-Resource Neural Machine Translation
Language pairs with limited amounts of parallel data, also known as
low-resource languages, remain a challenge for neural machine translation.
While the Transformer model has achieved significant improvements for many
language pairs and has become the de facto mainstream architecture, its
capability under low-resource conditions has not been fully investigated yet.
Our experiments on different subsets of the IWSLT14 training data show that the
effectiveness of the Transformer under low-resource conditions is highly
dependent on the hyper-parameter settings. Using a Transformer optimized for
low-resource conditions improves translation quality by up to 7.3 BLEU points
compared to the default Transformer settings.
Comment: To be published in COLING 2020
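As a rough illustration of what such hyper-parameter tuning can look like, the sketch below builds a smaller, more strongly regularized Transformer next to the usual base configuration, reflecting the common intuition that scarce parallel data favors fewer parameters and more dropout. It assumes PyTorch, and the low-resource values shown are hypothetical placeholders, not the optimized settings reported in the paper.

```python
import torch.nn as nn

# Two illustrative hyper-parameter sets. DEFAULT_BASE mirrors the standard
# Transformer-base configuration; LOW_RESOURCE is a hypothetical smaller,
# more regularized variant, NOT the paper's reported optimized settings.
DEFAULT_BASE = dict(d_model=512, nhead=8, num_layers=6,
                    dim_feedforward=2048, dropout=0.1)
LOW_RESOURCE = dict(d_model=256, nhead=4, num_layers=5,
                    dim_feedforward=1024, dropout=0.3)  # hypothetical values

def build_transformer(cfg):
    """Build a vanilla encoder-decoder Transformer from a config dict."""
    return nn.Transformer(
        d_model=cfg["d_model"],
        nhead=cfg["nhead"],
        num_encoder_layers=cfg["num_layers"],
        num_decoder_layers=cfg["num_layers"],
        dim_feedforward=cfg["dim_feedforward"],
        dropout=cfg["dropout"],
    )

model = build_transformer(LOW_RESOURCE)
```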
A Composability-Based Transformer Pruning Framework
This thesis addresses the crucial problem of deploying large Transformer models on resource-constrained edge devices. Because training is slow, pruning and fine-tuning a Transformer separately for each of many candidate pruning configurations is tedious and time-consuming. To remedy this, the thesis proposes a novel composability-based Transformer pruning framework that significantly reduces the time required to fine-tune pruned models across configurations while maintaining model performance. Unlike traditional approaches, it exploits the composability between Transformer pruning configurations, which creates opportunities for computational reuse: work done for blocks shared by several configurations can be reused rather than repeated. The framework leverages this reuse, employs techniques similar to knowledge distillation, and automates the pruning and fine-tuning process. It demonstrably shortens the time needed to fine-tune a model for a given pruning configuration, making it a practical tool for real-world deployment on edge devices. The outcome is a method that offers a fresh perspective on Transformer model compression and a reference for future studies on pruning and fine-tuning of Transformer networks.
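As a minimal sketch of the computational-reuse idea, the toy example below caches each pruned (layer, sparsity) block so that configurations sharing per-layer choices reuse work instead of repeating it. It assumes PyTorch's magnitude pruning; the toy linear "blocks", the cache keys, and the omitted per-block fine-tuning step are illustrative simplifications, not the thesis's actual framework.

```python
import copy
from typing import Dict, Tuple

import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy stand-ins for Transformer blocks; real blocks would be full layers.
base_layers = [nn.Linear(64, 64) for _ in range(4)]
_cache: Dict[Tuple[int, float], nn.Module] = {}

def pruned_block(layer_idx: int, sparsity: float) -> nn.Module:
    """Prepare a pruned block at most once per (layer, sparsity) pair."""
    key = (layer_idx, sparsity)
    if key not in _cache:
        block = copy.deepcopy(base_layers[layer_idx])
        prune.l1_unstructured(block, name="weight", amount=sparsity)
        # In the real framework, the block would also be fine-tuned here
        # (e.g., via distillation-like training) before being cached.
        _cache[key] = block
    return _cache[key]

def build_model(config: Dict[int, float]) -> nn.Sequential:
    """config maps layer index -> target sparsity; cached blocks are reused."""
    return nn.Sequential(*(pruned_block(i, s) for i, s in sorted(config.items())))

m1 = build_model({0: 0.3, 1: 0.3, 2: 0.5, 3: 0.5})
m2 = build_model({0: 0.3, 1: 0.5, 2: 0.5, 3: 0.5})  # reuses 3 cached blocks
```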
Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention
We present Perceiver-VL, a vision-and-language framework that efficiently
handles high-dimensional multimodal inputs such as long videos and text.
Powered by the iterative latent cross-attention of Perceiver, our framework
scales with linear complexity, in contrast to the quadratic complexity of
self-attention used in many state-of-the-art transformer-based models. To
further improve efficiency, we also study applying LayerDrop to
cross-attention layers and introduce a mixed-stream architecture for
cross-modal retrieval. We evaluate Perceiver-VL on diverse video-text and
image-text benchmarks, where Perceiver-VL achieves the lowest GFLOPs and
latency while maintaining competitive performance. In addition, we provide
comprehensive analyses of various aspects of our framework, including
pretraining data, scalability of latent size and input size, dropping
cross-attention layers at inference to reduce latency, modality aggregation
strategy, positional encoding, and weight initialization strategy. Our code and
checkpoints are available at: https://github.com/zinengtang/Perceiver_VL
Comment: WACV 2023 (first two authors contributed equally)
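The sketch below illustrates the iterative latent cross-attention pattern the abstract describes, assuming PyTorch: a small set of learned latents repeatedly attends to a long input sequence, so per-layer cost grows linearly with input length rather than quadratically as in full self-attention. The dimensions, depth, and residual-update details are arbitrary choices for the sketch, not the released Perceiver-VL implementation.

```python
import torch
import torch.nn as nn

class LatentCrossAttention(nn.Module):
    """Fixed-size learned latents iteratively cross-attend to a long input."""

    def __init__(self, dim=256, num_latents=64, num_heads=4, depth=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, dim) * 0.02)
        self.cross = nn.ModuleList(
            nn.MultiheadAttention(dim, num_heads, batch_first=True)
            for _ in range(depth)
        )

    def forward(self, inputs):  # inputs: (B, N, dim), N can be very large
        B = inputs.size(0)
        z = self.latents.unsqueeze(0).expand(B, -1, -1)  # (B, L, dim)
        for attn in self.cross:
            # Latents query the inputs: each layer costs O(N * L), not O(N^2).
            out, _ = attn(query=z, key=inputs, value=inputs)
            z = z + out  # residual update of the latent array
        return z  # compact (B, L, dim) summary of the input

x = torch.randn(2, 4096, 256)        # e.g., a long video+text token stream
summary = LatentCrossAttention()(x)  # -> shape (2, 64, 256)
```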