125 research outputs found

    Selective Knowledge Distillation for Non-Autoregressive Neural Machine Translation

    Benefiting from sequence-level knowledge distillation, the Non-Autoregressive Transformer (NAT) achieves great success in neural machine translation tasks. However, existing knowledge distillation has side effects, such as propagating errors from the teacher to NAT students, which may limit further improvements of NAT models and are rarely discussed in existing research. In this paper, we introduce selective knowledge distillation, which uses an NAT evaluator to select NAT-friendly targets that are of high quality and easy to learn. In addition, we introduce a simple yet effective progressive distillation method to boost NAT performance. Experimental results on multiple WMT language directions and several representative NAT models show that our approach can realize a flexible trade-off between the quality and complexity of training data for NAT models, achieving strong performance. Further analysis shows that distilling only 5% of the raw translations can help an NAT outperform its counterpart trained on raw data by about 2.4 BLEU.

    Does Continual Learning Equally Forget All Parameters?

    Distribution shift (e.g., task or domain shift) in continual learning (CL) usually results in catastrophic forgetting of neural networks. Although it can be alleviated by repeatedly replaying buffered data, the every-step replay is time-consuming. In this paper, we study which modules in neural networks are more prone to forgetting by investigating their training dynamics during CL. Our proposed metrics show that only a few modules are more task-specific and sensitively alter between tasks, while others can be shared across tasks as common knowledge. Hence, we attribute forgetting mainly to the former and find that finetuning them only on a small buffer at the end of any CL method can bring non-trivial improvement. Due to the small number of finetuned parameters, such "Forgetting Prioritized Finetuning (FPF)" is efficient in computation. We further propose a more efficient and simpler method that entirely removes the every-step replay and replaces it with only k rounds of FPF periodically triggered during CL. Surprisingly, this "k-FPF" performs comparably to FPF and outperforms the SOTA CL methods while significantly reducing their computational overhead and cost. In experiments on several benchmarks of class- and domain-incremental CL, FPF consistently improves existing CL methods by a large margin, and k-FPF further excels in efficiency without degrading the accuracy. We also empirically studied the impact of buffer size, epochs per task, and finetuning modules on the cost and accuracy of our methods.

    Finding Sparse Structures for Domain Specific Neural Machine Translation

    Neural machine translation often adopts the fine-tuning approach to adapt to specific domains. However, unrestricted fine-tuning can easily degrade on the general domain and over-fit to the target domain. To mitigate the issue, we propose Prune-Tune, a novel domain adaptation method via gradual pruning. It learns tiny domain-specific sub-networks during fine-tuning on new domains. Prune-Tune alleviates the over-fitting and the degradation problem without model modification. Furthermore, Prune-Tune is able to sequentially learn a single network with multiple disjoint domain-specific sub-networks for multiple domains. Empirical results show that Prune-Tune outperforms several strong competitors in the target domain test set without sacrificing the quality on the general domain in both single and multi-domain settings. The source code and data are available at https://github.com/ohlionel/Prune-Tune. Comment: Accepted to AAAI 202

    BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training

    Automatic metrics play a crucial role in machine translation. Despite the widespread use of n-gram-based metrics, there has been a recent surge in the development of pre-trained model-based metrics that focus on measuring sentence semantics. However, these neural metrics, while achieving higher correlations with human evaluations, are often considered to be black boxes with potential biases that are difficult to detect. In this study, we systematically analyze and compare various mainstream and cutting-edge automatic metrics from the perspective of their guidance for training machine translation systems. Through Minimum Risk Training (MRT), we find that certain metrics exhibit robustness defects, such as the presence of universal adversarial translations in BLEURT and BARTScore. In-depth analysis suggests two main causes of these robustness deficits: distribution biases in the training datasets, and the tendency of the metric paradigm. By incorporating token-level constraints, we enhance the robustness of evaluation metrics, which in turn leads to an improvement in the performance of machine translation systems. Code is available at https://github.com/powerpuffpomelo/fairseq_mrt. Comment: Accepted to ACL 2023 main conference

    GigaST: A 10,000-hour Pseudo Speech Translation Corpus

    This paper introduces GigaST, a large-scale pseudo speech translation (ST) corpus. We create the corpus by translating the text in GigaSpeech, an English ASR corpus, into German and Chinese. The training set is translated by a strong machine translation system and the test set is translated by humans. ST models trained with the addition of our corpus obtain new state-of-the-art results on the MuST-C English-German benchmark test set. We provide a detailed description of the translation process and verify its quality. We make the translated text data public and hope to facilitate research in speech translation. We also release the training scripts on NeurST to make it easy to replicate our systems. The GigaST dataset is available at https://st-benchmark.github.io/resources/GigaST. Comment: Submitted to Interspeech 2022.

    Comparative Analyses of H3K4 and H3K27 Trimethylations Between the Mouse Cerebrum and Testis

    The global features of H3K4 and H3K27 trimethylations (H3K4me3 and H3K27me3) have been well studied in recent years, but most of these studies were performed in mammalian cell lines. In this work, we generated genome-wide maps of H3K4me3 and H3K27me3 in mouse cerebrum and testis using ChIP-seq, and their high-coverage transcriptomes using ribominus RNA-seq with SOLiD technology. We examined the global patterns of H3K4me3 and H3K27me3 in both tissues and found that the modifications are closely associated with tissue-specific expression, function and development. Moreover, we revealed that H3K4me3 and H3K27me3 rarely occur in silent genes, which contradicts the findings of previous studies. Finally, we observed that bivalent domains, with both H3K4me3 and H3K27me3, existed ubiquitously in both tissues and demonstrated an invariable preference for the regulation of developmentally related genes. However, the bivalent domains tend towards a “winner-takes-all” approach to regulate the expression of associated genes. We also verified the above results in mouse ES cells. As expected, the results in ES cells are consistent with those in cerebrum and testis. In conclusion, we present two very important findings. One is that H3K4me3 and H3K27me3 rarely occur in silent genes. The other is that bivalent domains may adopt a “winner-takes-all” principle to regulate gene expression.

    Off-line evaluation of indoor positioning systems in different scenarios: the experiences from IPIN 2020 competition

    Every year, for ten years now, the IPIN competition has aimed at evaluating real-world indoor localisation systems by testing them in a realistic environment, with realistic movement, using the EvAAL framework. The competition provided a unique overview of the state of the art of systems, technologies, and methods for indoor positioning and navigation purposes. Through fair comparison of the performance achieved by each system, the competition was able to identify the most promising approaches and to pinpoint the most critical working conditions. In 2020, the competition included 5 diverse off-site Tracks, each resembling real use cases and challenges for indoor positioning. The results in terms of participation and accuracy of the proposed systems have been encouraging. The best performing competitors obtained a third quartile of error of 1 m for the Smartphone Track and 0.5 m for the Foot-mounted IMU Track. While not running on physical systems, but only as algorithms, these results represent impressive achievements.
    Track 3 organizers were supported by the European Union’s Horizon 2020 Research and Innovation programme under the Marie Skłodowska Curie Grant 813278 (A-WEAR: A network for dynamic WEarable Applications with pRivacy constraints), MICROCEBUS (MICINN, ref. RTI2018-095168-B-C55, MCIU/AEI/FEDER UE), INSIGNIA (MICINN ref. PTQ2018-009981), and REPNIN+ (MICINN, ref. TEC2017-90808-REDT). We would like to thank the UJI’s Library managers and employees for their support while collecting the required datasets for Track 3. Track 5 organizers were supported by JST-OPERA Program, Japan, under Grant JPMJOP1612. Track 7 organizers were supported by the Bavarian Ministry for Economic Affairs, Infrastructure, Transport and Technology through the Center for Analytics-Data-Applications (ADA-Center) within the framework of “BAYERN DIGITAL II.” Team UMinho (Track 3) was supported by FCT—Fundação para a Ciência e Tecnologia within the R&D Units Project Scope under Grant UIDB/00319/2020, and the Ph.D. Fellowship under Grant PD/BD/137401/2018. Team YAI (Track 3) was supported by the Ministry of Science and Technology (MOST) of Taiwan under Grant MOST 109-2221-E-197-026. Team Indora (Track 3) was supported in part by the Slovak Grant Agency, Ministry of Education and Academy of Science, Slovakia, under Grant 1/0177/21, and in part by the Slovak Research and Development Agency under Contract APVV-15-0091. Team TJU (Track 3) was supported in part by the National Natural Science Foundation of China under Grant 61771338 and in part by the Tianjin Research Funding under Grant 18ZXRHSY00190. Team Next-Newbie Reckoners (Track 3) was supported by the Singapore Government through the Industry Alignment Fund—Industry Collaboration Projects Grant. This research was conducted at Singtel Cognitive and Artificial Intelligence Lab for Enterprises (SCALE@NTU), which is a collaboration between Singapore Telecommunications Limited (Singtel) and Nanyang Technological University (NTU). Team KawaguchiLab (Track 5) was supported by JSPS KAKENHI under Grant JP17H01762. Team WHU&AutoNavi (Track 6) was supported by the National Key Research and Development Program of China under Grant 2016YFB0502202. Team YAI (Tracks 6 and 7) was supported by the Ministry of Science and Technology (MOST) of Taiwan under Grant MOST 110-2634-F-155-001. Peer reviewed.
