47 research outputs found

    Selective Knowledge Distillation for Non-Autoregressive Neural Machine Translation

    Benefiting from sequence-level knowledge distillation, the Non-Autoregressive Transformer (NAT) achieves great success in neural machine translation tasks. However, existing knowledge distillation has side effects, such as propagating errors from the teacher to NAT students, which may limit further improvements of NAT models and are rarely discussed in existing research. In this paper, we introduce selective knowledge distillation, which uses an NAT evaluator to select NAT-friendly targets that are of high quality and easy to learn. In addition, we introduce a simple yet effective progressive distillation method to boost NAT performance. Experimental results on multiple WMT language directions and several representative NAT models show that our approach can realize a flexible trade-off between the quality and complexity of training data for NAT models, achieving strong performance. Further analysis shows that distilling only 5% of the raw translations can help an NAT outperform its counterpart trained on raw data by about 2.4 BLEU.

    Language Model Weight Adaptation Based on Cross-entropy for Statistical Machine Translation


    Latent Opinions Transfer Network for Target-Oriented Opinion Words Extraction

    Target-oriented opinion words extraction (TOWE) is a new subtask of ABSA, which aims to extract the corresponding opinion words for a given opinion target in a sentence. Recently, neural network methods have been applied to this task and have achieved promising results. However, the difficulty of annotation leaves TOWE datasets insufficient, which heavily limits the performance of neural models. By contrast, abundant review sentiment classification data are easily available at online review sites. These reviews contain substantial latent opinion information and semantic patterns. In this paper, we propose a novel model to transfer this opinion knowledge from resource-rich review sentiment classification datasets to the low-resource TOWE task. To address the challenges in the transfer process, we design an effective transformation method to obtain latent opinions, then integrate them into TOWE. Extensive experimental results show that our model achieves better performance compared to other state-of-the-art methods and significantly outperforms the base model without transferred opinion knowledge. Further analysis validates the effectiveness of our model. Comment: Accepted by the 34th AAAI Conference on Artificial Intelligence (AAAI 2020).

    BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training

    Automatic metrics play a crucial role in machine translation. Despite the widespread use of n-gram-based metrics, there has been a recent surge in the development of pre-trained model-based metrics that focus on measuring sentence semantics. However, these neural metrics, while achieving higher correlations with human evaluations, are often considered to be black boxes with potential biases that are difficult to detect. In this study, we systematically analyze and compare various mainstream and cutting-edge automatic metrics from the perspective of their guidance for training machine translation systems. Through Minimum Risk Training (MRT), we find that certain metrics exhibit robustness defects, such as the presence of universal adversarial translations in BLEURT and BARTScore. In-depth analysis suggests two main causes of these robustness deficits: distribution biases in the training datasets, and the tendency of the metric paradigm. By incorporating token-level constraints, we enhance the robustness of evaluation metrics, which in turn leads to an improvement in the performance of machine translation systems. Code is available at https://github.com/powerpuffpomelo/fairseq_mrt. Comment: Accepted to ACL 2023 main conference.

    Research on the mechanism of neutral-point voltage fluctuation and capacitor voltage balancing control strategy of three-phase three-level T-type inverter

    To solve the neutral-point voltage fluctuation problem of three-phase three-level T-type inverters (TPTLTIs), the unbalance characteristics of capacitor voltages under different switching states and the mechanism of neutral-point voltage fluctuation are revealed. Based on the mathematical model of a TPTLTI, a feed-forward voltage balancing control strategy based on the DC-link capacitor voltage error is proposed. The strategy generates a DC bias voltage using a capacitor voltage loop with a proportional-integral (PI) controller. The proposed strategy can suppress the neutral-point voltage fluctuation effectively and improve the quality of output currents. The correctness of the theoretical analysis is verified through simulations. An experimental prototype of a TPTLTI based on a Digital Signal Processor (DSP) is built. The feasibility and effectiveness of the proposed strategy are verified through experiment. The results from simulations and experiment match very well.

    Study on dynamic strength and liquefaction mechanism of silt soil in Castor earthquake prone areas under different consolidation ratios

    Under the Castor earthquake, there is a risk of liquefaction instability of saturated tailings, and the evolution of dynamic pore pressure can indirectly reflect its instability process. Before dynamic loads are applied, the static stress state of soil is one of the main factors affecting the development of soil dynamic strength and dynamic pore pressure, and there are significant differences in soil dynamic strength under different consolidation ratios. This paper conducted dynamic triaxial tests on saturated tailings silt with different consolidation ratios, and analyzed the dynamic strength variation and liquefaction mechanism of the samples using the discrete element method (PFC3D). The results showed that 1) as Kc gradually increased, there was a critical consolidation ratio Kc′ during the development of the dynamic strength of the sample. The specific value of Kc′ was related to the properties and stress state of saturated sand. The Kc′ in this research was about 1.9. When Kc < 1.9, dynamic strength increased as Kc increased; when Kc > 1.9, dynamic strength decreased as Kc increased. 2) Under cyclic loading, when samples were normally consolidated (Kc = 1), the pore water pressure tended to equal the confining pressure, causing soil liquefaction. In the case of eccentric consolidation (Kc > 1), the pore water pressure was less than the confining pressure, so soil liquefaction was not induced, and the pore pressure value decreased as the consolidation ratio increased. This paper provides engineering guidance for the study of dynamic strength and liquefaction mechanism of tailings sand and silt in Castor earthquake prone areas under different consolidation ratios.

    Aldehyde Dehydrogenase-2 Attenuates Myocardial Remodeling and Contractile Dysfunction Induced by a High-Fat Diet

    Background/Aims: Consumption of a high-fat (HF) diet exacerbates metabolic cardiomyopathy through lipotoxic mechanisms. In this study, we explored the role of aldehyde dehydrogenase-2 (ALDH2) in myocardial damage induced by a HF diet. Methods: Wild-type C57BL/6J mice were fed a HF diet or control diet for 16 weeks. ALDH2 overexpression was achieved by injecting a lentiviral ALDH2 expression vector into the left ventricle. Results: Consumption of a HF diet induced metabolic syndrome and myocardial remodeling, and these deleterious effects were attenuated by ALDH2 overexpression. In addition, ALDH2 overexpression attenuated the cellular apoptosis and insulin resistance associated with a HF diet. Mechanistically, ALDH2 overexpression inhibited the expression of c-Jun N-terminal kinase (JNK)-1, activated protein 1 (AP-1), insulin receptor substrate 1 (IRS-1), 4-hydroxynonenal, caspase 3, transforming growth factor β1, and collagen I and III, and enhanced Akt phosphorylation. Conclusion: ALDH2 may effectively attenuate myocardial remodeling and contractile defects induced by a HF diet through the regulation of the JNK/AP-1 and IRS-1/Akt signaling pathways. Our study demonstrates that ALDH2 plays an essential role in protecting cardiac function from lipotoxic cardiomyopathy.

    Local Memory Search Bat Algorithm for Grey Economic Dynamic System

    A control system is a pattern for describing microeconomic performance, so analyzing and solving the model of an economic control system can provide a theoretical basis for policy-making that keeps economic performance good and sustained. After analyzing the characteristics of the Bat Algorithm (BA), a method to adjust each step of BA is proposed. In the method, each bat uses the optimal location it has found to guide the direction of search. The result of the case study showed that the proposed algorithm was efficient. The proposed algorithm was then used to solve the grey economic dynamic system, and the results further showed that the method was valid for solving economic control problems. DOI: http://dx.doi.org/10.11591/telkomnika.v11i9.314
