59 research outputs found

    Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World

    Scene Graph Generation (SGG) aims to extract relationships in images for vision understanding. Although recent works have made steady progress on SGG, they still suffer from long-tail distribution issues: tail predicates are more costly to train and harder to distinguish than frequent predicates because they have far less annotated data. Existing re-balancing strategies try to handle this via prior rules but are still confined to pre-defined conditions, which do not scale across models and datasets. In this paper, we propose a Cross-modal prediCate boosting (CaCao) framework, where a visually-prompted language model is learned to generate diverse fine-grained predicates in a low-resource way. The proposed CaCao can be applied in a plug-and-play fashion and automatically strengthens existing SGG models to tackle the long-tailed problem. Based on that, we further introduce a novel Entangled cross-modal prompt approach for open-world predicate scene graph generation (Epic), where models can generalize to unseen predicates in a zero-shot manner. Comprehensive experiments on three benchmark datasets show that CaCao consistently boosts the performance of multiple scene graph generation models in a model-agnostic way. Moreover, our Epic achieves competitive performance on open-world predicate prediction. The data and code for this paper are publicly available. Comment: Accepted by ICCV 202

    MiniDisc: Minimal Distillation Schedule for Language Model Compression

    Recent studies have uncovered that language model distillation is less effective when facing a large capacity gap between the teacher and the student, and introduced teacher assistant-based distillation to bridge the gap. As a connection, the scale and the performance of the teacher assistant are of vital importance in bringing the knowledge from the teacher to the student. However, existing teacher assistant-based methods require maximally many trials before scheduling an optimal teacher assistant. To this end, we propose a minimal distillation schedule (MiniDisc) for scheduling the optimal teacher assistant in minimally one trial. In particular, motivated by the finding that the performance of the student is positively correlated with the scale-performance tradeoff of the teacher assistant, MiniDisc is designed with a λ-tradeoff to measure the optimality of the teacher assistant without trial distillation to the student. MiniDisc can then schedule the optimal teacher assistant with the best λ-tradeoff in a sandwich framework. MiniDisc is evaluated with an extensive set of experiments on GLUE. Experimental results demonstrate the improved efficiency of MiniDisc compared to several state-of-the-art baselines. We further apply MiniDisc to a language model with billions of parameters and show its scalability. Comment: Accepted to EACL 2024. Code is available at https://github.com/GeneZC/MiniDis

    XPrompt: Exploring the Extreme of Prompt Tuning

    Prompt tuning learns soft prompts to condition frozen Pre-trained Language Models (PLMs) for performing downstream tasks in a parameter-efficient manner. While prompt tuning has gradually reached the performance level of fine-tuning as the model scale increases, there is still a large performance gap between prompt tuning and fine-tuning for models of moderate and small scales (typically less than 11B parameters). In this paper, we empirically show that the trained prompt tokens can have a negative impact on a downstream task and thus degrade its performance. To bridge the gap, we propose a novel Prompt tuning model with an eXtremely small scale (XPrompt) under the regime of the lottery ticket hypothesis. Specifically, XPrompt eliminates the negative prompt tokens at different granularity levels through hierarchical structured pruning, yielding a more parameter-efficient prompt with competitive performance. Comprehensive experiments are carried out on SuperGLUE tasks, and the extensive results indicate that XPrompt is able to close the performance gap at smaller model scales. Comment: 15 pages, accepted to EMNLP 2022 main conference

    Augmented Multistep Finite-Control-Set Model Predictive Control for Induction Motor-Drive System

    This article develops an observer-augmented multistep model predictive control strategy with finite-control-set principle to improve the robustness of the control loop against disturbances, including external disturbances, parameter mismatches, and model uncertainties. The influence of the parameter mismatches on the multistep finite-control-set model predictive control is first discussed via simulations and quantified by analyzing the probability of suboptimality. Furthermore, in order to compensate for these effects, the disturbances are included in the system model of the control problem as an extended state and estimated with a disturbance observer. The estimated disturbances as well as the system states are then delivered to the optimization problem of the current control and incorporated for the computation of the solution. The proposed method is then implemented on a dSPACE system and tested under several scenarios. The effectiveness of the proposal is validated with experimental results. Peer reviewed

    PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models

    While transformer-based pre-trained language models (PLMs) have dominated a number of NLP applications, these models are heavy to deploy and expensive to use. Therefore, effectively compressing large-scale PLMs becomes an increasingly important problem. Quantization, which represents high-precision tensors in a low-bit fixed-point format, is a viable solution. However, most existing quantization methods are task-specific, requiring customized training and quantization with a large number of trainable parameters on each individual task. Inspired by the observation that the over-parameterized nature of PLMs makes it possible to freeze most of the parameters during the fine-tuning stage, in this work, we propose a novel "quantize before fine-tuning" framework, PreQuant, that differs from both quantization-aware training and post-training quantization. PreQuant is compatible with various quantization strategies, with outlier-aware parameter-efficient fine-tuning incorporated to correct the induced quantization error. We demonstrate the effectiveness of PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5. We also provide an empirical investigation into the workflow of PreQuant, which sheds light on its efficacy. Comment: Findings of ACL202

    A Fixed Switching Frequency Direct Model Predictive Control for Neutral-Point-Clamped Three-Level Inverters with Induction Machines

    This article presents a direct model predictive control (MPC) scheme for drive systems consisting of a three-phase three-level neutral-point-clamped (3L-NPC) inverter and an induction machine (IM). Even though the discussed MPC algorithm is a direct control strategy, it operates the inverter at a fixed switching frequency, while the output harmonic spectrum of the stator current is discrete, with harmonics at non-triplen, odd integer multiples of the fundamental frequency. As a result, the proposed method achieves steady-state behavior similar or superior to that of modulator-based control schemes. Moreover, thanks to its direct control nature, it exhibits the fast transient responses that characterize direct controllers due to the absence of an explicit modulator. Furthermore, the multiple control objectives of the system, i.e., stator current control and neutral point (NP) potential balancing, are addressed in one computational stage, thus avoiding any additional control loops in a cascaded or parallel structure. This favorable control structure is facilitated by the adopted modeling approach, according to which the system behavior is described by the gradient of the system output. In doing so, not only is a simple, versatile system model derived, but the direct MPC can also be formulated as a constrained quadratic program (QP), which can be easily solved in real time with an in-house solver. The effectiveness of the proposed control scheme is experimentally verified on a 4-kW drive system. Peer reviewed

    A Direct Model Predictive Control Strategy with an Implicit Modulator for Six-Phase PMSMs

    This paper proposes a direct model predictive control (MPC) scheme for asymmetric six-phase permanent magnet synchronous machines (PMSMs), which combines control and modulation in one computation stage. By emulating the switching pattern of space vector modulation (SVM), the MPC problem is formulated as a four-dimensional current control problem where the switching sequences and instants are computed and directly applied to the inverters. This implicit modulation addresses the issue of a variable switching frequency and spread harmonic spectra of conventional direct MPC methods. Moreover, the effect of the modulation constraints and controller bandwidth on the system performance is investigated as well. To verify the effectiveness of the proposed control strategy, experiments are carried out with an asymmetric six-phase PMSM driven by two three-phase two-level inverters. Accepted version. Peer reviewed

    CARD11 regulates the thymic Treg development in an NF-κB-independent manner

    Introduction: CARD11 is a lymphoid lineage-specific scaffold protein regulating the NF-κB activation downstream of the antigen receptor signal pathway. Defective CARD11 function results in abnormal development and differentiation of lymphocytes, especially thymic regulatory T cells (Treg). Method: In this study, we used patients' samples together with transgenic mouse models carrying pathogenic CARD11 mutations from patients to explore their effects on Treg development. Immunoblotting and a GFP receptor assay were used to evaluate the activation effect of CARD11 mutants on NF-κB signaling. Then the suppressive function of Tregs carrying distinct CARD11 mutations was measured by in vitro suppression assay. Finally, we applied the retroviral transduced bone marrow chimeras to rescue the Treg development in an NF-κB-independent manner. Results and discussion: We found that CARD11 mutations causing hyper-activated NF-κB signals also gave rise to compromised Treg development in the thymus, similar to the phenotype in Card11-deficient mice. This observation challenges the previous view that CARD11 regulates the Treg lineage through NF-κB activation. Mechanistic investigations reveal that the noncanonical function of CARD11, which negatively regulates the AKT/FOXO1 signal pathway, is responsible for regulating Treg generation. Moreover, primary immunodeficiency patients carrying a CARD11 mutation that autonomously activates NF-κB also showed a reduced Treg population in their peripheral blood. Our results propose a new regulatory function of CARD11 and illuminate an NF-κB-independent pathway for thymic Treg lineage commitment.

    Computationally Efficient Overmodulation Methods for Synchronous Motor Drive Systems

    This paper presents two computationally efficient methods for selecting the optimal modulated voltage that can achieve superior dynamic performance for surface-mounted permanent magnet synchronous motors (SPMSMs). Specifically, when an SPMSM undergoes a large reference or sudden load change, the controller might command a voltage reference which is beyond the range of voltages that a modulator can synthesize. In such cases, the transient behavior of the motor can deteriorate when the demanded voltage is not properly limited to the voltage boundary. To address this issue, a simple overmodulation method based on common-mode-saturation injection (CMSI) is proposed. This strategy comes with very low computational cost and can easily find the voltage vector on the boundary which is nearest to the reference voltage vector. Moreover, an alternative control method, referred to as quadratic program (QP) based deadbeat (DB) control, is proposed that also ensures optimal system performance during overmodulation. According to this strategy, the control problem is formulated as a constrained QP, which is solved with an efficient solver based on an active-set method. Finally, extensive simulative and experimental investigations for an SPMSM are presented to demonstrate the effectiveness of the proposed overmodulation methods. Accepted version. Peer reviewed

    A Dual Reference Frame Multistep Direct Model Predictive Current Control with a Disturbance Observer for SPMSM Drives

    The parameter mismatch problem has a great impact on the control performance of model predictive control, and it is unavoidable during operation. In order to improve the system robustness against parameter mismatches and disturbances, an improved direct model predictive current control with a disturbance observer is proposed in this paper, where the disturbance observer is realized by an incremental moving horizon estimator. Moreover, another concern raised by applications of direct model predictive current control is the computational burden, especially for long-horizon implementations. Therefore, a dual reference frame solution for the surface permanent magnet synchronous motor (SPMSM) is proposed in this paper to allocate a great proportion of the heavy computations required for the optimization problem to the offline preparation, which can reduce the computational burden by almost 50% on average for a prediction horizon of five time steps. Besides, the parameter mismatch effects of individual electrical parameters on the control performance of the model predictive direct current control method are investigated and quantified via simulations. A five-step direct model predictive current control is implemented on a dSPACE system with a sampling frequency of 20 kHz to validate the effectiveness of the proposed scheme with an SPMSM drive system. Accepted version. Peer reviewed