Effective Structured Prompting by Meta-Learning and Representative Verbalizer
Prompt tuning for pre-trained masked language models (MLMs) has shown promising performance on natural language processing tasks with few labeled examples. It tunes a prompt for the downstream task, and a verbalizer maps the predicted token to the label prediction. Due to the limited training
data, prompt initialization is crucial for prompt tuning. Recently,
MetaPrompting (Hou et al., 2022) uses meta-learning to learn a shared
initialization for all task-specific prompts. However, a single initialization
is insufficient to obtain good prompts for all tasks and samples when the tasks
are complex. Moreover, MetaPrompting requires tuning the whole MLM, causing a
heavy burden on computation and memory as the MLM is usually large. To address
these issues, we use a prompt pool to extract more task knowledge and construct
instance-dependent prompts via attention. We further propose a novel soft
verbalizer (RepVerb) which constructs label embedding from feature embeddings
directly. Combining meta-learning the prompt pool and RepVerb, we propose
MetaPrompter for effective structured prompting. MetaPrompter is
parameter-efficient as only the pool is required to be tuned. Experimental
results demonstrate that MetaPrompter performs better than recent state-of-the-art methods and that RepVerb outperforms existing soft verbalizers.
Comment: Accepted at ICML 2023
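As described above, RepVerb builds label embeddings directly from feature embeddings rather than learning extra verbalizer parameters. The sketch below illustrates one plausible reading of that idea, assuming mean-pooled [MASK]-position features per class and cosine-similarity scoring; the function names and these specific choices are illustrative, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def repverb_label_embeddings(mask_features, labels, num_classes):
    """Build each label embedding as the mean of the [MASK]-position feature
    embeddings of that class's labeled examples (assumes >= 1 example per class)."""
    dim = mask_features.size(-1)
    label_emb = torch.zeros(num_classes, dim)
    for c in range(num_classes):
        label_emb[c] = mask_features[labels == c].mean(dim=0)
    return label_emb

def repverb_logits(query_mask_features, label_emb):
    """Score each query by cosine similarity between its [MASK] feature and
    every label embedding; shapes (B, D) and (C, D) -> logits of shape (B, C)."""
    return F.cosine_similarity(
        query_mask_features.unsqueeze(1), label_emb.unsqueeze(0), dim=-1
    )
```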
Domain-Guided Conditional Diffusion Model for Unsupervised Domain Adaptation
Limited transferability hinders the performance of deep learning models when
applied to new application scenarios. Recently, Unsupervised Domain Adaptation
(UDA) has achieved significant progress in addressing this issue via learning
domain-invariant features. However, the performance of existing UDA methods is
constrained by the large domain shift and limited target domain data. To
alleviate these issues, we propose DomAin-guided Conditional Diffusion Model
(DACDM) to generate high-fidelity and diverse samples for the target domain. In DACDM, class information is introduced so that the labels of the generated samples can be controlled, and a domain classifier is further introduced to guide the generated samples toward the target domain. The
generated samples help existing UDA methods transfer from the source domain to
the target domain more easily, thus improving the transfer performance.
Extensive experiments on various benchmarks demonstrate that DACDM brings a
large improvement to the performance of existing UDA methods.
Comment: Work in progress
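A hedged sketch of how such domain-guided conditional generation could look, written in the style of classifier guidance: the class label conditions the noise predictor, while the gradient of a source/target domain classifier steers samples toward the target domain. The function names, the two-way classifier, and the exact guidance rule below are assumptions for illustration, not DACDM's stated formulation.

```python
import torch

def domain_guided_eps(x_t, t, eps_model, class_label, domain_clf,
                      sigma_t, guidance_scale=1.0, target_idx=1):
    """Sketch of a domain-guided reverse-diffusion step (illustrative only).

    eps_model : class-conditional noise predictor, eps_model(x_t, t, class_label)
    domain_clf: classifier over {source, target} domains, domain_clf(x_t, t)
    sigma_t   : noise-schedule term (e.g. sqrt(1 - alpha_bar_t)), assumed given
    """
    eps = eps_model(x_t, t, class_label)   # class information controls the label
    with torch.enable_grad():              # sampling loops often run under no_grad
        x_in = x_t.detach().requires_grad_(True)
        log_p_target = domain_clf(x_in, t).log_softmax(dim=-1)[:, target_idx].sum()
        grad = torch.autograd.grad(log_p_target, x_in)[0]
    # classifier-guidance-style correction: push samples toward the target domain
    return eps - guidance_scale * sigma_t * grad
```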
Neural-Dynamic Based Synchronous-Optimization Scheme of Dual Redundant Robot Manipulators
To track complex-path tasks in three-dimensional space without joint drift, a neural-dynamic-based synchronous-optimization (NDSO) scheme for dual redundant robot manipulators is proposed and developed. To this end, an acceleration-level repetitive-motion-planning optimization criterion is derived by applying the neural-dynamic method twice. Position and velocity feedback are incorporated to reduce tracking errors. Considering the joint-angle, joint-velocity, and joint-acceleration limits, the redundancy-resolution problems of the left and right arms are formulated as two quadratic programming (QP) problems subject to equality constraints and three bound constraints. The two QP problems are then integrated into a single standard QP problem with one equality constraint and one bound constraint. A linear-variational-inequalities-based primal-dual neural network (LVI-PDNN) is used as a real-time solver for this QP problem. Finally, the simulation section reports the execution of three complex tasks, including a coupled task, a comparison with the pseudo-inverse method, and a robustness verification. Simulation results verify the efficacy and accuracy of the proposed NDSO scheme.
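In abstract form, the combined scheme amounts to one standard QP at the acceleration level. The following is a generic sketch with placeholder symbols (W, c, J, b, and the bounds are not the paper's notation): the decision variable stacks the joint accelerations of both arms, the equality constraint encodes the end-effector tracking tasks, and the bound constraint folds the joint-angle, joint-velocity, and joint-acceleration limits into a single acceleration-level bound.

```latex
\begin{aligned}
\min_{\ddot{\boldsymbol{\theta}}} \quad & \tfrac{1}{2}\,\ddot{\boldsymbol{\theta}}^{\top} W \ddot{\boldsymbol{\theta}} + c^{\top}\ddot{\boldsymbol{\theta}} \\
\text{s.t.} \quad & J\,\ddot{\boldsymbol{\theta}} = b, \\
& \xi^{-} \le \ddot{\boldsymbol{\theta}} \le \xi^{+},
\end{aligned}
```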
BYOM: Building Your Own Multi-Task Model For Free
Recently, various merging methods have been proposed to build a multi-task
model from task-specific finetuned models without retraining. However, existing
methods suffer from a large performance deterioration compared to using
multiple task-specific models. In this paper, we propose to inject
task-specific knowledge into the merged model and design two
parameter-efficient approaches (BYOM-FFT and BYOM-LoRA) to Build Your Own
Multi-task model. BYOM-FFT is for merging fully finetuned models, while
BYOM-LoRA is for LoRA-finetuned models. Both methods are data-free and
computation-efficient. Extensive experiments on computer vision and natural
language processing tasks show that the proposed BYOM methods outperform
existing merging methods by a large margin. Moreover, BYOM-FFT is general and
can be integrated into existing merging methods to further boost performance.
Comment: Technical Report
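As an illustration only, the sketch below combines a task-arithmetic-style merge with a per-task residual that is added back to "inject" task-specific knowledge. The exact BYOM-FFT/BYOM-LoRA procedures, in particular how the residual is kept compact, are not spelled out in this abstract, so the function names and the uncompressed residual here are assumptions.

```python
import torch

def merge_then_inject(pretrained, finetuned, task_id, alpha=1.0):
    """Illustrative sketch (not the exact BYOM procedure): merge task-specific
    models by averaging their task vectors, then inject knowledge for one task
    by adding back that task's residual w.r.t. the merged weights.

    pretrained : state_dict of the common pre-trained model
    finetuned  : list of state_dicts, one per task-specific finetuned model
    """
    merged = {}
    for name, w0 in pretrained.items():
        task_vectors = [ft[name] - w0 for ft in finetuned]   # task arithmetic
        merged[name] = w0 + alpha * torch.stack(task_vectors).mean(dim=0)

    # Task-specific residual; a parameter-efficient variant would store this in
    # a compact form (e.g. low-rank factors for LoRA-finetuned models).
    residual = {name: finetuned[task_id][name] - merged[name] for name in merged}
    injected = {name: merged[name] + residual[name] for name in merged}
    return merged, injected
```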
Backward Reasoning in Large Language Models for Verification
Chain-of-Thought (CoT) prompting has shown promising performance in various reasoning tasks. Recently, Self-Consistency \citep{wang2023selfconsistency} proposed sampling a diverse set of reasoning chains, which may lead to different answers, and selecting the answer that receives the most votes. In this paper, we propose a novel method that uses backward reasoning to verify candidate answers. We mask a token in the question by ${\bf x}$ and ask the LLM to predict the masked token when a candidate answer is provided by \textit{a simple template}, i.e., ``\textit{\textbf{If we know the answer of the above question is \{a candidate answer\}, what is the value of unknown variable ${\bf x}$?}}'' Intuitively, the LLM is expected to predict the masked token
successfully if the provided candidate answer is correct. We further propose
FOBAR to combine forward and backward reasoning for estimating the probability
of candidate answers. We conduct extensive experiments on six data sets and
three LLMs. Experimental results demonstrate that FOBAR achieves
state-of-the-art performance on various reasoning benchmarks.
Comment: Preprint
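A minimal sketch of combining forward and backward evidence when ranking candidate answers: forward evidence comes from vote counts over sampled CoT chains, backward evidence from how often the LLM recovers the masked token given each candidate. The geometric interpolation and the alpha weight are assumptions for illustration, not FOBAR's exact combination rule.

```python
from collections import Counter

def combine_forward_backward(forward_answers, backward_correct, alpha=0.5):
    """forward_answers  : list of answers from sampled forward (CoT) chains
    backward_correct : dict mapping candidate answer -> fraction of backward
                       queries in which the LLM recovered the masked token
    Returns the candidate answer with the highest combined score."""
    counts = Counter(forward_answers)
    total = sum(counts.values())
    scores = {}
    for cand, cnt in counts.items():
        p_forward = cnt / total
        p_backward = backward_correct.get(cand, 0.0)
        # geometric-style interpolation of forward and backward evidence
        scores[cand] = (p_forward ** alpha) * (p_backward ** (1 - alpha))
    return max(scores, key=scores.get)
```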