Effective Structured Prompting by Meta-Learning and Representative Verbalizer
Prompt tuning for pre-trained masked language models (MLMs) has shown promising performance on natural language processing tasks with few labeled examples. It tunes a prompt for the downstream task, and a verbalizer maps the predicted token to the label prediction. Due to the limited training
data, prompt initialization is crucial for prompt tuning. Recently,
MetaPrompting (Hou et al., 2022) uses meta-learning to learn a shared
initialization for all task-specific prompts. However, a single initialization
is insufficient to obtain good prompts for all tasks and samples when the tasks
are complex. Moreover, MetaPrompting requires tuning the whole MLM, causing a
heavy burden on computation and memory as the MLM is usually large. To address
these issues, we use a prompt pool to extract more task knowledge and construct
instance-dependent prompts via attention. We further propose a novel soft
verbalizer (RepVerb) which constructs label embedding from feature embeddings
directly. Combining meta-learning the prompt pool and RepVerb, we propose
MetaPrompter for effective structured prompting. MetaPrompter is
parameter-efficient as only the pool is required to be tuned. Experimental
results demonstrate that MetaPrompter performs better than recent state-of-the-art methods and that RepVerb outperforms existing soft verbalizers.
Comment: Accepted at ICML 2023
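As described above, RepVerb builds label embeddings directly from feature embeddings rather than learning extra verbalizer parameters. The sketch below illustrates one plausible reading of that idea, assuming mean-pooled [MASK]-position features per class and cosine-similarity scoring; the function names and these specific choices are illustrative, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def repverb_label_embeddings(mask_features, labels, num_classes):
    """Build each label embedding as the mean of the [MASK]-position feature
    embeddings of that class's labeled examples (assumes >= 1 example per class)."""
    dim = mask_features.size(-1)
    label_emb = torch.zeros(num_classes, dim)
    for c in range(num_classes):
        label_emb[c] = mask_features[labels == c].mean(dim=0)
    return label_emb

def repverb_logits(query_mask_features, label_emb):
    """Score each query by cosine similarity between its [MASK] feature and
    every label embedding; shapes (B, D) and (C, D) -> logits of shape (B, C)."""
    return F.cosine_similarity(
        query_mask_features.unsqueeze(1), label_emb.unsqueeze(0), dim=-1
    )
```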
Domain-Guided Conditional Diffusion Model for Unsupervised Domain Adaptation
Limited transferability hinders the performance of deep learning models when
applied to new application scenarios. Recently, Unsupervised Domain Adaptation
(UDA) has achieved significant progress in addressing this issue via learning
domain-invariant features. However, the performance of existing UDA methods is
constrained by the large domain shift and limited target domain data. To
alleviate these issues, we propose DomAin-guided Conditional Diffusion Model
(DACDM) to generate high-fidelity and diverse samples for the target domain. In DACDM, class information is introduced so that the labels of the generated samples can be controlled, and a domain classifier is further introduced to guide the generated samples toward the target domain. The
generated samples help existing UDA methods transfer from the source domain to
the target domain more easily, thus improving the transfer performance.
Extensive experiments on various benchmarks demonstrate that DACDM brings a
large improvement to the performance of existing UDA methods.
Comment: Work in progress
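A hedged sketch of how such domain-guided conditional generation could look, written in the style of classifier guidance: the class label conditions the noise predictor, while the gradient of a source/target domain classifier steers samples toward the target domain. The function names, the two-way classifier, and the exact guidance rule below are assumptions for illustration, not DACDM's stated formulation.

```python
import torch

def domain_guided_eps(x_t, t, eps_model, class_label, domain_clf,
                      sigma_t, guidance_scale=1.0, target_idx=1):
    """Sketch of a domain-guided reverse-diffusion step (illustrative only).

    eps_model : class-conditional noise predictor, eps_model(x_t, t, class_label)
    domain_clf: classifier over {source, target} domains, domain_clf(x_t, t)
    sigma_t   : noise-schedule term (e.g. sqrt(1 - alpha_bar_t)), assumed given
    """
    eps = eps_model(x_t, t, class_label)   # class information controls the label
    with torch.enable_grad():              # sampling loops often run under no_grad
        x_in = x_t.detach().requires_grad_(True)
        log_p_target = domain_clf(x_in, t).log_softmax(dim=-1)[:, target_idx].sum()
        grad = torch.autograd.grad(log_p_target, x_in)[0]
    # classifier-guidance-style correction: push samples toward the target domain
    return eps - guidance_scale * sigma_t * grad
```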
Neural-Dynamic Based Synchronous-Optimization Scheme of Dual Redundant Robot Manipulators
To track complex-path tasks in three-dimensional space without joint drift, a neural-dynamic-based synchronous-optimization (NDSO) scheme for dual redundant robot manipulators is proposed and developed. To this end, an acceleration-level repetitive-motion-planning optimization criterion is derived by applying the neural-dynamic method twice. Position and velocity feedback are incorporated to reduce tracking errors. Considering the joint-angle, joint-velocity, and joint-acceleration limits, the redundancy-resolution problems of the left and right arms are formulated as two quadratic programming (QP) problems subject to equality constraints and three bound constraints. The two QP problems are then integrated into a single standard QP problem with one equality constraint and one bound constraint. A linear-variational-inequalities-based primal-dual neural network (LVI-PDNN) is used as a real-time solver for this QP problem. Finally, the simulation section reports the execution of three complex tasks, including a coupled task, a comparison with the pseudo-inverse method, and a robustness verification. Simulation results verify the efficacy and accuracy of the proposed NDSO scheme.
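In abstract form, the combined scheme amounts to one standard QP at the acceleration level. The following is a generic sketch with placeholder symbols (W, c, J, b, and the bounds are not the paper's notation): the decision variable stacks the joint accelerations of both arms, the equality constraint encodes the end-effector tracking tasks, and the bound constraint folds the joint-angle, joint-velocity, and joint-acceleration limits into a single acceleration-level bound.

```latex
\begin{aligned}
\min_{\ddot{\boldsymbol{\theta}}} \quad & \tfrac{1}{2}\,\ddot{\boldsymbol{\theta}}^{\top} W \ddot{\boldsymbol{\theta}} + c^{\top}\ddot{\boldsymbol{\theta}} \\
\text{s.t.} \quad & J\,\ddot{\boldsymbol{\theta}} = b, \\
& \xi^{-} \le \ddot{\boldsymbol{\theta}} \le \xi^{+},
\end{aligned}
```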
BYOM: Building Your Own Multi-Task Model For Free
Recently, various merging methods have been proposed to build a multi-task
model from task-specific finetuned models without retraining. However, existing
methods suffer from a large performance deterioration compared to using
multiple task-specific models. In this paper, we propose to inject
task-specific knowledge into the merged model and design two
parameter-efficient approaches (BYOM-FFT and BYOM-LoRA) to Build Your Own
Multi-task model. BYOM-FFT is for merging fully finetuned models, while
BYOM-LoRA is for LoRA-finetuned models. Both methods are data-free and
computation-efficient. Extensive experiments on computer vision and natural
language processing tasks show that the proposed BYOM methods outperform
existing merging methods by a large margin. Moreover, BYOM-FFT is general and
can be integrated into existing merging methods to further boost performance.
Comment: Technical Report
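As an illustration only, the sketch below combines a task-arithmetic-style merge with a per-task residual that is added back to "inject" task-specific knowledge. The exact BYOM-FFT/BYOM-LoRA procedures, in particular how the residual is kept compact, are not spelled out in this abstract, so the function names and the uncompressed residual here are assumptions.

```python
import torch

def merge_then_inject(pretrained, finetuned, task_id, alpha=1.0):
    """Illustrative sketch (not the exact BYOM procedure): merge task-specific
    models by averaging their task vectors, then inject knowledge for one task
    by adding back that task's residual w.r.t. the merged weights.

    pretrained : state_dict of the common pre-trained model
    finetuned  : list of state_dicts, one per task-specific finetuned model
    """
    merged = {}
    for name, w0 in pretrained.items():
        task_vectors = [ft[name] - w0 for ft in finetuned]   # task arithmetic
        merged[name] = w0 + alpha * torch.stack(task_vectors).mean(dim=0)

    # Task-specific residual; a parameter-efficient variant would store this in
    # a compact form (e.g. low-rank factors for LoRA-finetuned models).
    residual = {name: finetuned[task_id][name] - merged[name] for name in merged}
    injected = {name: merged[name] + residual[name] for name in merged}
    return merged, injected
```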
Backward Reasoning in Large Language Models for Verification
Chain-of-Thought (CoT) prompting has shown promising performance in various reasoning tasks. Recently, Self-Consistency \citep{wang2023selfconsistency} proposed sampling a diverse set of reasoning chains, which may lead to different answers, and selecting the answer that receives the most votes. In this paper, we propose a novel method that uses backward reasoning to verify candidate answers. We mask a token in the question by ${\bf x}$ and ask the LLM to predict the masked token when a candidate answer is provided by \textit{a simple template}, i.e., ``\textit{\textbf{If we know the answer of the above question is \{a candidate answer\}, what is the value of unknown variable ${\bf x}$?}}'' Intuitively, the LLM is expected to predict the masked token
successfully if the provided candidate answer is correct. We further propose
FOBAR to combine forward and backward reasoning for estimating the probability
of candidate answers. We conduct extensive experiments on six data sets and
three LLMs. Experimental results demonstrate that FOBAR achieves
state-of-the-art performance on various reasoning benchmarks.
Comment: Preprint
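A minimal sketch of combining forward and backward evidence when ranking candidate answers: forward evidence comes from vote counts over sampled CoT chains, backward evidence from how often the LLM recovers the masked token given each candidate. The geometric interpolation and the alpha weight are assumptions for illustration, not FOBAR's exact combination rule.

```python
from collections import Counter

def combine_forward_backward(forward_answers, backward_correct, alpha=0.5):
    """forward_answers  : list of answers from sampled forward (CoT) chains
    backward_correct : dict mapping candidate answer -> fraction of backward
                       queries in which the LLM recovered the masked token
    Returns the candidate answer with the highest combined score."""
    counts = Counter(forward_answers)
    total = sum(counts.values())
    scores = {}
    for cand, cnt in counts.items():
        p_forward = cnt / total
        p_backward = backward_correct.get(cand, 0.0)
        # geometric-style interpolation of forward and backward evidence
        scores[cand] = (p_forward ** alpha) * (p_backward ** (1 - alpha))
    return max(scores, key=scores.get)
```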