133 research outputs found
Word Embedding based Correlation Model for Question/Answer Matching
With the development of community based question answering (Q&A) services, a
large scale of Q&A archives have been accumulated and are an important
information and knowledge resource on the web. Question and answer matching has
been attached much importance to for its ability to reuse knowledge stored in
these systems: it can be useful in enhancing user experience with recurrent
questions. In this paper, we try to improve the matching accuracy by overcoming
the lexical gap between question and answer pairs. A Word Embedding based
Correlation (WEC) model is proposed by integrating advantages of both the
translation model and word embedding, given a random pair of words, WEC can
score their co-occurrence probability in Q&A pairs and it can also leverage the
continuity and smoothness of continuous space word representation to deal with
new pairs of words that are rare in the training parallel text. An experimental
study on Yahoo! Answers dataset and Baidu Zhidao dataset shows this new
method's promising potential.Comment: 8 pages, 2 figure
Do you really follow me? Adversarial Instructions for Evaluating the Robustness of Large Language Models
Large Language Models (LLMs) have shown remarkable proficiency in following
instructions, making them valuable in customer-facing applications. However,
their impressive capabilities also raise concerns about the amplification of
risks posed by adversarial instructions, which can be injected into the model
input by third-party attackers to manipulate LLMs' original instructions and
prompt unintended actions and content. Therefore, it is crucial to understand
LLMs' ability to accurately discern which instructions to follow to ensure
their safe deployment in real-world scenarios. In this paper, we propose a
pioneering benchmark for automatically evaluating the robustness of LLMs
against adversarial instructions. The objective of this benchmark is to
quantify the extent to which LLMs are influenced by injected adversarial
instructions and assess their ability to differentiate between these
adversarial instructions and original user instructions. Through experiments
conducted with state-of-the-art instruction-following LLMs, we uncover
significant limitations in their robustness against adversarial instruction
attacks. Furthermore, our findings indicate that prevalent instruction-tuned
models are prone to being overfitted to follow any instruction phrase in the
prompt without truly understanding which instructions should be followed. This
highlights the need to address the challenge of training models to comprehend
prompts instead of merely following instruction phrases and completing the
text.Comment: Work in progres
- …