LINGO: Visually Debiasing Natural Language Instructions to Support Task Diversity
Cross-task generalization is a significant outcome that defines mastery in
natural language understanding. Humans show a remarkable aptitude for this, and
can solve many different types of tasks, given definitions in the form of
textual instructions and a small set of examples. Recent work with pre-trained
language models mimics this learning style: users can define and exemplify a
task for the model to attempt as a series of natural language prompts or
instructions. While prompting approaches have led to higher cross-task
generalization compared to traditional supervised learning, analyzing 'bias' in
the task instructions given to the model is a difficult problem, and has thus
been relatively unexplored. For instance, are we truly modeling a task, or are
we modeling a user's instructions? To help investigate this, we develop LINGO,
a novel visual analytics interface that supports an effective, task-driven
workflow to (1) help identify bias in natural language task instructions, (2)
alter (or create) task instructions to reduce bias, and (3) evaluate
pre-trained model performance on debiased task instructions. To robustly
evaluate LINGO, we conduct a user study with both novice and expert instruction
creators, over a dataset of 1,616 linguistic tasks and their natural language
instructions, spanning 55 different languages. For both user groups, LINGO
promotes the creation of tasks that are more difficult for pre-trained models
and that contain higher linguistic diversity and lower instruction bias. We additionally
discuss how the insights learned in developing and evaluating LINGO can aid in
the design of future dashboards that aim to minimize the effort involved in
prompt creation across multiple domains.
Comment: 13 pages, 6 figures, Eurovis 202