FedPC: Federated Learning for Language Generation with Personal and Context Preference Embeddings
Federated learning is a training paradigm that learns from multiple
distributed users without aggregating data on a centralized server. Such a
paradigm promises the ability to deploy machine learning at scale to a diverse
population of end-users without first collecting a large, labeled dataset for
all possible tasks. As federated learning typically averages learning updates
across a decentralized population, there is a growing need for personalization
of federated learning systems (i.e., conversational agents must be able to
personalize to a specific user's preferences). In this work, we propose a new
direction for personalization research within federated learning, leveraging
both personal embeddings and shared context embeddings. We also present an
approach to predict these "preference" embeddings, enabling personalization
without backpropagation. Compared to state-of-the-art personalization
baselines, our approach achieves a 50% improvement in test-time perplexity
using 0.001% of the memory required by baseline approaches, while achieving
greater sample- and compute-efficiency.
Comment: Andrew Silva and Pradyumna Tambwekar contributed equally towards this work.
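A minimal sketch of the federated pattern described above, in Python with hypothetical names (local_update, federated_round) and stand-in gradients rather than the authors' code: shared model parameters are averaged across clients each round, while every client keeps a private personal embedding that never leaves the device.

```python
# Sketch only: FedAvg over shared parameters, with per-client personal embeddings
# that are updated locally and never sent to the server.
import numpy as np

EMB_DIM = 16  # hypothetical embedding size

def local_update(shared_params, personal_emb, client_data, lr=0.01):
    """One round of local training on a client (gradients are stand-ins here).

    In a FedPC-style setup, both the shared parameters and the client's
    personal embedding are updated, but only the shared parameters are
    returned for aggregation.
    """
    grad_shared = np.zeros_like(shared_params)   # placeholder for a real gradient
    grad_personal = np.zeros_like(personal_emb)  # placeholder for a real gradient
    return shared_params - lr * grad_shared, personal_emb - lr * grad_personal

def federated_round(shared_params, clients):
    """One communication round: clients train locally, the server averages updates."""
    updates = []
    for client in clients:
        new_shared, client["personal_emb"] = local_update(
            shared_params, client["personal_emb"], client["data"]
        )
        updates.append(new_shared)
    # Federated averaging: the server only ever sees the shared parameters.
    return np.mean(updates, axis=0)

clients = [{"personal_emb": np.zeros(EMB_DIM), "data": None} for _ in range(4)]
shared = np.zeros(64)
for _ in range(3):
    shared = federated_round(shared, clients)
```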
Specifying and Interpreting Reinforcement Learning Policies through Simulatable Machine Learning
Human-AI collaborative policy synthesis is a procedure in which (1) a human
initializes an autonomous agent's behavior, (2) reinforcement learning improves
the human-specified behavior, and (3) the agent can explain the final optimized
policy to the user. This paradigm leverages human expertise and facilitates
greater insight into the learned behaviors of an agent. Existing approaches to
enabling collaborative policy specification involve black-box methods that are
unintelligible and not catered towards non-expert end-users. In this paper,
we develop a novel collaborative framework to enable humans to initialize and
interpret an autonomous agent's behavior, rooted in principles of
human-centered design. Through our framework, we enable humans to specify an
initial behavior model in the form of unstructured natural language, which we
then convert to lexical decision trees. Next, we leverage these
human-specified policies to warm-start reinforcement learning, allowing the
agent to further optimize them.
Finally, to close the loop on human specification, we produce explanations of
the final learned policy in multiple modalities, providing the user with a
final depiction of the agent's learned policy. We validate our approach by
showing that our model achieves >80% accuracy, and that human-initialized
policies are able to successfully warm-start RL. We then conduct a novel
human-subjects study quantifying the relative subjective and objective benefits
of varying XAI modalities (e.g., Tree, Language, and Program) for explaining
learned policies to end-users in terms of usability and interpretability, and
identify the circumstances that influence these measures. Our findings
emphasize the need for personalized explainable systems that can facilitate
user-centric policy explanations for a variety of end-users.
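As a rough illustration of the warm-starting step (hypothetical names, not the paper's framework): a lexical decision tree distilled from a language specification serves as the initial policy, and a simple value-learning loop gradually takes over as reinforcement learning refines the behavior.

```python
# Sketch only: warm-start a policy from a human-specified decision tree, then
# hand control over to learned action values as training proceeds.
import random

def tree_policy(state):
    """Hypothetical lexical decision tree, e.g. parsed from
    'if the light is red, stop; otherwise go'."""
    return "stop" if state["light"] == "red" else "go"

class WarmStartedPolicy:
    """Follows the human-specified tree at first, then gradually prefers
    actions that earned higher reward during learning."""
    def __init__(self, tree, actions, epsilon=1.0, decay=0.99):
        self.tree = tree
        self.actions = actions
        self.epsilon = epsilon          # probability of deferring to the tree
        self.decay = decay
        self.values = {}                # (state_key, action) -> running value

    def act(self, state):
        key = state["light"]
        if random.random() < self.epsilon:
            return self.tree(state)     # warm start: trust the specification
        # Otherwise pick the action with the highest learned value so far.
        return max(self.actions, key=lambda a: self.values.get((key, a), 0.0))

    def update(self, state, action, reward, lr=0.1):
        key = (state["light"], action)
        old = self.values.get(key, 0.0)
        self.values[key] = old + lr * (reward - old)
        self.epsilon *= self.decay      # shift control from the tree to RL over time

policy = WarmStartedPolicy(tree_policy, actions=["stop", "go"])
action = policy.act({"light": "red"})
policy.update({"light": "red"}, action, reward=1.0)
```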
Controllable Neural Story Plot Generation via Reinforcement Learning
Language-modeling-based approaches to story plot generation attempt to
construct a plot by sampling from a language model (LM) to predict the next
character, word, or sentence to add to the story. LM techniques lack the
ability to receive guidance from the user to achieve a specific goal, resulting
in stories that don't have a clear sense of progression and lack coherence. We
present a reward-shaping technique that analyzes a story corpus and produces
intermediate rewards that are backpropagated into a pre-trained LM in order to
guide the model towards a given goal. Automated evaluations show our technique
can create a model that generates story plots which consistently achieve a
specified goal. Human-subject studies show that the generated stories have more
plausible event ordering than baseline plot generation techniques.
Comment: Published in IJCAI 2019.
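A toy sketch of the reward-shaping idea under assumed names (distance_to_goal_rewards, policy_gradient_step), not the paper's implementation: events that tend to occur closer to the goal event in corpus stories receive larger intermediate rewards, which then scale a REINFORCE-style update of the language model.

```python
# Sketch only: corpus-derived intermediate rewards that guide an LM toward a goal.
from collections import defaultdict

def distance_to_goal_rewards(corpus, goal_event):
    """Average how many steps before the goal each event tends to appear;
    events closer to the goal receive larger shaped rewards."""
    distances = defaultdict(list)
    for story in corpus:                       # story = ordered list of event tokens
        if goal_event not in story:
            continue
        goal_idx = story.index(goal_event)
        for i, event in enumerate(story[:goal_idx]):
            distances[event].append(goal_idx - i)
    return {e: 1.0 / (sum(d) / len(d)) for e, d in distances.items()}

def policy_gradient_step(log_prob, event, rewards, baseline=0.0, lr=1e-3):
    """REINFORCE-style signal for one sampled event: the shaped reward (minus a
    baseline) scales the gradient of the LM's log-probability (stand-in here)."""
    advantage = rewards.get(event, 0.0) - baseline
    return lr * advantage * log_prob

corpus = [["meet", "argue", "fight", "admit_love"],
          ["meet", "travel", "admit_love"]]
rewards = distance_to_goal_rewards(corpus, goal_event="admit_love")
print(rewards)  # events nearer the goal ("fight", "travel") score higher
```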
A Computational Interface to Translate Strategic Intent from Unstructured Language in a Low-Data Setting
Many real-world tasks involve a mixed-initiative setup, wherein humans and AI
systems collaboratively perform a task. While significant work has been
conducted towards enabling humans to specify, through language, exactly how an
agent should complete a task (i.e., low-level specification), prior work has
largely overlooked interpreting the high-level strategic intent of human
commanders.
Parsing strategic intent from language will allow autonomous systems to
independently operate according to the user's plan without frequent guidance or
instruction. In this paper, we build a computational interface capable of
translating unstructured language strategies into actionable intent in the form
of goals and constraints. Leveraging a game environment, we collect a dataset
of over 1000 examples, mapping language strategies to the corresponding goals
and constraints, and show that our model, trained on this dataset,
significantly outperforms human interpreters in inferring strategic intent
(i.e., goals and constraints) from language (p < 0.05). Furthermore, we show
that our model (125M parameters) significantly outperforms ChatGPT for this
task (p < 0.05) in a low-data setting.
Comment: 19 pages, 7 figures, 8-page appendix.
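One way to picture the interface's output, sketched with an assumed schema (StrategicIntent, parse_strategy) and a keyword lexicon standing in for the trained 125M-parameter model: free-form strategy text is mapped to explicit goals and constraints an agent can act on.

```python
# Sketch only: mapping an unstructured language strategy to structured intent.
from dataclasses import dataclass, field

@dataclass
class StrategicIntent:
    goals: list = field(default_factory=list)        # e.g. ["capture_flag"]
    constraints: list = field(default_factory=list)  # e.g. ["avoid_region"]

# Hypothetical cue lexicon standing in for the learned language-to-intent model.
GOAL_CUES = {"capture": "capture_flag", "defend": "defend_base"}
CONSTRAINT_CUES = {"avoid": "avoid_region", "never cross": "no_crossing"}

def parse_strategy(text: str) -> StrategicIntent:
    """Toy stand-in for the trained model: scan the strategy text for cue
    phrases and emit the corresponding goals and constraints."""
    intent = StrategicIntent()
    lowered = text.lower()
    for cue, goal in GOAL_CUES.items():
        if cue in lowered:
            intent.goals.append(goal)
    for cue, constraint in CONSTRAINT_CUES.items():
        if cue in lowered:
            intent.constraints.append(constraint)
    return intent

print(parse_strategy("Capture the enemy flag but avoid the northern bridge."))
# StrategicIntent(goals=['capture_flag'], constraints=['avoid_region'])
```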
Learning to Generate Natural Language Rationales for Game Playing Agents
Many computer games feature non-player character (NPC) teammates and
companions; however, playing with or against NPCs can be frustrating when they
perform unexpectedly. These frustrations can be avoided if the NPC has the
ability to explain its actions and motivations. When NPC behavior is controlled
by a black-box AI system, it can be hard to generate the necessary
explanations. In this paper, we present a system that generates human-like,
natural language explanations, called rationales, of an agent's actions in a
game environment regardless of how the decisions are made by a black-box AI. We
outline a robust data collection and neural network training pipeline that can
be used to gather think-aloud data and train a rationale generation model for
any similar sequential, turn-based decision-making task. A human-subject study
shows that our technique produces believable rationales for an agent playing
the game Frogger. We conclude with insights about how people perceive
automatically generated rationales.
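To illustrate the kind of data pipeline the abstract describes, here is a small sketch with hypothetical field names (not the authors' code): think-aloud utterances are paired with the serialized state and action they explain, yielding source-target pairs for training an encoder-decoder rationale generator.

```python
# Sketch only: turning think-aloud play sessions into rationale-generation data.
from dataclasses import dataclass

@dataclass
class Step:
    state: str        # serialized game state, e.g. a Frogger grid encoding
    action: str       # action the player took, e.g. "move_up"
    utterance: str    # what the player said aloud while acting

def build_training_pairs(session):
    """Turn one annotated play session into (input, target) pairs: the input is
    the serialized state plus action, the target is the spoken rationale."""
    pairs = []
    for step in session:
        source = f"{step.state} [ACTION] {step.action}"
        pairs.append((source, step.utterance))
    return pairs

session = [
    Step("frog@(4,2) car@(4,5)", "move_up",
         "I moved up because the car is still far away."),
]
for src, tgt in build_training_pairs(session):
    print(src, "->", tgt)
# These pairs would then train an encoder-decoder model to generate rationales
# for unseen state-action inputs, regardless of how the black-box agent chose them.
```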