FedPC: Federated Learning for Language Generation with Personal and Context Preference Embeddings
Federated learning is a training paradigm that learns from multiple
distributed users without aggregating data on a centralized server. Such a
paradigm promises the ability to deploy machine learning at scale to a diverse
population of end-users without first collecting a large, labeled dataset for
all possible tasks. As federated learning typically averages learning updates
across a decentralized population, there is a growing need for personalization
of federated learning systems (i.e., conversational agents must be able to
personalize to a specific user's preferences). In this work, we propose a new
direction for personalization research within federated learning, leveraging
both personal embeddings and shared context embeddings. We also present an
approach to predict these "preference" embeddings, enabling personalization
without backpropagation. Compared to state-of-the-art personalization
baselines, our approach achieves a 50% improvement in test-time perplexity
using 0.001% of the memory required by baseline approaches, while achieving
greater sample- and compute-efficiency.
Comment: Andrew Silva and Pradyumna Tambwekar contributed equally towards this work.
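A minimal sketch of the federated pattern described above, in Python with hypothetical names (local_update, federated_round) and stand-in gradients rather than the authors' code: shared model parameters are averaged across clients each round, while every client keeps a private personal embedding that never leaves the device.

```python
# Sketch only: FedAvg over shared parameters, with per-client personal embeddings
# that are updated locally and never sent to the server.
import numpy as np

EMB_DIM = 16  # hypothetical embedding size

def local_update(shared_params, personal_emb, client_data, lr=0.01):
    """One round of local training on a client (gradients are stand-ins here).

    In a FedPC-style setup, both the shared parameters and the client's
    personal embedding are updated, but only the shared parameters are
    returned for aggregation.
    """
    grad_shared = np.zeros_like(shared_params)   # placeholder for a real gradient
    grad_personal = np.zeros_like(personal_emb)  # placeholder for a real gradient
    return shared_params - lr * grad_shared, personal_emb - lr * grad_personal

def federated_round(shared_params, clients):
    """One communication round: clients train locally, the server averages updates."""
    updates = []
    for client in clients:
        new_shared, client["personal_emb"] = local_update(
            shared_params, client["personal_emb"], client["data"]
        )
        updates.append(new_shared)
    # Federated averaging: the server only ever sees the shared parameters.
    return np.mean(updates, axis=0)

clients = [{"personal_emb": np.zeros(EMB_DIM), "data": None} for _ in range(4)]
shared = np.zeros(64)
for _ in range(3):
    shared = federated_round(shared, clients)
```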
Specifying and Interpreting Reinforcement Learning Policies through Simulatable Machine Learning
Human-AI collaborative policy synthesis is a procedure in which (1) a human
initializes an autonomous agent's behavior, (2) reinforcement learning improves
the human-specified behavior, and (3) the agent can explain the final optimized
policy to the user. This paradigm leverages human expertise and facilitates
greater insight into the learned behaviors of an agent. Existing approaches to
enabling collaborative policy specification involve black-box methods that are
unintelligible and not catered towards non-expert end-users. In this paper,
we develop a novel collaborative framework to enable humans to initialize and
interpret an autonomous agent's behavior, rooted in principles of
human-centered design. Through our framework, we enable humans to specify an
initial behavior model in the form of unstructured natural language, which we
then convert to lexical decision trees. Next, we leverage these
human-specified policies to warm-start reinforcement learning, allowing the
agent to further optimize them.
Finally, to close the loop on human specification, we produce explanations of
the final learned policy in multiple modalities, providing the user with a
final depiction of the agent's learned policy. We validate our approach by
showing that our model achieves >80% accuracy, and that human-initialized
policies are able to successfully warm-start RL. We then conduct a novel
human-subjects study quantifying the relative subjective and objective benefits
of varying XAI modalities (e.g., Tree, Language, and Program) for explaining
learned policies to end-users in terms of usability and interpretability, and
identify the circumstances that influence these measures. Our findings
emphasize the need for personalized explainable systems that can facilitate
user-centric policy explanations for a variety of end-users.
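As a rough illustration of the warm-starting step (hypothetical names, not the paper's framework): a lexical decision tree distilled from a language specification serves as the initial policy, and a simple value-learning loop gradually takes over as reinforcement learning refines the behavior.

```python
# Sketch only: warm-start a policy from a human-specified decision tree, then
# hand control over to learned action values as training proceeds.
import random

def tree_policy(state):
    """Hypothetical lexical decision tree, e.g. parsed from
    'if the light is red, stop; otherwise go'."""
    return "stop" if state["light"] == "red" else "go"

class WarmStartedPolicy:
    """Follows the human-specified tree at first, then gradually prefers
    actions that earned higher reward during learning."""
    def __init__(self, tree, actions, epsilon=1.0, decay=0.99):
        self.tree = tree
        self.actions = actions
        self.epsilon = epsilon          # probability of deferring to the tree
        self.decay = decay
        self.values = {}                # (state_key, action) -> running value

    def act(self, state):
        key = state["light"]
        if random.random() < self.epsilon:
            return self.tree(state)     # warm start: trust the specification
        # Otherwise pick the action with the highest learned value so far.
        return max(self.actions, key=lambda a: self.values.get((key, a), 0.0))

    def update(self, state, action, reward, lr=0.1):
        key = (state["light"], action)
        old = self.values.get(key, 0.0)
        self.values[key] = old + lr * (reward - old)
        self.epsilon *= self.decay      # shift control from the tree to RL over time

policy = WarmStartedPolicy(tree_policy, actions=["stop", "go"])
action = policy.act({"light": "red"})
policy.update({"light": "red"}, action, reward=1.0)
```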
Controllable Neural Story Plot Generation via Reinforcement Learning
Language-modeling-based approaches to story plot generation attempt to
construct a plot by sampling from a language model (LM) to predict the next
character, word, or sentence to add to the story. LM techniques lack the
ability to receive guidance from the user to achieve a specific goal, resulting
in stories that don't have a clear sense of progression and lack coherence. We
present a reward-shaping technique that analyzes a story corpus and produces
intermediate rewards that are backpropagated into a pre-trained LM in order to
guide the model towards a given goal. Automated evaluations show our technique
can create a model that generates story plots which consistently achieve a
specified goal. Human-subject studies show that the generated stories have more
plausible event ordering than baseline plot generation techniques.
Comment: Published in IJCAI 2019.
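A toy sketch of the reward-shaping idea under assumed names (distance_to_goal_rewards, policy_gradient_step), not the paper's implementation: events that tend to occur closer to the goal event in corpus stories receive larger intermediate rewards, which then scale a REINFORCE-style update of the language model.

```python
# Sketch only: corpus-derived intermediate rewards that guide an LM toward a goal.
from collections import defaultdict

def distance_to_goal_rewards(corpus, goal_event):
    """Average how many steps before the goal each event tends to appear;
    events closer to the goal receive larger shaped rewards."""
    distances = defaultdict(list)
    for story in corpus:                       # story = ordered list of event tokens
        if goal_event not in story:
            continue
        goal_idx = story.index(goal_event)
        for i, event in enumerate(story[:goal_idx]):
            distances[event].append(goal_idx - i)
    return {e: 1.0 / (sum(d) / len(d)) for e, d in distances.items()}

def policy_gradient_step(log_prob, event, rewards, baseline=0.0, lr=1e-3):
    """REINFORCE-style signal for one sampled event: the shaped reward (minus a
    baseline) scales the gradient of the LM's log-probability (stand-in here)."""
    advantage = rewards.get(event, 0.0) - baseline
    return lr * advantage * log_prob

corpus = [["meet", "argue", "fight", "admit_love"],
          ["meet", "travel", "admit_love"]]
rewards = distance_to_goal_rewards(corpus, goal_event="admit_love")
print(rewards)  # events nearer the goal ("fight", "travel") score higher
```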
A Computational Interface to Translate Strategic Intent from Unstructured Language in a Low-Data Setting
Many real-world tasks involve a mixed-initiative setup, wherein humans and AI
systems collaboratively perform a task. While significant work has been
conducted towards enabling humans to specify, through language, exactly how an
agent should complete a task (i.e., low-level specification), prior work has
largely overlooked interpreting the high-level strategic intent of human
commanders.
Parsing strategic intent from language will allow autonomous systems to
independently operate according to the user's plan without frequent guidance or
instruction. In this paper, we build a computational interface capable of
translating unstructured language strategies into actionable intent in the form
of goals and constraints. Leveraging a game environment, we collect a dataset
of over 1000 examples, mapping language strategies to the corresponding goals
and constraints, and show that our model, trained on this dataset,
significantly outperforms human interpreters in inferring strategic intent
(i.e., goals and constraints) from language (p < 0.05). Furthermore, we show
that our model (125M parameters) significantly outperforms ChatGPT for this
task (p < 0.05) in a low-data setting.
Comment: 19 pages, 7 figures, 8-page appendix.
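One way to picture the interface's output, sketched with an assumed schema (StrategicIntent, parse_strategy) and a keyword lexicon standing in for the trained 125M-parameter model: free-form strategy text is mapped to explicit goals and constraints an agent can act on.

```python
# Sketch only: mapping an unstructured language strategy to structured intent.
from dataclasses import dataclass, field

@dataclass
class StrategicIntent:
    goals: list = field(default_factory=list)        # e.g. ["capture_flag"]
    constraints: list = field(default_factory=list)  # e.g. ["avoid_region"]

# Hypothetical cue lexicon standing in for the learned language-to-intent model.
GOAL_CUES = {"capture": "capture_flag", "defend": "defend_base"}
CONSTRAINT_CUES = {"avoid": "avoid_region", "never cross": "no_crossing"}

def parse_strategy(text: str) -> StrategicIntent:
    """Toy stand-in for the trained model: scan the strategy text for cue
    phrases and emit the corresponding goals and constraints."""
    intent = StrategicIntent()
    lowered = text.lower()
    for cue, goal in GOAL_CUES.items():
        if cue in lowered:
            intent.goals.append(goal)
    for cue, constraint in CONSTRAINT_CUES.items():
        if cue in lowered:
            intent.constraints.append(constraint)
    return intent

print(parse_strategy("Capture the enemy flag but avoid the northern bridge."))
# StrategicIntent(goals=['capture_flag'], constraints=['avoid_region'])
```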
Learning to Generate Natural Language Rationales for Game Playing Agents
Many computer games feature non-player character (NPC) teammates and
companions; however, playing with or against NPCs can be frustrating when they
perform unexpectedly. These frustrations can be avoided if the NPC has the
ability to explain its actions and motivations. When NPC behavior is controlled
by a black-box AI system, it can be hard to generate the necessary
explanations. In this paper, we present a system that generates human-like,
natural language explanations, called rationales, of an agent's actions in a
game environment regardless of how the decisions are made by a black-box AI. We
outline a robust data collection and neural network training pipeline that can
be used to gather think-aloud data and train a rationale generation model for
any similar sequential, turn-based decision-making task. A human-subject study
shows that our technique produces believable rationales for an agent playing
the game Frogger. We conclude with insights about how people perceive
automatically generated rationales.
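To illustrate the kind of data pipeline the abstract describes, here is a small sketch with hypothetical field names (not the authors' code): think-aloud utterances are paired with the serialized state and action they explain, yielding source-target pairs for training an encoder-decoder rationale generator.

```python
# Sketch only: turning think-aloud play sessions into rationale-generation data.
from dataclasses import dataclass

@dataclass
class Step:
    state: str        # serialized game state, e.g. a Frogger grid encoding
    action: str       # action the player took, e.g. "move_up"
    utterance: str    # what the player said aloud while acting

def build_training_pairs(session):
    """Turn one annotated play session into (input, target) pairs: the input is
    the serialized state plus action, the target is the spoken rationale."""
    pairs = []
    for step in session:
        source = f"{step.state} [ACTION] {step.action}"
        pairs.append((source, step.utterance))
    return pairs

session = [
    Step("frog@(4,2) car@(4,5)", "move_up",
         "I moved up because the car is still far away."),
]
for src, tgt in build_training_pairs(session):
    print(src, "->", tgt)
# These pairs would then train an encoder-decoder model to generate rationales
# for unseen state-action inputs, regardless of how the black-box agent chose them.
```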