
    FedPC: Federated Learning for Language Generation with Personal and Context Preference Embeddings

    Federated learning is a training paradigm that learns from multiple distributed users without aggregating data on a centralized server. Such a paradigm promises the ability to deploy machine learning at scale to a diverse population of end-users without first collecting a large, labeled dataset for all possible tasks. Because federated learning typically averages learning updates across a decentralized population, there is a growing need for personalization of federated learning systems (i.e., conversational agents must be able to personalize to a specific user's preferences). In this work, we propose a new direction for personalization research within federated learning, leveraging both personal embeddings and shared context embeddings. We also present an approach to predict these "preference" embeddings, enabling personalization without backpropagation. Compared to state-of-the-art personalization baselines, our approach achieves a 50% improvement in test-time perplexity while using 0.001% of the memory required by baseline approaches, and achieves greater sample- and compute-efficiency.

    Comment: Andrew Silva and Pradyumna Tambwekar contributed equally towards this work.
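    To make the abstract's central idea concrete, here is a minimal Python (PyTorch) sketch of personalization via preference embeddings: a decoder conditions on the concatenation of a per-user personal embedding and a shared context embedding, and a frozen predictor maps a user's history to the personal embedding, so no on-device backpropagation is needed. All module names, dimensions, and the GRU decoder are illustrative assumptions, not the paper's architecture.

    import torch
    import torch.nn as nn

    class PreferenceConditionedLM(nn.Module):
        """Toy decoder conditioning next-token logits on a personal embedding
        plus a shared context embedding. Sizes are illustrative only."""
        def __init__(self, vocab_size=1000, d_model=64, d_pref=16):
            super().__init__()
            self.tok_emb = nn.Embedding(vocab_size, d_model)
            self.rnn = nn.GRU(d_model + 2 * d_pref, d_model, batch_first=True)
            self.head = nn.Linear(d_model, vocab_size)

        def forward(self, tokens, personal_emb, context_emb):
            # tokens: (B, T); personal_emb / context_emb: (B, d_pref)
            B, T = tokens.shape
            prefs = torch.cat([personal_emb, context_emb], dim=-1)  # (B, 2*d_pref)
            prefs = prefs.unsqueeze(1).expand(B, T, -1)             # broadcast over time
            x = torch.cat([self.tok_emb(tokens), prefs], dim=-1)
            h, _ = self.rnn(x)
            return self.head(h)                                     # (B, T, vocab)

    class PreferencePredictor(nn.Module):
        """Backprop-free personalization: a frozen projection maps pooled
        utterance-history embeddings to a preference embedding, so the
        device only runs a forward pass (a hypothetical stand-in for the
        paper's predictor)."""
        def __init__(self, d_model=64, d_pref=16):
            super().__init__()
            self.proj = nn.Linear(d_model, d_pref)

        @torch.no_grad()
        def forward(self, utterance_embs):  # (B, N, d_model) pooled history
            return self.proj(utterance_embs.mean(dim=1))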

    Learning task requirements for coalition formation in heterogeneous multi-agent systems

    Existing approaches to coalition formation assume that the requirements associated with tasks are precisely specified by a human operator. However, prior work in psychology has demonstrated that humans, while extremely adept at solving complex problems, struggle to explicitly state their solution strategies. This thesis contributes two frameworks that learn implicit task requirements directly from expert demonstrations of coalition formation.

    The first framework accounts for the fact that demonstrators may allocate different, equally valid coalitions to the same task; we assume that each such coalition results in optimal task performance. Essentially, we contribute a framework to model and infer such heterogeneous coalition-formation strategies. Our framework includes a resource-aware approach to generalize the inferred strategies to new teams without requiring additional training. To this end, we formulate and solve a constrained optimization problem that simultaneously selects the most appropriate strategy for a given target team and optimizes the constituents of its coalitions accordingly. We evaluate our approach against several baselines, including some that resemble existing approaches, using detailed numerical simulations, StarCraft II battles, and a multi-robot emergency-response scenario. Our results indicate that our framework consistently outperforms the baselines in terms of requirement satisfaction, resource utilization, and task success rates.

    The second framework relaxes the typical assumption that the available demonstrations are optimal and incorporates interactive learning. Prior work in Learning from Demonstration (LfD) depends on high-quality demonstrations and generalizes poorly; moreover, LfD approaches only perform as well as the best example in the demonstration set. Departing from prior work, we assume access to sub-optimal demonstrations together with evaluations of the assigned teams in the form of task-wise scores. To learn effectively from sub-optimal demonstrations, our framework infers the underlying reward function and then generates coalitions by optimizing the inferred reward. In addition to learning from sub-optimal demonstrations, we utilize interactions with the environment to fine-tune the reward distribution and obtain an estimate of the task requirements. Specifically, we develop a bandit-based approach that handles continuous action spaces and can be bootstrapped with sub-optimal demonstrations. We evaluate our approach against baselines inspired by prior work using detailed numerical simulations. Our results show that our approach, which combines passive and interactive learning, achieves higher task performance when generalizing to new teams than baselines that resemble imitation learning or that use passive or interactive learning in isolation.
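    As a rough illustration of resource-aware coalition formation against inferred requirements, here is a minimal Python sketch: given a capability matrix for agents and an (inferred) per-trait requirement vector for a task, a greedy rule adds the agent that covers the most unmet requirement until the task is satisfied. The greedy rule and scoring are illustrative assumptions, not the thesis's constrained optimizer.

    import numpy as np

    def form_coalition(agent_caps, task_req):
        """Greedy coalition formation sketch.
        agent_caps: (n_agents, n_traits) capability matrix.
        task_req:   (n_traits,) requirements inferred from demonstrations."""
        remaining = task_req.astype(float).copy()
        available = set(range(len(agent_caps)))
        coalition = []
        while remaining.max() > 0 and available:
            # Score each free agent by how much unmet requirement it covers.
            best = max(available,
                       key=lambda i: np.minimum(agent_caps[i], remaining).sum())
            gain = np.minimum(agent_caps[best], remaining).sum()
            if gain <= 0:  # no remaining agent reduces the deficit
                break
            coalition.append(best)
            available.remove(best)
            remaining = np.maximum(remaining - agent_caps[best], 0)
        return coalition, remaining  # any remaining > 0 means requirements unmet

    # Example: three traits (e.g., speed, payload, sensing) and three agents.
    agents = np.array([[2, 0, 1],
                       [0, 3, 0],
                       [1, 1, 2]])
    team, deficit = form_coalition(agents, np.array([2, 2, 2]))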
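    For the second framework, a continuous-armed bandit bootstrapped by sub-optimal demonstrations might look like the following sketch: the search distribution over the requirement/action vector is warm-started from the demonstrations, then refined using task-wise scores from the environment. The Gaussian-perturbation update rule and all names here are assumptions for illustration, not the thesis's algorithm.

    import numpy as np

    def bandit_finetune(reward_fn, demos, n_rounds=200, sigma=0.3, rng=None):
        """Warm-start the action estimate from demonstrations, then refine it
        with noisy exploration guided by environment scores."""
        rng = rng if rng is not None else np.random.default_rng(0)
        mean = np.mean(demos, axis=0)            # bootstrap from demonstrations
        best_a, best_r = mean, reward_fn(mean)
        for _ in range(n_rounds):
            a = mean + sigma * rng.standard_normal(mean.shape)  # explore nearby
            r = reward_fn(a)
            if r > best_r:
                best_a, best_r = a, r
                mean = 0.9 * mean + 0.1 * a      # drift toward better actions
            sigma *= 0.995                        # anneal exploration over time
        return best_a, best_r

    # Toy task score: higher when the estimate nears the true requirements.
    target = np.array([2.0, 2.0, 2.0])
    reward = lambda a: -np.linalg.norm(a - target)
    demos = np.array([[1.5, 2.5, 1.0], [2.5, 1.0, 2.0]])  # sub-optimal examples
    est_req, score = bandit_finetune(reward, demos)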