Generalization to New Sequential Decision Making Tasks with In-Context Learning
Training autonomous agents that can learn new tasks from only a handful of
demonstrations is a long-standing problem in machine learning. Recently,
transformers have been shown to learn new language or vision tasks without any
weight updates from only a few examples, also referred to as in-context
learning. However, the sequential decision making setting poses additional
challenges: it has a lower tolerance for errors, since the environment's
stochasticity or the agent's actions can lead to unseen, and sometimes
unrecoverable, states. In this paper, we use an illustrative example to show
that naively applying transformers to sequential decision making problems does
not enable in-context learning of new tasks. We then demonstrate how training
on sequences of trajectories with certain distributional properties leads to
in-context learning of new sequential decision making tasks. We investigate
different design choices and find that larger model and dataset sizes, as well
as more task diversity, environment stochasticity, and trajectory burstiness,
all result in better in-context learning of new out-of-distribution tasks. By
training on large diverse offline datasets, our model is able to learn new
MiniHack and Procgen tasks without any weight updates from just a handful of
demonstrations.
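
To make "trajectory burstiness" concrete, the following is a minimal sketch of how training sequences with a controllable fraction of same-task trajectories might be assembled. It is not the authors' pipeline: the function name `sample_sequence`, the `p_bursty` parameter, and the reading of burstiness as "all trajectories in a sequence share one task" are assumptions made for illustration.

```python
import random


def sample_sequence(trajs_by_task, seq_len, p_bursty, rng):
    """Build one training sequence of (task, trajectory) pairs.

    ASSUMPTION: with probability p_bursty every trajectory in the
    sequence comes from a single task (a "bursty" sequence); otherwise
    each slot draws its task independently at random.
    """
    tasks = sorted(trajs_by_task)
    if rng.random() < p_bursty:
        # Bursty case: repeat one task across the whole sequence.
        chosen = [rng.choice(tasks)] * seq_len
    else:
        # Non-bursty case: tasks are sampled independently per slot.
        chosen = [rng.choice(tasks) for _ in range(seq_len)]
    # Pick one stored trajectory for each chosen task.
    return [(t, rng.choice(trajs_by_task[t])) for t in chosen]


# Hypothetical toy data: task id -> list of trajectory labels.
toy = {"maze": ["m0", "m1"], "climb": ["c0", "c1"], "dodge": ["d0"]}
bursty_seq = sample_sequence(toy, seq_len=4, p_bursty=1.0, rng=random.Random(0))
```

With `p_bursty=1.0` every element of `bursty_seq` shares the same task, which is the setting where earlier trajectories in the context are informative about later ones.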
Multi-Objective GFlowNets
We study the problem of generating diverse candidates in the context of
Multi-Objective Optimization. In many applications of machine learning such as
drug discovery and material design, the goal is to generate candidates which
simultaneously optimize a set of potentially conflicting objectives. Moreover,
these objectives are often imperfect evaluations of some underlying property of
interest, making it important to generate diverse candidates to have multiple
options for expensive downstream evaluations. We propose Multi-Objective
GFlowNets (MOGFNs), a novel method for generating diverse Pareto optimal
solutions, based on GFlowNets. We introduce two variants of MOGFNs: MOGFN-PC,
which models a family of independent sub-problems defined by a scalarization
function, with reward-conditional GFlowNets, and MOGFN-AL, which solves a
sequence of sub-problems defined by an acquisition function in an active
learning loop. Our experiments on a wide variety of synthetic and benchmark tasks
demonstrate advantages of the proposed methods in terms of Pareto
performance and, importantly, improved candidate diversity, which is the main
contribution of this work.
Comment: 23 pages, 8 figures. ICML 2023. Code at:
https://github.com/GFNOrg/multi-objective-gf
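
The idea of turning several conflicting objectives into a family of sub-problems via a scalarization function can be illustrated with a small sketch. The weighted sum used here is only one common choice of scalarization and is an assumption, as are the helper names `weighted_sum`, `dominates`, and `pareto_front`; none of this is taken from the MOGFN code.

```python
def weighted_sum(objectives, weights):
    """Scalarize a vector of objectives with preference weights.

    ASSUMPTION: a weighted sum is used purely for illustration; other
    scalarization functions are possible.
    """
    return sum(w * o for w, o in zip(weights, objectives))


def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere
    (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates scored on two objectives, both maximized.
candidates = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.4, 0.4)]

front = pareto_front(candidates)  # (0.4, 0.4) is dominated by (0.6, 0.6)

# Each weight vector defines one sub-problem with its own best candidate.
best_balanced = max(candidates, key=lambda c: weighted_sum(c, (0.5, 0.5)))
best_first = max(candidates, key=lambda c: weighted_sum(c, (1.0, 0.0)))
```

Different weight vectors select different points on the front, which is why conditioning a generator on the preference vector can cover many trade-offs with a single model.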