Fact-based Agent modeling for Multi-Agent Reinforcement Learning
In multi-agent systems, agents need to interact and collaborate with other
agents in environments. Agent modeling is crucial to facilitate agent
interactions and make adaptive cooperation strategies. However, it is
challenging for agents to model the beliefs, behaviors, and intentions of other
agents in non-stationary environments where all agent policies are learned
simultaneously. In addition, existing methods realize agent modeling
through behavior cloning, which assumes that the local information of other
agents can be accessed during execution or training. However, this assumption
is infeasible in unknown scenarios characterized by unknown agents, such as
competition teams, unreliable communication and federated learning due to
privacy concerns. To eliminate this assumption and achieve agent modeling in
unknown scenarios, a Fact-based Agent modeling (FAM) method is proposed, in which
a fact-based belief inference (FBI) network models other agents in a partially
observable environment based only on its local information. The reward and
observation obtained by agents after taking actions are called facts, and FAM
uses facts as the reconstruction target to learn the policy representation of other
agents through a variational autoencoder. We evaluate FAM on various Multiagent
Particle Environment (MPE) tasks and compare the results with several
state-of-the-art MARL algorithms. Experimental results show that compared with
baseline methods, FAM can effectively improve the efficiency of agent policy
learning by making adaptive cooperation strategies in multi-agent reinforcement
learning tasks, while achieving higher returns in complex
competitive-cooperative mixed scenarios.
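
The objective described in the abstract above, a variational autoencoder whose reconstruction target is the "fact" (the agent's own reward and next observation), can be sketched as a loss computation. This is a minimal illustration under stated assumptions, not the FAM implementation: the function names, the mean squared error reconstruction term, the `beta` weight, and all numbers are hypothetical, and the encoder and decoder networks are replaced by fixed placeholder values.

```python
import math
import random


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one latent vector."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def fact_reconstruction_loss(predicted_fact, observed_fact):
    """Mean squared error between the decoded fact and the observed fact."""
    n = len(observed_fact)
    return sum((p - o) ** 2 for p, o in zip(predicted_fact, observed_fact)) / n


def vae_fact_loss(mu, logvar, predicted_fact, observed_fact, beta=1.0):
    """Reconstruction term plus a beta-weighted KL term (a negative-ELBO-style loss)."""
    reconstruction = fact_reconstruction_loss(predicted_fact, observed_fact)
    return reconstruction + beta * kl_to_standard_normal(mu, logvar)


# Hypothetical numbers: a fact is the agent's own next observation plus its reward.
next_observation = [0.2, -0.1, 0.4]
reward = 1.0
observed_fact = next_observation + [reward]

mu, logvar = [0.3, -0.2], [0.0, -1.0]      # placeholder for an encoder output
z = reparameterize(mu, logvar, random.Random(0))
predicted_fact = [0.25, -0.05, 0.35, 0.9]  # placeholder for a decoder output given z
loss = vae_fact_loss(mu, logvar, predicted_fact, observed_fact)
```

Only quantities local to the modeling agent appear in the loss, which matches the abstract's premise that other agents' observations and actions are unavailable.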
Momentum Contrastive Autoencoder: Using Contrastive Learning for Latent Space Distribution Matching in WAE
Wasserstein autoencoder (WAE) shows that matching two distributions is
equivalent to minimizing a simple autoencoder (AE) loss under the constraint
that the latent space of this AE matches a pre-specified prior distribution.
This latent space distribution matching is a core component of WAE, and a
challenging task. In this paper, we propose to use the contrastive learning
framework, which has been shown to be effective for self-supervised
representation learning, as a means to resolve this problem. We do so by
exploiting the fact that contrastive learning objectives optimize the latent
space distribution to be uniform over the unit hyper-sphere, which can be
easily sampled from. We show that using the contrastive learning framework to
optimize the WAE loss achieves faster convergence and more stable optimization
compared with existing popular algorithms for WAE. This is also reflected in
the FID scores on the CelebA and CIFAR-10 datasets, and in the realistic quality
of generated images on the CelebA-HQ dataset.
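
Two ideas from the abstract above can be illustrated in a few lines: a prior that is uniform over the unit hypersphere is easy to sample (normalize a standard Gaussian draw), and a contrastive objective scores each latent code against its positive and the other codes in the batch. The sketch below assumes a plain in-batch InfoNCE loss; it omits the momentum encoder and the autoencoder reconstruction term, and the function names, temperature, and batch are hypothetical.

```python
import math
import random


def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sample_uniform_sphere(dim, rng):
    """Sample the prior: a normalized standard Gaussian is uniform on the sphere."""
    return normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])


def info_nce(queries, keys, temperature=0.1):
    """In-batch InfoNCE: keys[i] is the positive for queries[i]; other keys are negatives."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(queries)


rng = random.Random(0)
codes = [sample_uniform_sphere(8, rng) for _ in range(16)]
aligned = info_nce(codes, codes)                       # each code paired with itself
mismatched = info_nce(codes, codes[1:] + codes[:1])    # positives shifted by one
```

Pairing each code with itself gives a lower loss than pairing it with a different code, since the positive similarity is then at its maximum of 1 while the set of negatives is unchanged.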
Improved Online Conformal Prediction via Strongly Adaptive Online Learning
We study the problem of uncertainty quantification via prediction sets, in an
online setting where the data distribution may vary arbitrarily over time.
Recent work develops online conformal prediction techniques that leverage
regret minimization algorithms from the online learning literature to learn
prediction sets with approximately valid coverage and small regret. However,
standard regret minimization could be insufficient for handling changing
environments, where performance guarantees may be desired not only over the
full time horizon but also in all (sub-)intervals of time. We develop new
online conformal prediction methods that minimize the strongly adaptive regret,
which measures the worst-case regret over all intervals of a fixed length. We
prove that our methods achieve near-optimal strongly adaptive regret for all
interval lengths simultaneously, and approximately valid coverage. Experiments
show that our methods consistently obtain better coverage and smaller
prediction sets than existing methods on real-world tasks, such as time series
forecasting and image classification under distribution shift.
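
The regret-minimization view of online conformal prediction mentioned in the abstract above can be sketched with a single online gradient descent learner on the pinball (quantile) loss. This is the basic baseline style of update, not the strongly adaptive method the abstract proposes; the function name, learning rate, and synthetic score stream are assumptions made for illustration.

```python
import random


def online_conformal_radii(scores, alpha=0.1, lr=0.1, q0=0.0):
    """Track a prediction-set radius q_t by online gradient descent on the pinball loss.

    The set at step t covers the truth exactly when scores[t] <= q_t.
    Returns the radius used at each step and the miscoverage indicators.
    """
    q = q0
    radii, errors = [], []
    for s in scores:
        err = 1.0 if s > q else 0.0
        radii.append(q)
        errors.append(err)
        # Grow the radius after a miss, shrink it slightly after a cover.
        q += lr * (err - alpha)
    return radii, errors


# Hypothetical score stream whose scale triples halfway through (a distribution shift).
rng = random.Random(0)
scores = [rng.random() * (1.0 if t < 2500 else 3.0) for t in range(5000)]
radii, errors = online_conformal_radii(scores)
coverage = 1.0 - sum(errors) / len(errors)
```

Because each step moves the radius by `lr * (err - alpha)`, the long-run miscoverage rate is pulled toward `alpha` whenever the radius stays bounded. That is a guarantee over the full horizon only, which is the limitation the abstract raises for changing environments.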