ALBidS: A Decision Support System for Strategic Bidding in Electricity Markets
This work demonstrates a system that provides decision support to players in electricity market negotiations. This support is provided by ALBidS (Adaptive Learning strategic Bidding System), a decision support system that includes a large number of distinct market negotiation strategies and learns which one should be used in each context in order to provide the best expected response. The learning of which negotiation strategy to use at each moment is carried out by several integrated reinforcement learning algorithms. ALBidS is integrated with MASCEM (Multi-Agent Simulator of Competitive Electricity Markets), which enables the simulation of realistic market scenarios using real data. This work has been developed under the MAS-SOCIETY project (PTDC/EEI-EEE/28954/2017) and has received funding from UID/EEA/00760/2019, funded by FEDER Funds through COMPETE and by National Funds through FCT.
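The core mechanism described above, learning from observed payoffs which negotiation strategy to use, can be illustrated with a minimal reinforcement learning sketch. The epsilon-greedy rule, payoff model, and all numbers below are illustrative assumptions, not ALBidS's actual algorithms or data.

```python
import random

# Minimal sketch of strategy selection by reinforcement learning.
# The epsilon-greedy rule, payoff model, and all numbers are illustrative
# assumptions; they are not ALBidS's actual algorithms or data.

def select_strategy(values, epsilon, rng):
    """Pick a random strategy with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=values.__getitem__)

def update(values, counts, chosen, payoff):
    """Incremental running-mean update of the chosen strategy's value estimate."""
    counts[chosen] += 1
    values[chosen] += (payoff - values[chosen]) / counts[chosen]

rng = random.Random(0)
true_mean_payoff = [1.0, 2.5, 1.8]   # hypothetical mean payoff of each strategy
values, counts = [0.0] * 3, [0] * 3
for _ in range(2000):
    s = select_strategy(values, epsilon=0.1, rng=rng)
    update(values, counts, s, true_mean_payoff[s] + rng.gauss(0, 0.5))
best = max(range(3), key=values.__getitem__)   # learned best strategy index
```

After a few thousand noisy interactions, the value estimates identify the strategy with the highest mean payoff; ALBidS's contribution lies in doing this per context with several such learners in parallel.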
Decision Support System for Opponents Selection in Electricity Markets Bilateral Negotiations
This paper presents a new multi-agent decision support system whose purpose is to aid bilateral contract negotiators in the pre-negotiation phase through the analysis of their possible opponents. The application area of this system is the electricity market, in which players trade a certain volume of energy at a specified price. Accordingly, the main output of the system is a recommendation of the best opponent(s) to trade with and the target energy volume to trade with each of them. These recommendations are reached by analyzing the possible opponents' past behavior, namely by learning from their past actions. The result is a forecast of the expected prices against each opponent as a function of the volume to trade. The expected prices are then used by a game-theory-based model to reach the final decision on the best opponents to negotiate with and the ideal target volume to be negotiated with each of them. This work has been developed under the MAS-SOCIETY project (PTDC/EEI-EEE/28954/2017) and has received funding from UID/EEA/00760/2019, funded by FEDER Funds through COMPETE and by National Funds through FCT.
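The recommendation step described above, ranking opponents by prices forecast from their past actions, can be sketched as follows. The distance-weighted forecasting rule and all data are illustrative assumptions, not the paper's actual forecasting or game-theory model.

```python
# Sketch of the opponent-recommendation step: forecast each opponent's
# expected price at the target volume from past (volume, price) observations,
# then recommend the opponent with the highest expected selling price.
# The distance-weighted forecast and all data are illustrative assumptions.

def expected_price(history, volume):
    """Forecast an opponent's price at `volume` as a weighted average of past
    observations, weighting nearby volumes more heavily."""
    weights = [1.0 / (1.0 + abs(v - volume)) for v, _ in history]
    prices = [p for _, p in history]
    return sum(w * p for w, p in zip(weights, prices)) / sum(weights)

def best_opponent(histories, volume):
    """Recommend the opponent offering the highest expected price (seller's view)."""
    return max(histories, key=lambda name: expected_price(histories[name], volume))

histories = {
    "OpponentA": [(10, 42.0), (20, 40.0), (30, 38.5)],   # price drops with volume
    "OpponentB": [(10, 41.0), (25, 43.0), (30, 44.0)],   # price rises with volume
}
recommended = best_opponent(histories, volume=30)
```

In the paper, such per-opponent price forecasts then feed a game-theoretic stage that also splits the target volume among opponents; the sketch stops at the ranking.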
AiD-EM: Adaptive Decision Support for Electricity Markets Negotiations
This paper presents the Adaptive Decision Support for Electricity Markets Negotiations (AiD-EM) system. AiD-EM is a multi-agent system that provides decision support to market players by incorporating multiple agent-based sub-systems, each directed at the decision support of a specific problem. These sub-systems use different artificial intelligence methodologies, such as machine learning and evolutionary computing, to enable players' adaptation both in the planning phase and in actual negotiations, in auction-based markets and in bilateral negotiations. AiD-EM is demonstrated through its connection to MASCEM (Multi-Agent Simulator of Competitive Electricity Markets). This work has received funding from the European Union's Horizon 2020 research and innovation programme under project DOMINOES (grant agreement No 771066), from FEDER Funds through the COMPETE program, and from National Funds through FCT under project UID/EEA/00760/2019.
A Regularized Opponent Model with Maximum Entropy Objective
In a single-agent setting, reinforcement learning (RL) tasks can be cast as an inference problem by introducing a binary random variable o, which stands for "optimality". In this paper, we redefine the binary random variable o in the multi-agent setting and formalize multi-agent reinforcement learning (MARL) as probabilistic inference. We derive a variational lower bound on the likelihood of achieving optimality and name it the Regularized Opponent Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel perspective on opponent modeling and show, theoretically and empirically, how it can improve the performance of training agents in cooperative games. To optimize ROMMEO, we first introduce a tabular Q-iteration method, ROMMEO-Q, with a proof of convergence. We then extend the exact algorithm to complex environments by proposing an approximate version, ROMMEO-AC. We evaluate the two algorithms on a challenging iterated matrix game and a differential game, respectively, and show that they can outperform strong MARL baselines. Comment: Accepted to the International Joint Conference on Artificial Intelligence (IJCAI 2019).
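The maximum-entropy objective underlying ROMMEO has a convenient one-step form that is easy to illustrate: with entropy regularization, the optimal policy over a set of Q-values is a Boltzmann (softmax) distribution, and the corresponding soft value is a log-sum-exp. The snippet below is a generic maximum-entropy RL sketch with made-up Q-values, not the paper's ROMMEO-Q algorithm.

```python
import math

# Generic maximum-entropy RL sketch (not ROMMEO-Q itself): the policy that
# maximizes E[Q(a)] + temperature * H(pi) is a softmax over Q-values, and the
# resulting soft value is temperature * logsumexp(Q / temperature).

def soft_policy(q_values, temperature=1.0):
    """Boltzmann policy: the entropy-regularized optimal action distribution."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def soft_value(q_values, temperature=1.0):
    """Entropy-regularized value; always at least max(q_values)."""
    m = max(q_values)
    return m + temperature * math.log(
        sum(math.exp((q - m) / temperature) for q in q_values))

q = [3.0, 1.0, 0.0]          # illustrative Q-values for three actions
pi = soft_policy(q)
v = soft_value(q)
```

ROMMEO-Q iterates a soft Bellman backup of this kind over both the agent's policy and its opponent model; the approximate version, ROMMEO-AC, replaces the tabular backup with function approximators.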
Entropy-Regularized Stochastic Games
In zero-sum stochastic games, where two competing players make decisions under uncertainty, a pair of optimal strategies is traditionally described by Nash equilibrium and computed under the assumption that the players have perfect information about the stochastic transition model of the environment. However, implementing such strategies may make the players vulnerable to unforeseen changes in the environment. In this paper, we introduce entropy-regularized stochastic games where each player aims to maximize the causal entropy of its strategy in addition to its expected payoff. The regularization term balances each player's rationality with its belief about the level of misinformation about the transition model. We consider both entropy-regularized N-stage and entropy-regularized discounted stochastic games, and establish the existence of a value in both games. Moreover, we prove the sufficiency of Markovian and stationary mixed strategies to attain the value, respectively, in N-stage and discounted games. Finally, we present algorithms, which are based on convex optimization problems, to compute the optimal strategies. In a numerical example, we demonstrate the proposed method on a motion planning scenario and illustrate the effect of the regularization term on the expected payoff.
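In the one-shot (matrix game) analogue of these games, entropy regularization turns each player's best response into a quantal response (softmax), and a regularized equilibrium can be found by damped best-response iteration. The payoff matrix, temperature, and iteration scheme below are illustrative assumptions; the paper itself computes optimal strategies via convex optimization.

```python
import math

# One-shot analogue of an entropy-regularized zero-sum game (illustrative
# payoffs and temperature; the paper computes strategies by convex
# optimization, not by this iteration). Each player's entropy-regularized
# best response is a softmax (quantal response); damped iteration of the two
# best responses converges here because the temperature is large relative
# to the payoffs.

def softmax(v, tau):
    m = max(v)
    e = [math.exp((x - m) / tau) for x in v]
    s = sum(e)
    return [x / s for x in e]

def regularized_equilibrium(A, tau=1.0, steps=500, damping=0.5):
    n, m = len(A), len(A[0])
    x = [1.0 / n] * n   # row player: maximizes payoff plus tau * entropy
    y = [1.0 / m] * m   # column player: minimizes payoff, keeps tau * entropy
    for _ in range(steps):
        bx = softmax([sum(A[i][j] * y[j] for j in range(m)) for i in range(n)], tau)
        by = softmax([-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)], tau)
        x = [(1 - damping) * xi + damping * bi for xi, bi in zip(x, bx)]
        y = [(1 - damping) * yi + damping * bi for yi, bi in zip(y, by)]
    return x, y

A = [[1.0, -1.0], [-1.0, 0.5]]   # made-up zero-sum payoffs for the row player
tau = 1.0
x, y = regularized_equilibrium(A, tau=tau)
```

At the fixed point, each strategy is a softmax response to the other, which is the one-shot counterpart of the stationary regularized strategies whose existence the paper establishes for discounted games.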