162 research outputs found
Modeling and analysis of market share dynamics in a duopoly subject to affine feedback advertising policies and delays
Presents extensions of the Vidale-Wolfe and Lanchester models for market share duopoly dynamics. The novelties in the proposed extensions are the explicit introduction of a set of undecided clients into the existing models, which consider only the sets of clients of the two competing firms, and the use of decentralized affine feedback advertising policies. It is shown that, under the proposed class of advertising policies, the extended Vidale-Wolfe and Lanchester models, despite having different dynamics, have equilibria in identical locations with the same stability properties. The introduction of a third set of undecided clients also motivates a more elaborate model of market share dynamics based on the replicator-mutator model from evolutionary game theory. This is done by identifying strategies with the entries of a preference matrix, which consists of the clients' choice preferences among firms, while the mutation matrix represents transition probabilities from one set of clients to another. The proposed model is analysed with respect to its equilibria and their stability properties, as well as its parametric sensitivity, under the proposed advertising policies. All proposed models are analysed for stability, the presence of oscillations, and the existence of Hopf bifurcations when implementation or adoption delays are introduced. The extended Vidale-Wolfe and Lanchester models are robust to implementation delays, whereas adoption delays can produce bifurcations. The proposed replicator-mutator model is robust to both types of delay.
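As a rough illustration of the replicator-mutator dynamics mentioned in the abstract, the sketch below integrates the standard textbook form dx_i/dt = sum_j x_j f_j(x) q_ji - phi(x) x_i, with f = A x and phi = x . f, for three client groups (firm 1, firm 2, undecided). The payoff matrix A, the mutation matrix Q, the step size, and the function names are hypothetical values chosen for illustration; the sketch does not include the affine feedback advertising policies or the delays studied in the work.

```python
import numpy as np

def replicator_mutator_step(x, A, Q, dt):
    """One explicit Euler step of the standard replicator-mutator equation."""
    f = A @ x                     # fitness of each client group
    phi = x @ f                   # population-average fitness
    dx = Q.T @ (x * f) - phi * x  # dx_i = sum_j x_j f_j q_ji - phi x_i
    return x + dt * dx

def simulate(x0, A, Q, dt=0.01, steps=5000):
    """Integrate the shares forward from x0 and return the final state."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = replicator_mutator_step(x, A, Q, dt)
    return x

# Hypothetical parameters: groups are (firm 1, firm 2, undecided).
A = np.array([[1.0, 0.5, 0.8],
              [0.5, 1.0, 0.8],
              [0.3, 0.3, 0.6]])
# Row-stochastic mutation matrix: Q[j, i] is the probability of moving from group j to i.
Q = np.array([[0.90, 0.02, 0.08],
              [0.02, 0.90, 0.08],
              [0.10, 0.10, 0.80]])

final_shares = simulate([0.2, 0.3, 0.5], A, Q)
print(final_shares, final_shares.sum())
```

Because each row of Q sums to one, the right-hand side sums to zero, so the three shares stay on the simplex throughout the integration.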
Shapley Value Based Multi-Agent Reinforcement Learning: Theory, Method and Its Application to Energy Network
Multi-agent reinforcement learning is an area of rapid advancement in
artificial intelligence and machine learning. One of the important questions to
be answered is how to conduct credit assignment in a multi-agent system. There
have been many schemes designed to conduct credit assignment by multi-agent
reinforcement learning algorithms. Although these credit assignment schemes
have been proved useful in improving the performance of multi-agent
reinforcement learning, most of them are designed heuristically without a
rigorous theoretical basis, which makes it difficult to understand how agents
cooperate. In this thesis, we aim at investigating the foundation of credit
assignment in multi-agent reinforcement learning via cooperative game theory.
We first extend a game model called the convex game and a payoff distribution
scheme called the Shapley value in cooperative game theory to Markov decision
processes, named the Markov convex game and the Markov Shapley value respectively. We
represent a global reward game as a Markov convex game under the grand
coalition. As a result, Markov Shapley value can be reasonably used as a credit
assignment scheme in the global reward game. Markov Shapley value possesses the
following virtues: (i) efficiency; (ii) identifiability of dummy agents; (iii)
reflection of contribution; and (iv) symmetry, which together constitute fair credit
assignment. Based on Markov Shapley value, we propose three multi-agent
reinforcement learning algorithms called SHAQ, SQDDPG and SMFPPO. Furthermore,
we extend the Markov convex game to partial observability to handle
partially observable problems, naming the result the partially observable Markov convex
game. In application, we evaluate SQDDPG and SMFPPO on a real-world problem
in energy networks.
Comment: 206 pages
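The properties listed in the abstract (efficiency, dummy agents, symmetry) can be checked on a toy example. The sketch below computes the classical, static Shapley value by averaging marginal contributions over all player orderings; it is not the Markov Shapley value or any of the SHAQ, SQDDPG, or SMFPPO algorithms from the thesis. The three-player game and the function names are hypothetical: the coalition is worth 10 only when it contains both "a" and "b", and "c" is a dummy player.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}

def toy_value(coalition):
    """Hypothetical convex game: worth 10 only if both 'a' and 'b' are present."""
    return 10.0 if {"a", "b"} <= coalition else 0.0

credits = shapley_values(["a", "b", "c"], toy_value)
print(credits)  # {'a': 5.0, 'b': 5.0, 'c': 0.0}
```

The credits sum to the grand-coalition value of 10 (efficiency), "a" and "b" receive equal credit (symmetry), and "c" receives zero (dummy agent). Enumerating all orderings is exponential in the number of players, so this form is only practical for very small games.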