162 research outputs found

    Modeling and analysis of market share dynamics in a duopoly subject to affine feedback advertising policies and delays

    Presents extensions of the Vidale-Wolfe and Lanchester models for market share dynamics in a duopoly. The novelties in the proposed extensions are the explicit introduction of a set of undecided clients into the existing models, which consider only the sets of clients of the two competing firms, and the use of decentralized affine feedback advertising policies. It is shown that, under the proposed class of advertising policies, the extended Vidale-Wolfe and Lanchester models, despite having different dynamics, have equilibria in identical locations with the same stability properties. The introduction of a third set of undecided clients also motivates a more elaborate model of market share dynamics based on the replicator-mutator model from evolutionary game theory. The model is built by identifying strategies with the entries of a preference matrix, which holds the clients' choice preferences among firms, while the mutation matrix represents the transition probabilities from one set of clients to another. The proposed model is analysed with respect to its equilibria and their stability properties, as well as its parametric sensitivity, under the proposed advertising policies. All proposed models are analysed for stability, the presence of oscillations, and the existence of Hopf bifurcations when implementation or adoption delays are introduced. The extended Vidale-Wolfe and Lanchester models are robust to implementation delays, while bifurcations can occur under adoption delays. The proposed replicator-mutator model is robust to both types of delay.
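    For orientation, the classical forms that the abstract builds on can be written as below. This is a sketch of the standard textbook models, not the extended models of the work itself: the symbols x (market share of firm 1), u, u_1, u_2 (advertising efforts), rho, rho_1, rho_2 (advertising effectiveness) and delta (decay rate) are notation assumed here, and the affine feedback law in the last line, with hypothetical coefficients alpha_i and beta_i, is only one plausible reading of "decentralized affine feedback". The undecided-client set and the delays described in the abstract are not represented.

```latex
% Classical Vidale-Wolfe model (single firm, market share x in [0,1]):
\dot{x}(t) = \rho\, u(t)\,\bigl(1 - x(t)\bigr) - \delta\, x(t)

% Classical Lanchester duopoly model (firm 1 holds x, firm 2 holds 1 - x):
\dot{x}(t) = \rho_1\, u_1(t)\,\bigl(1 - x(t)\bigr) - \rho_2\, u_2(t)\, x(t)

% Assumed form of a decentralized affine feedback policy:
% each firm's effort depends affinely on its own share x_i only.
u_i(t) = \alpha_i + \beta_i\, x_i(t), \qquad i \in \{1, 2\}
```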

    Shapley Value Based Multi-Agent Reinforcement Learning: Theory, Method and Its Application to Energy Network

    Multi-agent reinforcement learning is a rapidly advancing area of artificial intelligence and machine learning. One important open question is how to conduct credit assignment in a multi-agent system. Many schemes have been designed to conduct credit assignment in multi-agent reinforcement learning algorithms. Although these schemes have proved useful in improving the performance of multi-agent reinforcement learning, most of them are designed heuristically without a rigorous theoretical basis, which makes it difficult to understand how agents cooperate. In this thesis, we investigate the foundation of credit assignment in multi-agent reinforcement learning via cooperative game theory. We first extend a game model called the convex game and a payoff distribution scheme called the Shapley value, both from cooperative game theory, to Markov decision processes, naming them the Markov convex game and the Markov Shapley value respectively. We represent a global reward game as a Markov convex game under the grand coalition. As a result, the Markov Shapley value can reasonably be used as a credit assignment scheme in the global reward game. The Markov Shapley value possesses the following properties: (i) efficiency; (ii) identifiability of dummy agents; (iii) reflection of each agent's contribution; and (iv) symmetry, which together constitute fair credit assignment. Based on the Markov Shapley value, we propose three multi-agent reinforcement learning algorithms called SHAQ, SQDDPG and SMFPPO. Furthermore, we extend the Markov convex game to partial observability to deal with partially observable problems, naming the result the partially observable Markov convex game. In application, we evaluate SQDDPG and SMFPPO on a real-world problem in energy networks. Comment: 206 pages
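    The properties listed in the abstract (efficiency, dummy agents, symmetry) can be illustrated with the classical Shapley value from cooperative game theory. The sketch below is not the Markov Shapley value of the thesis, nor any of SHAQ, SQDDPG or SMFPPO; it is a brute-force computation on a static game, and the function `shapley_values`, the players `a`, `b`, `c` and the characteristic function `v` are all hypothetical names chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Classical Shapley value: average each player's marginal
    contribution over every ordering in which the coalition can form."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def v(coalition):
    """Hypothetical characteristic function (an assumption for this example):
    the coalition earns 10 only if both a and b are present; c adds nothing."""
    return 10 if {"a", "b"} <= coalition else 0


phi = shapley_values(["a", "b", "c"], v)

# Efficiency: the credits sum to the value of the grand coalition.
assert sum(phi.values()) == v(frozenset({"a", "b", "c"}))
# Symmetry: interchangeable players a and b receive equal credit.
assert phi["a"] == phi["b"]
# Dummy player: c never changes a coalition's value, so its credit is zero.
assert phi["c"] == 0
```

    Enumerating all orderings costs factorial time in the number of players, so this form is only practical for very small games; it is meant to make the axioms concrete rather than to be used at scale.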