92 research outputs found

    Dynamic Duopoly with Differentiated Goods and Sluggish Demand

    This thesis investigates dynamic Bertrand competition between two firms in a market where the goods are differentiated and demand is sluggish. Unlike homogeneous goods, differentiated goods are not perfect substitutes for each other. Sluggish demand means that demand adjusts to price changes with a delay. It is a surprisingly neglected topic in the economic literature. The competitive situation is modelled as a differential game. The dynamic model uses the demand system of Singh and Vives (1984) and the dynamics of Wirl (2010). It is shown that the dynamic model has a unique symmetric open-loop Nash equilibrium. The long-term open-loop steady state is compared with the equilibrium point of the static model. The thesis also presents the fundamental mathematical theory and solution methods of optimal control theory and differential games required to analyse the model. The main result is that when the sluggishness of demand is relatively small (i.e. demand adjusts to price changes sufficiently fast), it increases the market power and profits of the firms in the open-loop steady state compared to the equilibrium point of the static model. Once the sluggishness of demand exceeds a certain point, the firms' profits fall below the static equilibrium profits. Moreover, it is shown that product differentiation relaxes price competition in the presence of sluggish demand, as it does in a static model.

    Competitive Policy Optimization

    A core challenge in policy optimization in competitive Markov decision processes is the design of efficient optimization methods with desirable convergence and stability properties. To tackle this, we propose competitive policy optimization (CoPO), a novel policy gradient approach that exploits the game-theoretic nature of competitive games to derive policy updates. Motivated by the competitive gradient optimization method, we derive a bilinear approximation of the game objective. In contrast, off-the-shelf policy gradient methods utilize only linear approximations, and hence do not capture interactions among the players. We instantiate CoPO in two ways: (i) competitive policy gradient, and (ii) trust-region competitive policy optimization. We theoretically study these methods, and empirically investigate their behavior on a set of comprehensive, yet challenging, competitive games. We observe that they provide stable optimization, convergence to sophisticated strategies, and higher scores when played against baseline policy gradient methods. Comment: 11 pages main paper, 6 pages references, and 31 pages appendix. 14 figures

    Learning with bounded memory.

    The paper studies infinite repetition of finite strategic form games. Players use a learning behavior and face bounds on their cognitive capacities. We show that for any given belief-probability over the set of possible outcomes where players have no experience, games can be payoff classified and there always exists a stationary state in the space of action profiles. In particular, if the belief-probability assumes all possible outcomes without experience to be equally likely, in one class of Prisoners' Dilemmas where the average defecting payoff is higher than the cooperative payoff and the average cooperative payoff is lower than the defecting payoff, play converges in the long run to the static Nash equilibrium, while in the other class of Prisoners' Dilemmas, where the reverse holds, play converges to cooperation. Results are applied to a large class of 2 x 2 games. Keywords: Cognitive complexity; Bounded logistic quantal response learning; Long run outcomes

    A Relative Value Iteration Algorithm for Non-degenerate Controlled Diffusions

    The ergodic control problem for a non-degenerate controlled diffusion controlled through its drift is considered under a uniform stability condition that ensures the well-posedness of the associated Hamilton-Jacobi-Bellman (HJB) equation. A nonlinear parabolic evolution equation is then proposed as a continuous time continuous state space analog of White's 'relative value iteration' algorithm for solving the ergodic dynamic programming equation for the finite state finite action case. Its convergence to the solution of the HJB equation is established using the theory of monotone dynamical systems and also, alternatively, by using the theory of reverse martingales. Comment: 17 pages

    A Mixed Bentham-Rawls Criterion for Intergenerational Equity

    This paper proposes a new welfare criterion which satisfies three desiderata: strong sensitivity to the least advantaged, sensitivity to the present, and sensitivity to the future. We develop necessary conditions for optimal paths under this new criterion, and demonstrate that, in a familiar dynamic model of capital accumulation, the optimal growth path exists. The optimal path converges to a steady state which is dependent on the initial stock of capital. Along this path, the minimum standard of living constraint, which is optimally chosen, is binding over some time interval. Optimal paths under the new criterion display properties that seem to be ethically appealing. Keywords: welfare, distributive justice, sustainable development, intergenerational equity