
    A semantical approach to equilibria and rationality

    Game theoretic equilibria are mathematical expressions of rationality. Rational agents are used to model not only humans and their software representatives, but also organisms, populations, species and genes, interacting with each other and with the environment. Rational behaviors are achieved not only through conscious reasoning, but also through spontaneous stabilization at equilibrium points. Formal theories of rationality are usually guided by informal intuitions, which are acquired by observing some concrete economic, biological, or network processes. Treating such processes as instances of computation, we reconstruct and refine some basic notions of equilibrium and rationality from some basic structures of computation. It is, of course, well known that equilibria arise as fixed points; the point is that the semantics of fixed point computation seems to provide novel methods, algebraic and coalgebraic, for reasoning about them. Comment: 18 pages; Proceedings of CALCO 200
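
    As a generic illustration of the fixed-point view invoked above (not taken from the paper), a Nash equilibrium is exactly a fixed point of the best-response correspondence:

    \[
      s^{*} \in \mathrm{BR}(s^{*}),
      \qquad
      \mathrm{BR}_i(s) \;=\; \operatorname*{arg\,max}_{a_i \in A_i} u_i(a_i, s_{-i}),
    \]
    % where u_i and A_i are the utility and strategy set of player i; when the joint
    % best response is single-valued and contractive, the iteration s^{k+1} = BR(s^k)
    % computes the equilibrium as a Banach fixed point.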

    Equivalence of Equilibrium Propagation and Recurrent Backpropagation

    Recurrent Backpropagation and Equilibrium Propagation are supervised learning algorithms for fixed point recurrent neural networks, which differ in their second phase. In the first phase, both algorithms converge to a fixed point which corresponds to the configuration where the prediction is made. In the second phase, Equilibrium Propagation relaxes to another nearby fixed point corresponding to smaller prediction error, whereas Recurrent Backpropagation uses a side network to compute error derivatives iteratively. In this work we establish a close connection between these two algorithms. We show that, at every moment in the second phase, the temporal derivatives of the neural activities in Equilibrium Propagation are equal to the error derivatives computed iteratively by Recurrent Backpropagation in the side network. This work shows that a side network is not required for the computation of error derivatives, and supports the hypothesis that, in biological neural networks, temporal derivatives of neural activities may code for error signals.
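
    The "side network" mentioned above is the Almeida-Pineda style linear recursion of Recurrent Backpropagation. Below is a minimal numpy sketch of that recursion for a toy fixed-point network; the dynamics f(s) = tanh(W s + x), the quadratic loss, and all names are illustrative assumptions, not the paper's notation.

    import numpy as np

    def rbp_gradients(W, x, target, n_relax=500, n_side=500):
        # Toy fixed-point network s_{t+1} = tanh(W s_t + x) with loss C = 0.5 ||s* - target||^2.
        n = W.shape[0]

        # Phase 1: relax to the fixed point s*, where the prediction is made.
        s = np.zeros(n)
        for _ in range(n_relax):
            s = np.tanh(W @ s + x)

        # Phase 2 ("side network"): iterate the linear recursion whose fixed point
        # lam solves lam = J^T lam + dC/ds, with J = df/ds evaluated at s*.
        J = (1.0 - s**2)[:, None] * W
        dC_ds = s - target
        lam = np.zeros(n)
        for _ in range(n_side):
            lam = J.T @ lam + dC_ds        # converges when the spectral radius of J is < 1

        # Implicit-function-theorem gradients: dC/dtheta = (df/dtheta)^T lam.
        grad_W = np.outer((1.0 - s**2) * lam, s)
        grad_x = (1.0 - s**2) * lam
        return s, grad_W, grad_x

    # Usage on a 4-unit toy problem with small weights so the dynamics contract.
    rng = np.random.default_rng(0)
    W = 0.1 * rng.standard_normal((4, 4))
    s_star, gW, gx = rbp_gradients(W, x=rng.standard_normal(4), target=np.ones(4))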

    Model and Reinforcement Learning for Markov Games with Risk Preferences

    We motivate and propose a new model for non-cooperative Markov games which considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic "risk" from both stochastic state transitions (inherent to the game) and randomized mixed strategies (due to all other players). An appropriate risk-aware equilibrium concept is proposed, and the existence of such equilibria in stationary strategies is demonstrated by an application of Kakutani's fixed point theorem. We further propose a simulation-based Q-learning type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures which can naturally be written as saddle-point stochastic optimization problems, and covers many widely investigated risk measures. Finally, the almost sure convergence of this simulation-based algorithm to an equilibrium is demonstrated under some mild conditions. Our numerical experiments on a two-player queueing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real-life competitive decision-making. Comment: 38 pages, 6 tables, 5 figures
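
    For orientation, the sketch below shows the risk-neutral, zero-sum special case of a Q-learning update for Markov games (Littman-style minimax-Q), in which the next-state value is the value of a matrix game obtained by linear programming. The paper's algorithm instead evaluates the next state through a saddle-point (minimax) risk measure, so this is only a simplified reference point; the tabular interface and all names are assumptions.

    import numpy as np
    from scipy.optimize import linprog

    def matrix_game_value(Q_s):
        # Value of the zero-sum matrix game Q_s (row player maximizes) via linear programming:
        # maximize v subject to  x^T Q_s[:, j] >= v  for every column j,  x a distribution.
        m, n = Q_s.shape
        c = np.zeros(m + 1)
        c[-1] = -1.0                                      # linprog minimizes, so minimize -v
        A_ub = np.hstack([-Q_s.T, np.ones((n, 1))])       # v - x^T Q_s[:, j] <= 0
        A_eq = np.zeros((1, m + 1))
        A_eq[0, :m] = 1.0                                 # sum(x) = 1
        bounds = [(0, None)] * m + [(None, None)]         # x >= 0, v unbounded
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0], bounds=bounds)
        return res.x[-1]

    def minimax_q_step(Q, s, a, b, r, s_next, alpha=0.1, gamma=0.95):
        # One tabular update of Q[s, a, b]; a risk-aware variant would replace the
        # matrix-game value of the next state by a saddle-point risk evaluation.
        target = r + gamma * matrix_game_value(Q[s_next])
        Q[s, a, b] += alpha * (target - Q[s, a, b])
        return Q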

    Time-dependent Correlation Functions in Open Quadratic Fermionic Systems

    We formulate and discuss explicit computation of dynamic correlation functions in open quadratic fermionic systems which are driven and dissipated by Lindblad jump processes that are linear in canonical fermionic operators. Dynamic correlators are interpreted in terms of a local quantum quench where the pre-quench state is the non-equilibrium steady state, i.e. a fixed point of the Liouvillian. As an example we study the XY spin-1/2 chain and the Kitaev Majorana chains with boundary Lindblad driving, whose dynamics exhibits asymmetric (skewed) light cone behaviour. We also numerically treat the two-dimensional XY model and the XY spin chain with additional Dzyaloshinskii-Moriya interactions. The latter exhibits a new non-equilibrium phase transition which can be understood in terms of bifurcations of the quasi-particle dispersion relation. Finally, considering in some detail the periodic Kitaev chain (fermionic ring) with dissipation at a single (arbitrary) site, we present analytical expressions for the first order corrections (in the strength of dissipation) to the spectrum and to the non-equilibrium steady state (NESS) correlation functions. Comment: 25 pages, 10 figures
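
    The dynamical setting above is the Lindblad master equation with a quadratic Hamiltonian and jump operators linear in the fermionic modes; in its standard form (generic, not specific to this paper) the non-equilibrium steady state is the fixed point of the Liouvillian:

    \[
      \frac{d\rho}{dt} \;=\; \hat{\mathcal{L}}\,\rho
      \;=\; -\,i\,[H,\rho]
      \;+\; \sum_{k} \Big( L_k \rho L_k^{\dagger} - \tfrac{1}{2}\{ L_k^{\dagger} L_k,\, \rho \} \Big),
      \qquad
      \hat{\mathcal{L}}\,\rho_{\mathrm{NESS}} \;=\; 0,
    \]
    % with H quadratic in the fermionic operators and each jump operator L_k a linear
    % combination of them, which is what keeps the model exactly solvable.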

    Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation

    We introduce Equilibrium Propagation, a learning framework for energy-based models. It involves only one kind of neural computation, performed in both the first phase (when the prediction is made) and the second phase of training (after the target or prediction error is revealed). Although this algorithm computes the gradient of an objective function just like Backpropagation, it does not need a special computation or circuit for the second phase, where errors are implicitly propagated. Equilibrium Propagation shares similarities with Contrastive Hebbian Learning and Contrastive Divergence while solving the theoretical issues of both algorithms: our algorithm computes the gradient of a well-defined objective function. Because the objective function is defined in terms of local perturbations, the second phase of Equilibrium Propagation corresponds to only nudging the prediction (fixed point, or stationary distribution) towards a configuration that reduces prediction error. In the case of a recurrent multi-layer supervised network, the output units are slightly nudged towards their target in the second phase, and the perturbation introduced at the output layer propagates backward in the hidden layers. We show that the signal 'back-propagated' during this second phase corresponds to the propagation of error derivatives and encodes the gradient of the objective function, when the synaptic update corresponds to a standard form of spike-timing-dependent plasticity. This work makes it more plausible that a mechanism similar to Backpropagation could be implemented by brains, since leaky integrator neural computation performs both inference and error back-propagation in our model. The only local difference between the two phases is whether synaptic changes are allowed or not.
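
    The two-phase procedure lends itself to a compact implementation. Below is a minimal numpy sketch of one training step for a small Hopfield-style energy model, following the standard Equilibrium Propagation recipe (free relaxation, weak clamping with strength beta, contrastive weight update); the network shape, the hard-sigmoid nonlinearity, the step sizes, and all function names are illustrative assumptions rather than the paper's exact setup.

    import numpy as np

    def rho(s):
        return np.clip(s, 0.0, 1.0)                       # hard-sigmoid nonlinearity

    def drho(s):
        return ((s > 0.0) & (s < 1.0)).astype(float)      # its derivative

    def relax(s, W, V, b, x, y=None, beta=0.0, steps=200, dt=0.5):
        # Gradient descent on the total energy F = E + beta*C.  beta = 0 is the free
        # (first) phase; beta > 0 weakly nudges the units toward the target y.
        for _ in range(steps):
            dE = s - drho(s) * (W @ rho(s) + V @ x + b)   # dE/ds for the Hopfield-like energy
            dC = (s - y) if beta != 0.0 else 0.0          # dC/ds (toy: every free unit is an output)
            s = s - dt * (dE + beta * dC)
        return s

    def ep_update(W, V, b, x, y, beta=0.5, lr=0.05):
        # One Equilibrium Propagation step: free phase, nudged phase, contrastive update.
        s0 = relax(np.zeros(len(b)), W, V, b, x)              # phase 1: prediction fixed point
        sb = relax(s0, W, V, b, x, y=y, beta=beta)            # phase 2: nudged fixed point
        dW = (np.outer(rho(sb), rho(sb)) - np.outer(rho(s0), rho(s0))) / beta
        np.fill_diagonal(dW, 0.0)                             # symmetric weights, no self-loops
        W = W + lr * dW
        V = V + lr * np.outer(rho(sb) - rho(s0), x) / beta
        b = b + lr * (rho(sb) - rho(s0)) / beta
        return W, V, b, s0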

    A Finite Time Combinatorial Algorithm for Instantaneous Dynamic Equilibrium Flows

    Instantaneous dynamic equilibrium (IDE) is a standard game-theoretic concept in dynamic traffic assignment in which individual flow particles myopically select en route currently shortest paths towards their destination. We analyze IDE within the Vickrey bottleneck model, where current travel times along a path consist of the physical travel times plus the sum of the waiting times in all queues along the path. Although IDE have been studied for decades, several fundamental questions regarding equilibrium computation and complexity are not well understood. In particular, all existence results and computational methods are based on fixed-point theorems and numerical discretization schemes, and no exact finite time algorithm for equilibrium computation is known to date. As our main result we show that a natural extension algorithm needs only finitely many phases to converge, leading to the first finite time combinatorial algorithm computing an IDE. We complement this result with several hardness results showing that computing IDE with natural properties is NP-hard. Comment: 27 pages, 11 figures
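
    In the Vickrey bottleneck model referenced above, each edge has a physical travel time and a capacity, and a point queue grows whenever inflow exceeds capacity; the instantaneous travel times that IDE particles evaluate at time theta take the following standard form (symbols are the usual ones from the literature, not necessarily the paper's notation):

    \[
      c_e(\theta) \;=\; \tau_e + \frac{q_e(\theta)}{\nu_e},
      \qquad
      c_P(\theta) \;=\; \sum_{e \in P} c_e(\theta),
    \]
    % where tau_e is the physical travel time of edge e, nu_e its capacity, q_e(theta)
    % the current queue length, and P a path; at every point in time an IDE sends flow
    % only onto paths P minimizing c_P(theta) toward the destination.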