A semantical approach to equilibria and rationality
Game theoretic equilibria are mathematical expressions of rationality.
Rational agents are used to model not only humans and their software
representatives, but also organisms, populations, species and genes,
interacting with each other and with the environment. Rational behaviors are
achieved not only through conscious reasoning, but also through spontaneous
stabilization at equilibrium points.
Formal theories of rationality are usually guided by informal intuitions,
which are acquired by observing some concrete economic, biological, or network
processes. Treating such processes as instances of computation, we reconstruct
and refine some basic notions of equilibrium and rationality from some
basic structures of computation.
It is, of course, well known that equilibria arise as fixed points; the point
is that semantics of computation of fixed points seems to be providing novel
methods, algebraic and coalgebraic, for reasoning about them.
Comment: 18 pages; Proceedings of CALCO 200
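The remark that equilibria arise as fixed points can be made concrete with a small sketch. The example below is not taken from the paper: it assumes a two-player bimatrix game (the Prisoner's Dilemma payoffs are illustrative) and lists the pure strategy profiles that are fixed points of the best-response map, i.e. the pure Nash equilibria. The function name `pure_nash_equilibria` is hypothetical.

```python
def pure_nash_equilibria(A, B):
    """Return all pure profiles (i, j) that are mutual best responses.

    A[i][j] is the row player's payoff and B[i][j] the column player's
    payoff when row plays i and column plays j. A profile is a fixed point
    of the best-response map when neither player gains by deviating alone.
    """
    rows, cols = len(A), len(A[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            row_is_best = all(A[i][j] >= A[k][j] for k in range(rows))
            col_is_best = all(B[i][j] >= B[i][l] for l in range(cols))
            if row_is_best and col_is_best:
                equilibria.append((i, j))
    return equilibria


# Illustrative Prisoner's Dilemma payoffs (action 0 = cooperate, 1 = defect).
A = [[-1, -3], [0, -2]]
B = [[-1, 0], [-3, -2]]
print(pure_nash_equilibria(A, B))
```

Exhaustive enumeration only scales to small games and finds pure equilibria only; it is meant to show the fixed-point reading, not the semantic methods the abstract refers to.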
Equivalence of Equilibrium Propagation and Recurrent Backpropagation
Recurrent Backpropagation and Equilibrium Propagation are supervised learning
algorithms for fixed point recurrent neural networks which differ in their
second phase. In the first phase, both algorithms converge to a fixed point
which corresponds to the configuration where the prediction is made. In the
second phase, Equilibrium Propagation relaxes to another nearby fixed point
corresponding to smaller prediction error, whereas Recurrent Backpropagation
uses a side network to compute error derivatives iteratively. In this work we
establish a close connection between these two algorithms. We show that, at
every moment in the second phase, the temporal derivatives of the neural
activities in Equilibrium Propagation are equal to the error derivatives
computed iteratively by Recurrent Backpropagation in the side network. This
work shows that it is not required to have a side network for the computation
of error derivatives, and supports the hypothesis that, in biological neural
networks, temporal derivatives of neural activities may code for error signals.
Model and Reinforcement Learning for Markov Games with Risk Preferences
We motivate and propose a new model for non-cooperative Markov games that
considers the interactions of risk-aware players. This model characterizes the
time-consistent dynamic "risk" from both stochastic state transitions (inherent
to the game) and randomized mixed strategies (due to all other players). An
appropriate risk-aware equilibrium concept is proposed and the existence of
such equilibria is demonstrated in stationary strategies by an application of
Kakutani's fixed point theorem. We further propose a simulation-based
Q-learning type algorithm for risk-aware equilibrium computation. This
algorithm works with a special form of minimax risk measures which can
naturally be written as saddle-point stochastic optimization problems, and
covers many widely investigated risk measures. Finally, the almost sure
convergence of this simulation-based algorithm to an equilibrium is
demonstrated under some mild conditions. Our numerical experiments on a
two-player queuing game validate the properties of our model and algorithm, and
demonstrate their worth and applicability in real life competitive
decision-making.
Comment: 38 pages, 6 tables, 5 figures
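As a rough illustration of a risk measure written as an optimization problem, the sketch below uses conditional value-at-risk (CVaR) in its Rockafellar-Uryasev variational form, CVaR_alpha(X) = min_t { t + E[(X - t)^+] / (1 - alpha) }. Choosing CVaR is an assumption made here for illustration; the abstract does not say which risk measures the authors treat, and this is not their algorithm. The function names are hypothetical.

```python
def cvar_variational(losses, alpha):
    """Empirical CVaR via the Rockafellar-Uryasev minimization over t.

    The objective is convex and piecewise linear in t with kinks at the
    sample points, so it suffices to evaluate it at each sample value.
    """
    n = len(losses)

    def objective(t):
        excess = sum(max(x - t, 0.0) for x in losses)
        return t + excess / (n * (1.0 - alpha))

    return min(objective(t) for t in losses)


def cvar_tail_mean(losses, k):
    """Mean of the k largest losses, for comparison when k = n * (1 - alpha)."""
    return sum(sorted(losses)[-k:]) / k


losses = [float(v) for v in range(1, 11)]   # illustrative sample: 1, 2, ..., 10
print(cvar_variational(losses, alpha=0.8), cvar_tail_mean(losses, k=2))
```

On this sample the two computations agree up to floating-point rounding, which is the sense in which the tail average can be recovered from an optimization over the auxiliary variable t.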
Time-dependent Correlation Functions in Open Quadratic Fermionic Systems
We formulate and discuss explicit computation of dynamic correlation
functions in open quadratic fermionic systems which are driven and dissipated
by the Lindblad jump processes that are linear in canonical fermionic
operators. Dynamic correlators are interpreted in terms of local quantum quench
where the pre-quench state is the non-equilibrium steady state, i.e. a fixed
point of the Liouvillian. As an example we study the XY spin 1/2 chain and the
Kitaev Majorana chains with boundary Lindblad driving, whose dynamics exhibits
asymmetric (skewed) light cone behaviour. We also numerically treat the two
dimensional XY model and the XY spin chain with additional
Dzyaloshinskii-Moriya interactions. The latter exhibits a new non-equilibrium
phase transition which can be understood in terms of bifurcations of the
quasi-particle dispersion relation. Finally, considering in some detail the
periodic Kitaev chain (fermionic ring) with dissipation at a single (arbitrary)
site, we present analytical expressions for the first order corrections (in the
strength of dissipation) to the spectrum and the non-equilibrium steady state
(NESS) correlation functions.
Comment: 25 pages, 10 figures
Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation
We introduce Equilibrium Propagation, a learning framework for energy-based
models. It involves only one kind of neural computation, performed in both the
first phase (when the prediction is made) and the second phase of training
(after the target or prediction error is revealed). Although this algorithm
computes the gradient of an objective function just like Backpropagation, it
does not need a special computation or circuit for the second phase, where
errors are implicitly propagated. Equilibrium Propagation shares similarities
with Contrastive Hebbian Learning and Contrastive Divergence while solving the
theoretical issues of both algorithms: our algorithm computes the gradient of a
well defined objective function. Because the objective function is defined in
terms of local perturbations, the second phase of Equilibrium Propagation
corresponds to only nudging the prediction (fixed point, or stationary
distribution) towards a configuration that reduces prediction error. In the
case of a recurrent multi-layer supervised network, the output units are
slightly nudged towards their target in the second phase, and the perturbation
introduced at the output layer propagates backward in the hidden layers. We
show that the signal 'back-propagated' during this second phase corresponds to
the propagation of error derivatives and encodes the gradient of the objective
function, when the synaptic update corresponds to a standard form of
spike-timing dependent plasticity. This work makes it more plausible that a
mechanism similar to Backpropagation could be implemented by brains, since
leaky integrator neural computation performs both inference and error
back-propagation in our model. The only local difference between the two phases
is whether synaptic changes are allowed or not.
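The two-phase procedure described in this abstract can be sketched on a toy model. Everything specific below is an assumption made for illustration and not the paper's setup: a quadratic energy E(s) = 0.5*|s|^2 - s.(W x), a cost C(s) = 0.5*|s - y|^2, a nudging strength `beta`, and plain gradient-descent relaxation of the state. The free phase relaxes on E, the nudged phase relaxes on E + beta*C, and the contrast of dE/dW between the two fixed points, divided by beta, estimates the gradient of the objective.

```python
def matvec(W, x):
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def relax(W, x, y, beta, steps=2000, dt=0.05):
    """Relax the state s to a fixed point of F = E + beta * C."""
    drive = matvec(W, x)
    s = [0.0] * len(drive)
    for _ in range(steps):
        # dF/ds_i = (s_i - (W x)_i) + beta * (s_i - y_i)
        s = [si - dt * ((si - di) + beta * (si - yi))
             for si, di, yi in zip(s, drive, y)]
    return s


def ep_gradient(W, x, y, beta=1e-3):
    """Estimate dJ/dW from the free and nudged fixed points."""
    s_free = relax(W, x, y, beta=0.0)      # first phase: prediction
    s_nudged = relax(W, x, y, beta=beta)   # second phase: output nudged toward y
    # dE/dW[i][j] = -s[i] * x[j]; take (nudged - free) / beta.
    return [[(-sn * xj + sf * xj) / beta for xj in x]
            for sf, sn in zip(s_free, s_nudged)]


def true_gradient(W, x, y):
    """Gradient of J(W) = 0.5 * |W x - y|^2, for comparison."""
    pred = matvec(W, x)
    return [[(p - yi) * xj for xj in x] for p, yi in zip(pred, y)]


W = [[0.5, -0.2], [0.1, 0.3]]
x, y = [1.0, 2.0], [0.0, 1.0]
print(ep_gradient(W, x, y))
print(true_gradient(W, x, y))
```

In this toy model the estimate equals the true gradient scaled by 1/(1 + beta), so the two agree as beta shrinks. The same relaxation routine serves both phases, which mirrors the abstract's point that only one kind of computation is needed.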
A Finite Time Combinatorial Algorithm for Instantaneous Dynamic Equilibrium Flows
Instantaneous dynamic equilibrium (IDE) is a standard game-theoretic concept
in dynamic traffic assignment in which individual flow particles myopically
select en route currently shortest paths towards their destination. We analyze
IDE within the Vickrey bottleneck model, where current travel times along a
path consist of the physical travel times plus the sum of waiting times in all
the queues along a path. Although IDE have been studied for decades, several
fundamental questions regarding equilibrium computation and complexity are not
well understood. In particular, all existence results and computational methods
are based on fixed-point theorems and numerical discretization schemes and no
exact finite time algorithm for equilibrium computation is known to date. As
our main result we show that a natural extension algorithm needs only finitely
many phases to converge leading to the first finite time combinatorial
algorithm computing an IDE. We complement this result by several hardness
results showing that computing IDE with natural properties is NP-hard.
Comment: 27 pages, 11 figures
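The travel-time rule stated in this abstract (physical travel time plus waiting time in each queue along the path) can be sketched as follows. The edge fields `tau` (physical travel time), `nu` (service rate) and `queue` (current queue length), the waiting time `queue / nu`, and the example network are assumptions chosen for illustration; this shows only the myopic route choice at one instant, not the paper's finite-time algorithm.

```python
def current_travel_time(path, edges):
    """Instantaneous travel time: sum of tau + queue / nu over the path's edges."""
    return sum(edges[e]["tau"] + edges[e]["queue"] / edges[e]["nu"] for e in path)


def myopic_choice(paths, edges):
    """Pick the path that is currently shortest, as a flow particle would."""
    return min(paths, key=lambda p: current_travel_time(p, edges))


# Hypothetical network: direct edge "a" versus the detour "b" then "c".
edges = {
    "a": {"tau": 1.0, "nu": 2.0, "queue": 6.0},
    "b": {"tau": 2.0, "nu": 1.0, "queue": 0.0},
    "c": {"tau": 1.0, "nu": 1.0, "queue": 0.5},
}
paths = [["a"], ["b", "c"]]
print(myopic_choice(paths, edges))
```

Here the detour is physically longer, but the queue on the direct edge makes it the currently shortest route. Since queues change over time, a particle may re-evaluate this choice at every node it reaches.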