Exploratory Control with Tsallis Entropy for Latent Factor Models
We study optimal control in models with latent factors where the agent controls the distribution over actions, rather than the actions themselves, in both discrete and continuous time. To encourage exploration of the state space, we reward exploration with Tsallis entropy and derive the optimal distribution over states, which we prove is q-Gaussian distributed with location characterized through the solution of a BSΔE and a BSDE in discrete and continuous time, respectively. We discuss the relation between the solutions of the optimal exploration problems and the standard dynamic optimal control solution. Finally, we develop the optimal policy in a model-agnostic setting along the lines of soft Q-learning. The approach may be applied in, e.g., developing more robust statistical arbitrage trading strategies.
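As a concrete reference point for the regulariser used above, here is a minimal sketch (not from the paper; function names and the q-exponential parametrisation are the standard textbook forms) of the discrete Tsallis entropy and the q-exponential that underlies q-Gaussian densities:

```python
import numpy as np

def tsallis_entropy(p, q):
    """Discrete Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1).

    Recovers the Shannon entropy in the limit q -> 1.
    """
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        nz = p[p > 0]                      # Shannon limit, skip zero entries
        return float(-np.sum(nz * np.log(nz)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

def q_exponential(u, q):
    """Tsallis q-exponential e_q(u); reduces to exp(u) as q -> 1.

    An unnormalised q-Gaussian density is q_exponential(-x**2, q).
    """
    if np.isclose(q, 1.0):
        return np.exp(u)
    base = np.maximum(1.0 + (1.0 - q) * u, 0.0)  # truncation keeps it real
    return base ** (1.0 / (1.0 - q))

uniform = np.ones(4) / 4
s2 = tsallis_entropy(uniform, q=2.0)       # = 1 - 4 * (1/4)^2 = 0.75
```

For the uniform distribution on four points, S_2 = 1 - 4·(1/16) = 0.75, while the q → 1 case returns ln 4.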
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Graphical Nonlinear System Analysis
We use the recently introduced concept of a Scaled Relative Graph (SRG) to
develop a graphical analysis of input-output properties of feedback systems.
The SRG of a nonlinear operator generalizes the Nyquist diagram of an LTI
system. In the spirit of classical control theory, important robustness
indicators of nonlinear feedback systems are measured as distances between
SRGs.
Mean-field games of speedy information access with observation costs
We investigate a mean-field game (MFG) in which agents can exercise control
actions that affect their speed of access to information. The agents can
dynamically decide to receive observations with less delay by paying higher
observation costs. Agents seek to exploit their active information gathering by
making further decisions to influence their state dynamics to maximize rewards.
In the mean field equilibrium, each generic agent solves individually a
partially observed Markov decision problem in which the way partial
observations are obtained is itself subject to dynamic control actions by
the agent. Based on a finite characterisation of the agents' belief states, we
show how the mean field game with controlled costly information access can be
formulated as an equivalent standard mean field game on a suitably augmented
but finite state space. We prove that, with sufficient entropy regularisation,
a fixed point iteration converges to the unique MFG equilibrium and yields an
ε-Nash equilibrium for a large but finite population size.
We illustrate our MFG by an example from epidemiology, where medical testing
results at different speeds and costs can be chosen by the agents.
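The entropy-regularised fixed-point iteration mentioned in the abstract can be illustrated on a generic finite-action toy model. Everything below (the softmax best response, the congestion-style reward, the damping) is an illustrative assumption, not the paper's construction:

```python
import numpy as np

def entropy_regularized_br(rewards, tau):
    """Softmax (entropy-regularised) best response to a reward vector."""
    z = rewards / tau
    z = z - z.max()                       # numerical stabilisation
    e = np.exp(z)
    return e / e.sum()

def solve_toy_mfg(reward_fn, n_actions, tau=0.5, damping=0.5,
                  tol=1e-10, max_iter=1000):
    """Damped fixed-point iteration: mean field -> best response -> mean field."""
    m = np.ones(n_actions) / n_actions    # initial mean field
    for _ in range(max_iter):
        pi = entropy_regularized_br(reward_fn(m), tau)
        m_new = (1.0 - damping) * m + damping * pi
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = m_new
    return m

# illustrative congestion-style reward: a base payoff minus crowding
reward_fn = lambda m: np.array([1.0, 0.0]) - m
m_star = solve_toy_mfg(reward_fn, n_actions=2)
```

At the fixed point, the mean field coincides with the representative agent's entropy-regularised best response to it, which is the defining consistency condition of an MFG equilibrium.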
Control of McKean--Vlasov SDEs with Contagion Through Killing at a State-Dependent Intensity
We consider a novel McKean--Vlasov control problem with contagion through
killing of particles and common noise. Each particle is killed at an
exponential rate according to an intensity process that increases whenever the
particle is located in a specific region. The removal of a particle pushes
others towards the removal region, which can trigger cascades that see
particles exiting the system in rapid succession. We study the control of such
a system by a central agent who intends to preserve particles at minimal cost.
Our theoretical contribution is twofold. Firstly, we rigorously justify the
McKean--Vlasov control problem as the limit of a corresponding controlled
finite particle system. Our proof is based on a controlled martingale problem
and tightness arguments. Secondly, we connect our framework with models in
which particles are killed once they hit the boundary of the removal region. We
show that these models appear in the limit as the exponential rate tends to
infinity. As a corollary, we obtain new existence results for McKean--Vlasov
SDEs with singular interaction through hitting times which extend those in the
established literature. We conclude the paper with numerical investigations of
our model applied to government control of systemic risk in financial systems.
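An uncontrolled toy simulation of the killing mechanism described above can be sketched as follows. The removal region x ≤ -1, the intensity rate, and the linear contagion push are all illustrative assumptions, not the paper's specification:

```python
import numpy as np

def simulate(n=500, T=1.0, dt=1e-3, barrier=-1.0, kappa=5.0, push=0.5, seed=0):
    """Euler scheme for a toy particle system with contagion through killing.

    Each particle follows a Brownian motion, accrues killing intensity kappa
    while below `barrier`, is removed once its accumulated intensity exceeds
    an independent exponential clock, and every removal pushes the survivors
    down by push/n (the contagion feedback). Returns the surviving fraction.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n)                       # particle positions
    lam = np.zeros(n)                     # accumulated intensity
    clocks = rng.exponential(1.0, n)      # exponential killing thresholds
    alive = np.ones(n, dtype=bool)
    for _ in range(int(T / dt)):
        x[alive] += rng.normal(0.0, np.sqrt(dt), n)[alive]
        lam[alive] += kappa * dt * (x[alive] <= barrier)
        killed = alive & (lam >= clocks)
        if killed.any():
            alive &= ~killed
            x[alive] -= push * killed.sum() / n   # contagion push on survivors
    return alive.mean()

surv = simulate()
```

Letting kappa grow large approximates the hard-killing regime discussed in the abstract, where particles are removed on hitting the removal region.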
Solvability of nonlinear elliptic boundary value problems
This dissertation focuses on the study of steady states of reaction-diffusion problems that are motivated by applications. In particular, we focus on elliptic boundary value problems where the nonlinear reaction may appear in the interior or on the boundary of a domain in Euclidean space. First, we study linear elliptic problems with a nonlinear reaction on the boundary. In this case, we establish the existence of maximal and minimal solutions for both monotone and non-monotone cases, and we then extend these results to systems. Next, we prove existence, nonexistence, multiplicity and global bifurcation results for positive solutions of superlinear problems. To support our analytical results, we numerically approximate solutions using finite difference methods, including an existence and stability analysis. Second, we study problems that are nonlinear inside the domain and linear on the boundary, in the context of a model arising in mathematical ecology. To begin with, we perform computational simulations for the problem in the one-dimensional setting. Then, motivated by the resulting bifurcation diagrams, we prove several analytical results such as existence, uniqueness and nonexistence.
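The finite-difference approach mentioned for the numerical part can be sketched on a one-dimensional model problem. The Bratu-type nonlinearity, grid size, and Newton tolerances below are illustrative assumptions, not the dissertation's actual problems:

```python
import numpy as np

def solve_bvp(f, fprime, lam, n=100, iters=50, tol=1e-12):
    """Newton's method for the finite-difference discretisation of
    -u'' = lam * f(u) on (0, 1) with u(0) = u(1) = 0."""
    h = 1.0 / (n + 1)
    # standard second-order tridiagonal approximation of -u'' on interior nodes
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    u = np.zeros(n)
    for _ in range(iters):
        residual = A @ u - lam * f(u)
        J = A - lam * np.diag(fprime(u))      # Jacobian of the residual
        du = np.linalg.solve(J, residual)
        u -= du
        if np.max(np.abs(du)) < tol:
            break
    return u

# Bratu-type nonlinearity f(u) = e^u; lam = 1 sits on the small-solution branch
u = solve_bvp(np.exp, np.exp, lam=1.0)
```

Starting Newton from different initial guesses (or continuing in lam) is the usual way such solvers trace out the bifurcation diagrams mentioned above.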
Approximating the set of Nash equilibria for convex games
In Feinstein and Rudloff (2023), it was shown that the set of Nash equilibria
for any non-cooperative N-player game coincides with the set of Pareto
optimal points of a certain vector optimization problem with non-convex
ordering cone. To avoid dealing with a non-convex ordering cone, an equivalent
characterization of the set of Nash equilibria as the intersection of the
Pareto optimal points of multi-objective problems (i.e., with the natural
ordering cone) is proven. So far, algorithms to compute the exact set of Pareto
optimal points of a multi-objective problem exist only for the class of linear
problems, which reduces the possibility of finding the true set of Nash
equilibria by those algorithms to linear games only.
In this paper, we will consider the larger class of convex games. As,
typically, only approximate solutions can be computed for convex vector
optimization problems, we first show, in total analogy to the result above,
that the set of ε-approximate Nash equilibria can be characterized by
the intersection of ε-approximate Pareto optimal points for convex
multi-objective problems. Then, we propose an algorithm based on results from
vector optimization and convex projections that allows for the computation of a
set that, on one hand, contains the set of all true Nash equilibria, and is, on
the other hand, contained in the set of ε-approximate Nash equilibria.
In addition to the joint convexity of the cost function for each player, this
algorithm works provided the players are restricted by either shared polyhedral
constraints or independent convex constraints.
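The ε-approximate Nash property that the computed set is sandwiched between can be checked directly from its definition. The sketch below uses a hypothetical two-player quadratic game with closed-form best responses; all names and parameters are illustrative:

```python
import numpy as np

def is_eps_nash(costs, best_responses, x, eps):
    """Return True if no player can lower their own cost by more than eps
    through a unilateral deviation (checked here against an exact best
    response supplied for each player)."""
    for i, (cost, br) in enumerate(zip(costs, best_responses)):
        y = x.copy()
        y[i] = br(x)                      # player i deviates optimally
        if cost(x) - cost(y) > eps:
            return False
    return True

# two-player quadratic game: c_i(x) = (x_i - 0.5 * x_{-i})^2, convex in x_i
costs = [lambda x: (x[0] - 0.5 * x[1]) ** 2,
         lambda x: (x[1] - 0.5 * x[0]) ** 2]
brs = [lambda x: 0.5 * x[1],
       lambda x: 0.5 * x[0]]

# (0, 0) is the unique Nash equilibrium of this game
at_equilibrium = is_eps_nash(costs, brs, np.array([0.0, 0.0]), eps=1e-9)
```

In the convex setting of the paper, the best responses would themselves come from solving per-player convex programs rather than closed-form formulas.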
Best-Response Dynamics in Tullock Contests with Convex Costs
We study the convergence of best-response dynamics in Tullock contests with
convex cost functions (these games always have a unique pure-strategy Nash
equilibrium). We show that best-response dynamics rapidly converges to the
equilibrium for homogeneous agents. For two homogeneous agents, we show
convergence to an ε-approximate equilibrium in steps. For more than two
agents, the dynamics is not unique, because at each step multiple agents can
make non-trivial moves. We consider the model proposed by Ghosh and Goldberg
(2023), where the agent making the move is randomly selected at each time
step. We show convergence to an ε-approximate equilibrium in steps with a
probability governed by a parameter of the agent selection process, e.g.,
when agents are selected uniformly at random at each time step. We complement
this result with a lower bound applicable to any agent selection process.

Comment: 43 pages. WINE '23 version.
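The random-mover best-response dynamics can be sketched on a toy homogeneous instance. The unit prize, quadratic cost, and all parameters below are illustrative assumptions; with cost c(x) = x² and n agents, the unique symmetric equilibrium effort is sqrt((n-1)/(2n²)):

```python
import random

def best_response(s, cost, lo=0.0, hi=2.0, iters=100):
    """Maximise the Tullock payoff x/(x+s) - cost(x) over [lo, hi] by
    ternary search; the objective is concave for convex cost and s > 0."""
    payoff = lambda x: x / (x + s) - cost(x)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if payoff(m1) < payoff(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def br_dynamics(n=4, steps=3000, cost=lambda x: x * x, seed=0):
    """At each step a uniformly random agent best-responds to the others
    (the random agent-selection model of Ghosh and Goldberg, 2023)."""
    rng = random.Random(seed)
    x = [0.1] * n                         # initial efforts
    for _ in range(steps):
        i = rng.randrange(n)              # uniformly random mover
        x[i] = best_response(sum(x) - x[i], cost)
    return x

efforts = br_dynamics()
```

For n = 4 the efforts settle near sqrt(3/32) ≈ 0.306, the symmetric pure-strategy equilibrium of this toy instance.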
Improved guarantees for optimal Nash equilibrium seeking and bilevel variational inequalities
We consider a class of hierarchical variational inequality (VI) problems that
subsumes VI-constrained optimization and several other important problem
classes including the optimal solution selection problem, the optimal Nash
equilibrium (NE) seeking problem, and the generalized NE seeking problem. Our
main contributions are threefold. (i) We consider bilevel VIs with merely
monotone and Lipschitz continuous mappings and devise a single-timescale
iteratively regularized extragradient method (IR-EG). We improve the existing
iteration complexity results for addressing both bilevel VI and VI-constrained
convex optimization problems. (ii) Under the strong monotonicity of the outer
level mapping, we develop a variant of IR-EG, called R-EG, and derive
significantly faster guarantees than those in (i). These results appear to be
new for both bilevel VIs and VI-constrained optimization. (iii) To our
knowledge, complexity guarantees for computing the optimal NE in nonconvex
settings do not exist. Motivated by this lacuna, we consider VI-constrained
nonconvex optimization problems and devise an inexactly-projected gradient
method, called IPR-EG, where the projection onto the unknown set of equilibria
is performed using R-EG with prescribed adaptive termination criterion and
regularization parameters. We obtain new complexity guarantees in terms of a
residual map and an infeasibility metric for computing a stationary point. We
validate the theoretical findings using preliminary numerical experiments for
computing the best and the worst Nash equilibria.
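The extragradient scheme underlying the methods above can be illustrated in its basic (Korpelevich) form; this sketch is the generic method, not the paper's iteratively regularized IR-EG or R-EG variants, and the rotation operator is an illustrative example of a merely monotone mapping:

```python
import numpy as np

def extragradient(F, proj, x0, gamma=0.1, iters=2000):
    """Korpelevich's extragradient method for the monotone VI:
    find x* with <F(x*), x - x*> >= 0 for all feasible x."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = proj(x - gamma * F(x))        # extrapolation (look-ahead) step
        x = proj(x - gamma * F(y))        # update step uses the look-ahead point
    return x

# a merely monotone (rotation) operator: plain forward steps x - gamma*F(x)
# spiral outward, while the extragradient iterates contract to x* = 0
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
F = lambda z: A @ z
proj = lambda z: z                        # unconstrained problem
x_star = extragradient(F, proj, x0=[1.0, 1.0])
```

The look-ahead evaluation of F is exactly what rescues convergence for merely monotone operators, the setting the paper's regularized variants build on.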
Reinforcement learning in large state action spaces
Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long-term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization, and a lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios.
This thesis is motivated towards bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single- and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we propose the first results on several different problems, e.g., tensorization of the Bellman equation, which allows exponential sample-efficiency gains (Chapter 4); provable suboptimality arising from structural constraints in MAS (Chapter 3); combinatorial generalization results in cooperative MAS (Chapter 5); generalization results on observation shifts (Chapter 7); and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g., statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.