From Uncertainty Data to Robust Policies for Temporal Logic Planning
We consider the problem of synthesizing robust disturbance feedback policies
for systems performing complex tasks. We formulate the tasks as linear temporal
logic specifications and encode them into an optimization framework via
mixed-integer constraints. Both the system dynamics and the specifications are
known but affected by uncertainty. The distribution of the uncertainty is
unknown; however, realizations can be obtained. We introduce a data-driven
approach where the constraints are fulfilled for a set of realizations and
provide probabilistic generalization guarantees as a function of the number of
considered realizations. We use separate chance constraints for the
satisfaction of the specification and operational constraints. This allows us
to quantify their violation probabilities independently. We compute disturbance
feedback policies as solutions of mixed-integer linear or quadratic
optimization problems. By using feedback we can exploit information of past
realizations and provide feasibility for a wider range of situations compared
to static input sequences. We demonstrate the proposed method on two robust
motion-planning case studies for autonomous driving.
Adaptive Contract Design for Crowdsourcing Markets: Bandit Algorithms for Repeated Principal-Agent Problems
Crowdsourcing markets have emerged as a popular platform for matching
available workers with tasks to complete. The payment for a particular task is
typically set by the task's requester, and may be adjusted based on the quality
of the completed work, for example, through the use of "bonus" payments. In
this paper, we study the requester's problem of dynamically adjusting
quality-contingent payments for tasks. We consider a multi-round version of the
well-known principal-agent model, whereby in each round a worker makes a
strategic choice of the effort level which is not directly observable by the
requester. In particular, our formulation significantly generalizes the
budget-free online task pricing problems studied in prior work.
We treat this problem as a multi-armed bandit problem, with each "arm"
representing a potential contract. To cope with the large (and in fact,
infinite) number of arms, we propose a new algorithm, AgnosticZooming, which
discretizes the contract space into a finite number of regions, effectively
treating each region as a single arm. This discretization is adaptively
refined, so that more promising regions of the contract space are eventually
discretized more finely. We analyze this algorithm, showing that it achieves
regret sublinear in the time horizon and substantially improves over
non-adaptive discretization (which is the only competing approach in the
literature).
Our results advance the state of the art on several different topics: the theory
of crowdsourcing markets, principal-agent problems, multi-armed bandits, and
dynamic pricing.
Comment: This is the full version of a paper in the ACM Conference on
Economics and Computation (ACM-EC), 201
Convergence of selections with applications in optimization
We consider the problem of finding an easily implemented tie-breaking rule for a convergent set-valued algorithm, i.e., a sequence of compact, non-empty subsets of a metric space converging in the Hausdorff metric. Our tie-breaking rule is determined by nearest-point selections defined by "uniqueness" points in the space, i.e., points having a unique best approximation in the limit set of the convergent algorithm. Convergence of the algorithm is shown to be equivalent to convergence of all such nearest-point selections. Under reasonable additional hypotheses, all points in the metric space have the uniqueness property. Consequently, all points yield convergent nearest-point selections, i.e., tie-breaking rules, for a convergent algorithm. We then show how to apply these results to approximate solutions for the following types of problems: infinite systems of inequalities, semi-infinite mathematical programming, non-convex optimization, and infinite horizon optimization.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/29485/1/0000571.pd
Finite dimensional approximation in infinite dimensional mathematical programming
We consider the problem of approximating an optimal solution to a separable, doubly infinite mathematical program (P) with lower staircase structure by solutions to the programs (P(N)) obtained by truncating after the first N variables and N constraints of (P). Viewing the surplus vector variable associated with the Nth constraint as a state, and assuming that all feasible states are eventually reachable from any feasible state, we show that the efficient set of all solutions optimal to all possible feasible surplus states for (P(N)) converges to the set of optimal solutions to (P). A tie-breaking algorithm which selects a nearest-point efficient solution for (P(N)) is shown (for convex programs) to converge to an optimal solution to (P). A stopping rule is provided for discovering a value of N sufficiently large to guarantee any prespecified level of accuracy. The theory is illustrated by an application to production planning.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/47924/1/10107_2005_Article_BF01586057.pd
Better Optimism By Bayes: Adaptive Planning with Rich Models
The computational costs of inference and planning have confined Bayesian
model-based reinforcement learning to one of two dismal fates: powerful
Bayes-adaptive planning but only for simplistic models, or powerful, Bayesian
non-parametric models but using simple, myopic planning strategies such as
Thompson sampling. We ask whether it is feasible and truly beneficial to
combine rich probabilistic models with a closer approximation to fully Bayesian
planning. First, we use a collection of counterexamples to show formal problems
with the over-optimism inherent in Thompson sampling. Then we leverage
state-of-the-art techniques in efficient Bayes-adaptive planning and
non-parametric Bayesian methods to perform qualitatively better than both
existing conventional algorithms and Thompson sampling on two contextual
bandit-like problems.
Comment: 11 pages, 11 figures
Incentivizing Truth-Telling in MPC-based Load Frequency Control
We present a mechanism for socially efficient implementation of model
predictive control (MPC) algorithms for load frequency control (LFC) in the
presence of self-interested power generators. Specifically, we consider a
situation in which the system operator seeks to implement an MPC-based LFC for
aggregated social cost minimization, but necessary information such as
individual generators' cost functions is privately owned. Without appropriate
monetary compensation mechanisms that incentivize truth-telling,
self-interested market participants may be inclined to misreport their private
parameters in an effort to maximize their own profits, which may result in a
loss of social welfare. The main challenge in our framework arises from the
fact that every participant's strategy at any time affects the future state of
other participants; the consequences of such dynamic coupling have not been
fully addressed in the literature on online mechanism design. We propose a
class of real-time monetary compensation schemes that incentivize market
participants to report their private parameters truthfully at every time step,
which enables the system operator to implement MPC-based LFC in a socially
optimal manner.
A scenario approach for non-convex control design
Randomized optimization is an established tool for control design with
modulated robustness. While for uncertain convex programs there exist
randomized approaches with efficient sampling, this is not the case for
non-convex problems. Approaches based on statistical learning theory are
applicable to non-convex problems, but they usually are conservative in terms
of performance and require high sample complexity to achieve the desired
probabilistic guarantees. In this paper, we derive a novel scenario approach
for a wide class of random non-convex programs, with a sample complexity
similar to that of uncertain convex programs and with probabilistic guarantees
that hold not only for the optimal solution of the scenario program, but for
all feasible solutions inside a set of a-priori chosen complexity. We also
address measure-theoretic issues for uncertain convex and non-convex programs.
Among the family of non-convex control-design problems that can be addressed
via randomization, we apply our scenario approach to randomized Model
Predictive Control for chance-constrained nonlinear control-affine systems.
Comment: Submitted to IEEE Transactions on Automatic Control