Global Optimization for Value Function Approximation

Author: Petrik Marek
Zilberstein Shlomo
Publication date: 14/06/2010

Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms of the Bellman residual. Solving a bilinear program optimally is NP-hard, but this is unavoidable because the Bellman-residual minimization itself is NP-hard. We describe and analyze both optimal and approximate algorithms for solving bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. We also briefly analyze the behavior of bilinear programming algorithms under incomplete samples. Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on simple benchmark problems.

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst
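
The abstract above centers on minimizing norms of the Bellman residual. As a rough illustration of that quantity only (not the bilinear programming formulation of the paper), the sketch below evaluates the residual of a linear value function approximation on a tiny MDP. The MDP, the feature matrix `Phi`, the weights `w`, and the helper name `bellman_residual` are all hypothetical and chosen for illustration.

```python
import numpy as np

# Hypothetical 2-state, 2-action discounted MDP (all numbers are assumptions).
gamma = 0.9
# P[a, s, s2]: probability of moving from state s to s2 under action a.
P = np.array([
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.6, 0.4]],
])
# r[a, s]: expected immediate reward for taking action a in state s.
r = np.array([[1.0, 0.0], [0.5, 2.0]])


def bellman_residual(v):
    """Return L v - v, where L is the Bellman optimality operator."""
    q = r + gamma * (P @ v)  # shape (actions, states)
    return q.max(axis=0) - v


# Linear value function approximation v = Phi @ w with one constant feature.
Phi = np.ones((2, 1))
w = np.array([5.0])
v = Phi @ w

residual = bellman_residual(v)
linf_norm = np.abs(residual).max()  # the sup-norm of the residual
print(residual, linf_norm)
```

For a discounted MDP, the standard contraction argument gives ||v* - v||∞ ≤ ||Lv - v||∞ / (1 - γ), which is one reason a small residual norm is a natural objective. The guarantees stated in the abstract concern policy loss and are not reproduced here.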

On Time Optimization of Centroidal Momentum Dynamics

Author: Del Prete Andrea
Herzog Alexander
Ponton Brahayam
Righetti Ludovic
Schaal Stefan
Publication date: 01/01/2018

Recently, the centroidal momentum dynamics has received substantial attention as a way to plan dynamically consistent motions for robots with arms and legs in multi-contact scenarios. However, it is also non-convex, which renders any optimization approach difficult, and timing is usually kept fixed in most trajectory optimization techniques so as not to introduce additional non-convexities to the problem. This can limit the versatility of the algorithms. In our previous work, we proposed a convex relaxation of the problem that allowed efficient computation of momentum trajectories and contact forces. However, our approach could not minimize a desired angular momentum objective, which seriously limited its applicability. Noticing that the non-convexity introduced by the time variables is of a similar nature to that of the centroidal dynamics, we propose two convex relaxations of the problem based on trust regions and soft constraints. The resulting approaches can compute time-optimized, dynamically consistent trajectories fast enough to make the approach real-time capable. The performance of the algorithm is demonstrated in several multi-contact scenarios for a humanoid robot. In particular, we show that the proposed convex relaxation of the original problem finds solutions that are consistent with the original non-convex problem, and we illustrate how timing optimization makes it possible to find motion plans that would be difficult to plan with fixed timing.

Comment: 7 pages, 4 figures, ICRA 201

arXiv.org e-Print Archive

Crossref

MPG.PuRe

From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming

Author: Esfahani Peyman Mohajerin
Kuhn Daniel
Lygeros John
Sutter Tobias
Publication date: 20/02/2017

We consider linear programming (LP) problems in infinite-dimensional spaces that are in general computationally intractable. Under suitable assumptions, we develop an approximation bridge from the infinite-dimensional LP to tractable finite convex programs in which the performance of the approximation is quantified explicitly. To this end, we adopt recent developments in two areas, randomized optimization and first-order methods, leading to a priori as well as a posteriori performance guarantees. We illustrate the generality and implications of our theoretical results in the special case of the long-run average cost and discounted cost optimal control problems for Markov decision processes on Borel spaces. The applicability of the theoretical results is demonstrated through a constrained linear quadratic optimal control problem and a fisheries management problem.

Comment: 30 pages, 5 figure

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

A D-induced duality and its applications

Author: Brinkhuis J.
Zhang S.

This paper attempts to extend the notion of duality for convex cones by basing it on a prescribed conic ordering and a fixed bilinear mapping. This is an extension of the standard definition of dual cones, in the sense that the nonnegativity of the inner product is replaced by a pre-specified conic ordering, defined by a convex cone D, and the inner product itself is replaced by a general multi-dimensional bilinear mapping. This new type of duality is termed the D-induced duality in the paper. We further introduce the notion of D-induced polar sets within the same framework, which can be viewed as a generalization of the D-induced dual cones and are convenient to use for some practical applications. Properties of the extended duality, including the extended bi-polar theorem, are proven. Furthermore, attention is paid to the computation and approximation of the D-induced dual objects. We discuss, as examples, applications of the newly introduced D-induced duality concepts in robust conic optimization and the duality theory for multi-objective conic optimization.

Keywords: bi-polar theorem; conic optimization; convex cones; duality

Research Papers in Economics

K-Adaptability in Two-Stage Distributionally Robust Binary Programming

Author: Hanasusanto G
Kuhn D
Wiesemann W
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'
Publication date: 01/03/2015

We propose to approximate two-stage distributionally robust programs with binary recourse decisions by their associated K-adaptability problems, which pre-select K candidate second-stage policies here-and-now and implement the best of these policies once the uncertain parameters have been observed. We analyze the approximation quality and the computational complexity of the K-adaptability problem, and we derive explicit mixed-integer linear programming reformulations. We also provide efficient procedures for bounding the probabilities with which each of the K second-stage policies is selected.

Spiral - Imperial College Digital Repository
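
The K-adaptability idea in the abstract above (fix K candidate second-stage decisions in advance, then implement the best one once the uncertainty is revealed) can be sketched in a few lines. This is only a toy illustration under stated assumptions: the linear second-stage cost, the candidate set `candidates`, the sampled `scenarios`, and the helper `best_candidate` are all hypothetical, and the sketch does not choose the candidates optimally or reproduce the paper's mixed-integer reformulations.

```python
import numpy as np

# K = 3 hypothetical candidate binary second-stage decisions, fixed here-and-now.
candidates = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
])


def best_candidate(xi):
    """Return (index, cost) of the cheapest candidate for an observed cost vector xi.

    Assumes a linear second-stage cost xi @ y, purely for illustration.
    """
    costs = candidates @ xi
    k = int(np.argmin(costs))
    return k, float(costs[k])


# Hypothetical realizations of the uncertain cost vector.
scenarios = np.array([
    [3.0, 1.0, 2.0],
    [0.5, 4.0, 2.0],
    [2.0, 2.5, 1.0],
])

# Once each scenario is observed, implement the best of the K candidates.
chosen = [best_candidate(xi)[0] for xi in scenarios]

# Empirical selection frequencies of the K candidates over the scenarios.
freq = np.bincount(chosen, minlength=len(candidates)) / len(scenarios)
print(chosen, freq)
```

The empirical frequencies computed at the end are only a naive stand-in for the selection probabilities mentioned in the abstract. The paper's bounding procedures are not shown here.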