
    Discrete mechanics and optimal control: An analysis

    The optimal control of a mechanical system is of crucial importance in many application areas. Typical examples are the determination of a time-minimal path in vehicle dynamics, a minimal-energy trajectory in space mission design, or optimal motion sequences in robotics and biomechanics. In most cases, some sort of discretization of the original, infinite-dimensional optimization problem has to be performed to make the problem amenable to computation. The approach proposed in this paper is to directly discretize the variational description of the system's motion. The resulting optimization algorithm lets the discrete solution directly inherit characteristic structural properties of the continuous one, such as symmetries and integrals of the motion. We show that the DMOC (Discrete Mechanics and Optimal Control) approach is equivalent to a finite-difference discretization of Hamilton's equations by a symplectic partitioned Runge-Kutta scheme and employ this fact to give a proof of convergence. The numerical performance of DMOC and its relationship to other existing optimal control methods are investigated.
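
    For orientation, the variational discretization referred to above can be written schematically as follows (a sketch in standard variational-integrator notation, not necessarily the paper's exact statement). The action integral is approximated on each time interval by a discrete Lagrangian,

        $L_d(q_k, q_{k+1}) \approx \int_{t_k}^{t_{k+1}} L\big(q(t), \dot{q}(t)\big)\, dt,$

    and stationarity of the discrete action, together with discretized control forces $f_k^{\pm}$, yields the forced discrete Euler-Lagrange equations

        $D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1}) + f_{k-1}^{+} + f_k^{-} = 0,$

    which enter the finite-dimensional optimization problem as equality constraints on the discrete trajectory $\{q_k\}$ and controls $\{u_k\}$.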

    Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces

    Policy optimization methods have shown great promise in solving complex reinforcement and imitation learning tasks. While model-free methods are broadly applicable, they often require many samples to optimize complex policies. Model-based methods greatly improve sample-efficiency but at the cost of poor generalization, requiring a carefully handcrafted model of the system dynamics for each task. Recently, hybrid methods have been successful in trading off applicability for improved sample-complexity. However, these have been limited to continuous action spaces. In this work, we present a new hybrid method based on an approximation of the dynamics as an expectation over the next state under the current policy. This relaxation allows us to derive a novel hybrid policy gradient estimator, combining score function and pathwise derivative estimators, that is applicable to discrete action spaces. We show significant gains in sample complexity, ranging between 1.7× and 25×, when learning parameterized policies on Cart Pole, Acrobot, Mountain Car and Hand Mass. Our method is applicable to both discrete and continuous action spaces, whereas competing pathwise methods are limited to the latter.
    Comment: In AAAI 2018 proceedings.
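
    As a toy illustration of the relaxation described in the abstract (a minimal sketch: the model f, the two-action softmax policy, and the reward are all hypothetical, and only the pathwise half of the hybrid estimator is shown, not the score-function part), the sampled next state is replaced by its expectation under the current policy, so the rollout becomes a deterministic, differentiable function of the policy parameters:

    import jax
    import jax.numpy as jnp

    def f(s, a):
        # Hypothetical known dynamics: each discrete action applies a fixed drift.
        drifts = jnp.array([[-0.1], [0.1]])
        return s + drifts[a]

    def policy(theta, s):
        # Softmax policy over two discrete actions (toy parameterization).
        logits = jnp.stack([theta[0] * s[0], theta[1] * s[0]])
        return jax.nn.softmax(logits)

    def relaxed_return(theta, s0, T=10):
        s, ret = s0, 0.0
        for _ in range(T):
            probs = policy(theta, s)
            # Relaxation: next state = E_{a ~ pi}[f(s, a)], a smooth function of
            # theta, so pathwise gradients can flow through the dynamics.
            s = probs[0] * f(s, 0) + probs[1] * f(s, 1)
            ret = ret - jnp.sum(s ** 2)  # toy reward: keep the state near zero
        return ret

    theta = jnp.zeros(2)
    print(jax.grad(relaxed_return)(theta, jnp.array([1.0])))

    In the paper's estimator this differentiable path is combined with a score-function term; the sketch only demonstrates why the relaxation makes pathwise gradients available for discrete actions at all.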

    Sampling from a system-theoretic viewpoint: Part II - Noncausal solutions

    Get PDF
    This paper puts to use concepts and tools introduced in Part I to address a wide spectrum of noncausal sampling and reconstruction problems. In particular, we follow the system-theoretic paradigm by using systems as signal generators to account for available information and system norms (L2 and L∞) as performance measures. The proposed optimization-based approach recovers many known solutions, derived hitherto by different methods, as special cases under different assumptions about acquisition or reconstructing devices (e.g., polynomial and exponential cardinal splines for fixed samplers, and the Sampling Theorem and its modifications in the case when both sampler and interpolator are design parameters). We also derive new results, such as versions of the Sampling Theorem for downsampling and reconstruction from noisy measurements, the continuous-time invariance of a wide class of optimal sampling-and-reconstruction circuits, etc.
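
    Schematically (a sketch of the setup under assumed notation, not the paper's exact formulation), the problems addressed take the form: given a signal generator $G$ and a sampler $S$, choose the reconstructor $F$ to minimize the induced norm of the error system,

        $\min_{F}\; \|(I - F S)\, G\|,$

    with the $L^2$ or $L^\infty$ system norm as the performance measure. The classical Sampling Theorem then corresponds to the special case in which the optimal reconstruction from samples $x(nh)$ of a band-limited signal reduces to sinc interpolation,

        $\hat{x}(t) = \sum_{n \in \mathbb{Z}} x(nh)\, \mathrm{sinc}\!\left(\frac{t - nh}{h}\right).$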