8 research outputs found

    Experimental design for MRI by greedy policy search

    Policy search via the signed derivative

    Abstract: We consider policy search for reinforcement learning: learning policy parameters, for some fixed policy class, that optimize performance of a system. In this paper, we propose a novel policy gradient method based on an approximation we call the Signed Derivative; the approximation is based on the intuition that it is often very easy to guess the direction in which control inputs affect future state variables, even if we do not have an accurate model of the system. The resulting algorithm is very simple, requires no model of the environment, and we show that it can outperform standard stochastic estimators of the gradient; indeed, we show that the Signed Derivative algorithm can in fact perform as well as the true (model-based) policy gradient, but without knowledge of the model. We evaluate the algorithm’s performance on both a simulated task and two real-world tasks (driving an RC car along a specified trajectory, and jumping onto obstacles with a quadruped robot), and in all cases achieve good performance after very little training.

    Control of underactuated fluid-body systems with real-time particle image velocimetry

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 141-153).
    Controlling the interaction of a robot with a fluid, particularly when the desired behavior is intimately related to the dynamics of the fluid, is a difficult and important problem. High-performance aircraft cannot ignore nonlinear stall effects, and robots hoping to fly and swim with performance matching that seen in birds and fish cannot treat fluid flows as quasi-steady. If we wish to match the level of performance seen in nature several major hurdles must be overcome, with one of the most difficult being the poor observability of the fluid state. Fluid dynamicists have long contended with this observability problem, and have used computationally intensive Particle Image Velocimetry (PIV) to gain an understanding of the fluid behavior after the fact. However, improvement in available computational power is now making it possible to perform PIV in real-time. When PIV provides real-time awareness of the fluid state it is no longer just an analysis tool, but rather a valuable sensor that can be integrated into the control loop. In this thesis I present methods for controlling fluid-body systems in which the fluid plays a vital dynamical role, for performing real-time PIV, and for interpreting the output of PIV in a manner useful to control. The utility of these methods is demonstrated on a mechanically simple but dynamically rich experimental platform: the hydrodynamic cartpole. This system is analogous to the well-known cart-pole system in the controls literature, but through its relationship with the surrounding fluid it captures many of the fundamental challenges of general fluid-body control tasks, including: nonlinearity, underactuation, an important and unknown fluid state and a dearth of accurate and tractable models. The first complete demonstration of closed-loop PIV control is performed on this system, and there is a statistically significant improvement in the system's ability to reject fluid disturbances when using real-time PIV for closed-loop control. These results suggest that these new techniques will push the boundaries of what we can expect a robot in a fluid to do.
    by John W. Roberts. Ph.D.