
    Solution of the linear quadratic regulator problem of black box linear systems using reinforcement learning

    In this paper, a Q-learning algorithm is proposed to solve the linear quadratic regulator problem for black box linear systems. The algorithm only has access to input and output measurements. A Luenberger observer parametrization is constructed using the control input and a new output obtained from a factorization of the utility function. An integral reinforcement learning approach is used to develop the Q-learning approximator structure. A gradient descent update rule is used to estimate the parameters of the Q-function online. Stability and convergence of the Q-learning algorithm under the Luenberger observer parametrization are assessed using Lyapunov stability theory. Simulation studies are carried out to verify the proposed approach.
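    The gradient-descent Q-learning update described in the abstract can be illustrated numerically. The following is a minimal sketch, simplified to full-state feedback on a stable toy system rather than the paper's observer-based output feedback; the system matrices, step sizes, and all names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

n, m = 2, 1                      # state and input dimensions
A = np.array([[0.9, 0.1],        # toy "black box" dynamics, used only to
              [0.0, 0.8]])       # generate data for the learner
B = np.array([[0.0],
              [0.1]])
Qc, Rc = np.eye(n), np.eye(m)    # quadratic stage-cost weights
gamma, alpha = 0.95, 0.02        # discount factor and learning rate

H = np.eye(n + m)                # parameters of the quadratic Q-function

def q_value(H, x, u):
    """Q(x, u) = [x; u]' H [x; u]."""
    z = np.vstack([x, u])
    return float(z.T @ H @ z)

def greedy(H, x):
    """Minimizer of the quadratic Q over u: u = -Huu^{-1} Hux x."""
    Huu = H[n:, n:] + 1e-6 * np.eye(m)   # small regularizer for safety
    Hux = H[n:, :n]
    return -np.linalg.solve(Huu, Hux @ x)

x = np.random.randn(n, 1)
for _ in range(5000):
    u = greedy(H, x) + 0.1 * np.random.randn(m, 1)   # exploratory input
    cost = float(x.T @ Qc @ x + u.T @ Rc @ u)
    x_next = A @ x + B @ u
    u_next = greedy(H, x_next)
    # Semi-gradient TD update: move Q(x, u) toward the one-step target
    delta = cost + gamma * q_value(H, x_next, u_next) - q_value(H, x, u)
    z = np.vstack([x, u])
    H += alpha * delta * (z @ z.T)       # gradient of Q w.r.t. H is z z'
    H = 0.5 * (H + H.T)                  # keep the parametrization symmetric
    x = x_next
    if np.linalg.norm(x) > 10:           # occasional resets keep data bounded
        x = np.random.randn(n, 1)
```

    Once the quadratic parameter matrix H has converged, the greedy policy above recovers a linear state-feedback gain, which is the structure the LQR solution takes.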

    Self-Learning Longitudinal Control for On-Road Vehicles

    Reinforcement Learning is a promising tool for automating controller tuning. However, significant extensions are required to enable fast and robust learning in real-world applications. This work proposes several additions to the state of the art and demonstrates their capability in a series of real-world experiments.

    Self-Learning Longitudinal Control for On-Road Vehicles

    Advanced Driver Assistance Systems are an important selling point for passenger cars, but they entail high development costs. In particular, the parametrization of longitudinal control, a key building block of driver assistance systems, requires considerable time and money to strike the right balance between occupant comfort and control performance. Reinforcement Learning appears to be a promising approach to automate this. So far, however, this class of algorithms has been applied mostly to simulated tasks that take place under ideal conditions and allow nearly unlimited training time. Among the greatest challenges for applying Reinforcement Learning in a real vehicle are trajectory-tracking control and incomplete state information due to only partially observed dynamics. Moreover, an algorithm deployed on a real system must reach a result within minutes. In addition, the control objective can change arbitrarily at runtime, which poses a further difficulty for Reinforcement Learning methods. This work presents two computationally lightweight algorithms that overcome these hurdles. On the one hand, a model-free Reinforcement Learning approach is proposed that is based on the actor-critic architecture and uses a special structure in the state-action value function so that it can be applied to partially observed systems. To learn a feedforward control, a controller relying on a projection and training-data manipulation is proposed. On the other hand, a model-based algorithm based on policy search is proposed, accompanied by an automated design method for an inversion-based feedforward control. The proposed algorithms are compared in a series of scenarios in which they learn online, i.e. while driving and in closed loop, in a real vehicle. Although the algorithms react somewhat differently to different boundary conditions, both learn robustly and quickly and are able to adapt to different operating points, such as velocities and gears, even when disturbances act during training. To the best of the author's knowledge, this is the first successful application of a Reinforcement Learning algorithm that learns online in a real vehicle.
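    To make the actor-critic architecture mentioned in the abstract concrete, the following is a minimal sketch on a scalar velocity-tracking task. It illustrates the general architecture only; the thesis' special Q-function structure for partial observability, the feedforward learning, and the real vehicle dynamics are not reproduced, and all names and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(e):
    """Critic features of the tracking error: [e, e^2, 1]."""
    return np.array([e, e * e, 1.0])

w_actor = np.zeros(2)            # linear policy mean on [error, 1]
w_critic = np.zeros(3)           # linear value function on phi(e)
alpha_a, alpha_c = 1e-3, 1e-2    # actor and critic step sizes
gamma, sigma = 0.95, 0.05        # discount factor, exploration std-dev

v, v_ref = 0.0, 1.0              # current and reference velocity
for _ in range(20000):
    e = v_ref - v
    mu = w_actor @ np.array([e, 1.0])
    noise = sigma * rng.standard_normal()
    u = mu + noise               # Gaussian exploration around the mean
    v_next = 0.95 * v + 0.1 * u  # toy first-order longitudinal dynamics
    e_next = v_ref - v_next
    reward = -e_next ** 2 - 0.01 * u ** 2
    # Critic: one-step TD error on the value estimate
    delta = reward + gamma * (w_critic @ phi(e_next)) - w_critic @ phi(e)
    w_critic += alpha_c * delta * phi(e)
    # Actor: policy-gradient step; grad log pi = (noise / sigma^2) * features
    w_actor += alpha_a * delta * (noise / sigma ** 2) * np.array([e, 1.0])
    v = v_next
```

    The separation into a critic (value estimate) and an actor (policy) is what allows the control law to be updated online from closed-loop data, which is the setting the thesis targets in the real vehicle.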

    Machine Learning-Aided Operations and Communications of Unmanned Aerial Vehicles: A Contemporary Survey

    The ongoing amalgamation of UAV and ML techniques is creating significant synergy and empowering UAVs with unprecedented intelligence and autonomy. This survey aims to provide a timely and comprehensive overview of ML techniques used in UAV operations and communications and to identify potential growth areas and research gaps. We emphasise the four key components of UAV operations and communications to which ML can significantly contribute: perception and feature extraction, feature interpretation and regeneration, trajectory and mission planning, and aerodynamic control and operation. We classify the latest popular ML tools based on their applications to the four components and conduct gap analyses. This survey also takes a step forward by pointing out significant challenges in the upcoming realm of ML-aided automated UAV operations and communications. It is revealed that different ML techniques dominate the applications to the four key modules of UAV operations and communications. While there is an increasing trend of cross-module designs, little effort has been devoted to an end-to-end ML framework, from perception and feature extraction to aerodynamic control and operation. It is also unveiled that the reliability and trust of ML in UAV operations and applications require significant attention before full automation of UAVs and potential cooperation between UAVs and humans come to fruition. (36 pages, 304 references, 19 figures)

    Neural adaptive mechanisms in respiratory regulation : theory and experiments

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2002. Includes bibliographical references (p. 195-215). The respiratory regulatory system is an example of a complex biological control system. The principal goal of the regulator is to preserve the chemical balance of O2, CO2 and pH in the body. Although much is known about the visceral aspects of the respiratory control system, such as lung anatomy, gas exchange, and the mechanics of breathing, considerably less is understood about the neural centers in the brainstem that give rise to the known varied respiratory responses. A more complete understanding of respiratory regulation necessitates better knowledge of these underlying brain mechanisms. While the task of breathing may seem straightforward, the respiratory system faces many challenges that threaten to perturb homeostasis. It has been shown that the respiratory system adapts itself to better meet changing conditions, for example the stresses of high altitude or increased airway resistance. The question remains: what neural processes in the brainstem controller participate to engender such sophisticated autonomic regulation? The primary aim of this thesis was to uncover and characterize the central adaptive mechanisms involved in modulating respiratory output. A series of in-vivo animal studies is presented that were designed to elucidate the organizational and functional principles of neural adaptation intrinsic to the respiratory control centers. In these open-loop experimental studies, afferent feedback from vagal slowly-adapting receptors and/or carotid chemoreceptors was electrically activated. The dynamic respiratory control response was assessed by measuring the efferent activity of the phrenic nerve. Administration of pharmacological agents was used to determine the contribution of NMDA receptors to the observed responses. The roles of certain brainstem nuclei were assessed by electrical lesions. The experimental results revealed dynamic and temporal filtering properties, produced by adaptation and phase-locked gating respectively, in the respiratory controller. These responses demonstrated novel neural differential and integral computations specific to expiratory and inspiratory control circuits. Moreover, mechanical and chemical feedbacks were shown to adaptively modulate each other's neural transfer functions in an associative manner. Modeling and computational studies were used to assess the significance that these processes may have for stability and compensatory responses during certain physiologic states and diseases. It is suggested that these neural processes may participate in the adaptive optimal control of breathing. By Daniel L. Young, Ph.D.