103 research outputs found

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Author: Burdick Joel W.
Cheng Richard
Murray Richard M.
Orosz Gabor
Publication date: 01/02/2019

Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real-world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) on-line learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process.

Comment: Published in AAAI 201

arXiv.org e-Print Archive

Caltech Authors

Association for the Advancement of Artificial Intelligence: AAAI Publications
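
As a rough illustration of the CBF-based safety layer described in the RL-CBF abstract above, the Python sketch below filters an arbitrary proposed action for a toy one-dimensional system x' = u with safe set x <= x_max. The toy system, the barrier h(x) = x_max - x, the gain alpha, and every function name are assumptions made for illustration only. The controller in the paper additionally learns the dynamics with Gaussian Processes, which is omitted here.

```python
def cbf_filter(u_proposed: float, x: float, x_max: float, alpha: float) -> float:
    """Return the action closest to u_proposed that satisfies the CBF condition.

    Toy system (assumed): x' = u, barrier h(x) = x_max - x.
    CBF condition: dh/dt >= -alpha * h  <=>  u <= alpha * (x_max - x).
    For a scalar input, the minimally invasive correction is a simple clip.
    """
    u_bound = alpha * (x_max - x)
    return min(u_proposed, u_bound)


def rollout(x0, x_max, alpha, dt, steps, policy):
    """Euler-simulate x' = u with the safety filter wrapped around `policy`."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        u = cbf_filter(policy(x), x, x_max, alpha)
        x = x + dt * u
        xs.append(x)
    return xs


# A deliberately unsafe stand-in for an RL policy: always push toward the limit.
trajectory = rollout(x0=0.0, x_max=1.0, alpha=1.0, dt=0.1, steps=200,
                     policy=lambda x: 5.0)
```

In this toy setting the filtered trajectory approaches x_max without crossing it, whatever the proposed action. With the Euler discretization used here, that property relies on alpha * dt <= 1.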

Connected Cruise and Traffic Control for Pairs of Connected Automated Vehicles

Author: Guo Sicong
Molnar Tamas G.
Orosz Gabor
Publication date: 12/06/2023

This paper considers mixed traffic consisting of connected automated vehicles equipped with vehicle-to-everything (V2X) connectivity and human-driven vehicles. A control strategy is proposed for communicating pairs of connected automated vehicles, where the two vehicles regulate their longitudinal motion by responding to each other, and, at the same time, stabilize the human-driven traffic between them. Stability analysis is conducted to find stabilizing controllers, and simulations are used to show the efficacy of the proposed approach. The impact of the penetration of connectivity and automation on the string stability of traffic is quantified. It is shown that, even with moderate penetration, connected automated vehicle pairs executing the proposed controllers achieve significant benefits compared to when these vehicles are disconnected and controlled independently.

Comment: Accepted to the IEEE Transactions on Intelligent Transportation Systems. 11 pages, 10 figure

arXiv.org e-Print Archive

Control Regularization for Reduced Variance Reinforcement Learning

Author: Burdick Joel W.
Chaudhuri Swarat
Cheng Richard
Orosz Gabor
Verma Abhinav
Yue Yisong
Publication date: 13/05/2019

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.

Comment: Appearing in ICML 201

arXiv.org e-Print Archive

Caltech Authors
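
The bias-variance trade-off mentioned in the Control Regularization abstract above can be pictured by blending a noisy learned action with the action of a fixed prior controller. The sketch below uses a convex combination with a hypothetical weight `lam`. This mixing rule, the Gaussian noise model, and all names are illustrative assumptions, not a statement of the paper's exact formulation.

```python
import random
import statistics


def regularized_action(u_learned: float, u_prior: float, lam: float) -> float:
    """Blend a learned action with a prior action (assumed convex combination).

    lam = 0 returns the learned action unchanged; large lam approaches the prior.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    return (u_learned + lam * u_prior) / (1.0 + lam)


# Stand-in for run-to-run variability of a learned policy at one state.
rng = random.Random(0)
learned_actions = [rng.gauss(1.0, 0.5) for _ in range(1000)]
u_prior = 0.0  # hypothetical prior controller output at the same state
lam = 1.0

mixed_actions = [regularized_action(u, u_prior, lam) for u in learned_actions]

var_learned = statistics.pvariance(learned_actions)
var_mixed = statistics.pvariance(mixed_actions)
# With a deterministic prior, the variance shrinks by 1 / (1 + lam)**2,
# while the mean is pulled toward the prior (the bias side of the trade-off).
variance_ratio = var_mixed / var_learned
```

Under these assumptions, `lam = 1.0` cuts the action variance to a quarter and halves the distance between the mean action and the prior.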

On the Safety of Connected Cruise Control: Analysis and Synthesis with Control Barrier Functions

Author: Ames Aaron D.
Molnar Tamas G.
Orosz Gabor
Publication date: 31/08/2023

Connected automated vehicles have shown great potential to improve the efficiency of transportation systems in terms of passenger comfort, fuel economy, stability of driving behavior and mitigation of traffic congestion. Yet, to deploy these vehicles and leverage their benefits, the underlying algorithms must ensure their safe operation. In this paper, we address the safety of connected cruise control strategies for longitudinal car following using control barrier function (CBF) theory. In particular, we consider various safety measures such as minimum distance, time headway and time to conflict, and provide a formal analysis of these measures through the lens of CBFs. Additionally, motivated by how stability charts facilitate stable controller design, we derive safety charts for existing connected cruise controllers to identify safe choices of controller parameters. Finally, we combine the analysis of safety measures and the corresponding stability charts to synthesize safety-critical connected cruise controllers using CBFs. We verify our theoretical results by numerical simulations.

Comment: Accepted to the 62nd IEEE Conference on Decision and Control. 6 pages, 5 figure

arXiv.org e-Print Archive

Implication of the period-magnitude relation for massive AGB stars and its astronomical applications

Author: Kurayama Tomoharu
Nakagawa Akiharu
Orosz Gabor
Sudou Hiroshi
Publication date: 18/10/2023

We present astrometric very long baseline interferometry (VLBI) studies of AGB stars. To understand the properties and evolution of AGB stars, distances are an important parameter. The distribution and kinematics of their circumstellar matter are also revealed with the VLBI method. We used the VERA array to observe 22 GHz H$_2$O masers in various subclasses of AGB stars. Parallaxes of the three OH/IR stars NSV17351, OH39.7+1.5, IRC-30363, and the Mira-type variable star AW Tau were newly obtained. We present the circumstellar distribution and kinematics of H$_2$O masers around NSV17351. The absolute magnitudes in mid-infrared bands of OH/IR stars with very long pulsation periods were investigated and a period-magnitude relation in the WISE W3 band, $M_{\mathrm{W3}} = (-7.21\pm1.18)\log P + (9.25\pm3.09)$, was found for the Galactic AGB stars. The VLBI is still a powerful tool for parallax measurements of the Galactic AGB stars surrounded by thick dust shells.

Comment: 24 pages, 8 figures, Proceedings of the IAU symposium 376, At the cross-roads of astrophysics and cosmology Period luminosity relations in the 2020

arXiv.org e-Print Archive
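
The W3-band period-magnitude relation quoted in the abstract above can be evaluated directly. The sketch below uses only the central values of the fitted coefficients and assumes, since the abstract does not say so explicitly, that the logarithm is base 10 and that the pulsation period P is in days. The function names are hypothetical, and the distance estimate uses the standard distance modulus while neglecting extinction and the quoted coefficient uncertainties.

```python
import math

# Central values of the coefficients quoted in the abstract.
SLOPE = -7.21
INTERCEPT = 9.25


def w3_absolute_magnitude(period_days: float) -> float:
    """Absolute W3 magnitude from the pulsation period (assumed: days, log10)."""
    if period_days <= 0:
        raise ValueError("period must be positive")
    return SLOPE * math.log10(period_days) + INTERCEPT


def distance_pc(apparent_mag: float, absolute_mag: float) -> float:
    """Distance in parsecs from the distance modulus m - M = 5 log10(d) - 5."""
    return 10 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)


# Worked example with made-up inputs: a 1000-day period and an apparent
# W3 magnitude of -2.38 (both hypothetical, chosen for round numbers).
m_abs = w3_absolute_magnitude(1000.0)   # -7.21 * 3 + 9.25 = -12.38
d_pc = distance_pc(-2.38, m_abs)        # about 1000 pc
```

Because the slope carries an uncertainty of about 1.2 mag per dex, any distance obtained this way is only indicative.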

Too many swipes for today: The Development of the Problematic Tinder Use Scale (PTUS)

Author: Bőthe Beáta
Melher Dora
Orosz Gabor
Tóth-Király István
Publication venue: 'Akademiai Kiado Zrt.'
Publication date: 01/01/2016

Background and aims: Tinder is a very popular smartphone-based geolocated dating application. The goal of the present study was to create a short Problematic Tinder Use Scale (PTUS). Methods: Griffiths’ (2005) six-component model was used to cover all components of problematic Tinder use. Confirmatory factor analyses were carried out on a sample of Tinder users (N = 430). Results: Both the 12- and the 6-item versions were tested. The 6-item unidimensional structure has appropriate reliability and factor structure. No salient demography-related differences were found. Users have similar PTUS scores irrespective of their relationship status. Discussion: Tinder users deserve scientific attention given their large proportion among smartphone users, especially considering the emerging trend of geolocated online dating applications. Conclusions: Before the PTUS, no scale had been created to measure problematic Tinder use. The PTUS is a suitable and reliable measure for assessing problematic Tinder use.

Crossref

PubMed Central

Repository of the Academy's Library

ELTE Digital Institutional Repository (EDIT)

Bow shocks in water fountain jets

Author: Burns Ross A.
Gómez José F.
Imai Hiroshi
Ngendo Ann Njeri
Orosz Gabor
Tafoya Daniel
Torrelles José M.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2017

We briefly introduce the VLBI maser astrometric analysis of IRAS 18043-2116 and IRAS 18113-2503, two remarkable and unusual water fountains with spectacular bipolar bow shocks in their high-speed collimated jet-driven outflows. The 22 GHz H2O maser structures and velocities clearly show that the jets are formed in very short-lived, episodic outbursts, which may indicate episodic accretion in an underlying binary system.

Comment: To appear in the proceedings of the IAU Symposium 336: Astrophysical Masers: Unlocking the Mysteries of the Universe (4-8 September 2017, Cagliari, Italy) - IAU Proceedings Series, eds. A. Tarchi, M. J. Reid, and P. Castangi

arXiv.org e-Print Archive

Chalmers Research

Coordination for Connected Automated Vehicles at Merging Roadways in Mixed Traffic Environment

Author: Le Viet-Anh
Malikopoulos Andreas A.
Orosz Gabor
Wang Hao M.
Publication date: 11/04/2023

In this paper, we present a two-level optimal control framework to address motion coordination of connected automated vehicles (CAVs) in the presence of human-driven vehicles (HDVs) in merging scenarios. Our framework combines an unconstrained trajectory solution of a low-level energy-optimal control problem with an upper-level optimization problem that yields the minimum travel time for CAVs. We predict the future trajectories of the HDVs using Newell's car-following model. To handle potential deviations of HDVs' actual behavior from the one predicted, we provide a risk-triggered re-planning mechanism for the CAVs based on time-to-conflict. The effectiveness of the proposed control framework is demonstrated via simulations with heterogeneous human driving behaviors and via experiments in a scaled environment.

Comment: first manuscript, 7 page

arXiv.org e-Print Archive